Research on Joint Resource Allocation for Multibeam Satellite Based on Metaheuristic Algorithms

With the rapid growth of satellite communication demand and the continuous development of high-throughput satellite systems, the satellite resource allocation problem—also called the dynamic resources management (DRM) problem—has become increasingly complex in recent years. The use of metaheuristic algorithms to obtain acceptable optimal solutions has become a hot topic in research and has the potential to be explored further. In particular, the treatment of invalid solutions is the key to algorithm performance. At present, the unused bandwidth allocation (UBA) method is commonly used to address the bandwidth constraint in the DRM problem. However, this method reduces the algorithm’s flexibility in the solution space, diminishes the quality of the optimized solution, and increases the computational complexity. In this paper, we propose a bandwidth constraint handling approach based on the non-dominated beam coding (NDBC) method, which can eliminate the bandwidth overlap constraint in the algorithm’s population evolution and achieve complete bandwidth flexibility in order to increase the quality of the optimal solution while decreasing the computational complexity. We develop a generic application architecture for metaheuristic algorithms using the NDBC method and successfully apply it to four typical algorithms. The results indicate that NDBC can enhance the quality of the optimized solution by 9–33% while simultaneously reducing computational complexity by 9–21%.


Introduction
Satellite communication systems have rapidly evolved into high-throughput satellite (HTS) systems over the past two decades, with data rates of tens or even hundreds of gigabits per second [1]. In the future, satellite communications will further develop into satellite-terrestrial networks [2], large-scale LEO constellations [3], or 6G networks [4], for which the efficient management of complex resources will be imperative [5]. Owing to the flexibility and complexity of the communication resources allocated in satellite communication systems, it is necessary to develop an automatic tool with efficient algorithms for management of dynamic resources-a problem known as dynamic resource management (DRM) [6]. In response, artificial intelligence algorithms, especially metaheuristic algorithms, are emerging as a fundamental solution to realize fully intelligent resource allocation and management [4]. Studies reported in the literature have focused on four subproblems of the DRM problem: power allocation, bandwidth allocation, joint resource allocation, and beam shape and placement. In addition, the latest research has focused on beamforming [2] and beam hopping [7], as well as their joint allocation with carriers, bandwidth, and power.
The power allocation problem entails maximizing system performance under the limited power of the satellite platform while minimizing total power consumption. The study of power allocation has practical value, as the power resources of satellite platforms are expensive and limited; increased power usage can improve signal gain and enhance user QoS. Choi and Chan [8] characterized the problem of power allocation as a convex optimization problem and introduced Lagrangian multipliers as a solutions, whereas Wang et al. [9] calculated the capacity using satellite link budget equations, which are more practical than the Shannon capacity formula used in [8]. Aravanis et al. [10] demonstrated that power allocation is an NP hard problem and examined the application of metaheuristic algorithms. Using the GA-SA algorithm and the NSGA-II algorithm, the authors proposed a two-stage, multiobjective optimization method to solve the Pareto front. In [11], an allocation model was created, taking interbeam interference and rain attenuation into account and solved using the PSO algorithm. Luis et al. [12] used a deep reinforcement learning (DRL) technique to achieve optimal power allocation in tens of milliseconds; however, the DRL network requires pretraining and could not achieve optimal results in all scenarios. In [13], the performances of GA, SA, PSO, DRL, and hybrid approaches were compared with respect to the power allocation problem. Takahashi et al. [14] combined power allocation with DBF (digital beamforming) to maximize the traffic accommodation rate of an HTS system by jointly optimizing the transmitted power, beam gain, and beam placement.
The bandwidth allocation problem is a subproblem of the frequency assignment problem, which involves assigning the bandwidth of a satellite system to the beam and the frequency subslot of the beam to each user. In [15], it was demonstrated that frequency assignment is an NP complete problem that is difficult to approximate. As a result, some researchers have employed mathematical programming techniques to solve the bandwidth allocation problem. To determine the optimal Lagrange multiplier, Park et al. [16] used a binary search, whereas Heng Wang et al. [17] utilized subgradient algorithms. The aforementioned studies offered solutions for a small number of beams, increasing system capacity or allocation fairness, but are inapplicable to larger datasets. Several studies have been conducted on high-dimensional scenarios. A DRL framework for dynamic channel assignment problems was presented in [18], and a novel image-like tensor reformulation was designed to extract the traffic demand features. In [3], a fast heuristic assignment algorithm for the frequency assignment problem of LEO constellations was proposed, which can achieve the goal of serving a greater number of users with fewer beams.
The joint resource allocation problem is an optimization problem that simultaneously allocates power and bandwidth, typically continuing to develop based on a single resource allocation and achieving increased allocation fairness or communication service capability than single resource allocation. Cocco et al. [19] designed an objective function that combines fairness and demand fit and proposed a stochastic optimization algorithm based on simulated annealing to solve the joint allocation of bandwidth and power. In [20], a link budget model considering interference between beams was proposed based on [10], as well as a GA-based joint resource allocation algorithm, which improved the USC (unmet system capacity) results of the fixed average algorithm by 55.7%. In [21], the PSO-GA and NSGA-II algorithms were used to examine the multiobjective optimization of power and carrier bandwidth for a scenario in which hundreds of beams serve a highly volatile demand. Abdu et al. [22] decomposed the JRA problem into two non-convex subproblems, i.e., carrier and power allocation, and solved them using a continuous convex approximation. He et al. [23] and Gao et al. [24] studied multiobjective optimization with respect to the JRA problem, with the difference that literature [23] used DRL, whereas literature [24] used a metaheuristic multiobjective algorithm, i.e., CMOPSO (multi-objective PSO algorithm with competition mechanism).
In summary, the power allocation and bandwidth allocation problems have been intensively studied, whereas the number of studies on joint resource allocation remains limited. Although DRL [12,18,25] has gradually spread in various satellite resource allocation scenarios, pretraining, generalization, and scenario dependency limit its versatility. Metaheuristic algorithms (e.g., GA, SA, PSO, etc.) are the main methods currently used owing to their speed, ease of encoding, and optimization performance. Nonetheless, the existing research is subject to some limitations: (1) As bandwidth can be used with reuse patterns in multibeam scenarios, any metaheuristic algorithm must deal with overlapping bandwidth constraints. A method called unused bandwidth allocation (UBA) has been proposed in the literature [20] to handle invalid solutions that do not satisfy the bandwidth constraint by repairing the beams one by one, simultaneously increasing the algorithm's computational complexity and decreasing its exploration capability. (2) Different metaheuristic algorithms have varying convergence times, which are significant in actual operational situations. In addition to the currently studied algorithms, such as GA and PSO, it is necessary to investigate additional metaheuristic algorithms with quicker convergence.
We address these issues by focusing on a typical subproblem of the DRM problem, namely the joint resource allocation problem (JRA) for HTS. Based on the literature [13,20,21,26,27], we construct a link budget model that takes into account interbeam interference and signal modulation patterns, approaching an actual scenario.
Then, a non-dominated beam coding (NDBC) method is proposed to address the shortcomings of the UBA method. We label adjacent rows of beams as dominant and non-dominant beams. The algorithm only encodes and optimizes the bandwidth of the non-dominated beams, and there is no bandwidth overlap constraint between them, such that they can be freely taken in the value domain, which effectively increases the search flexibility and improves the optimization quality of the algorithm. The bandwidths of the dominant beams are then obtained based on the non-dominant beams through a simple calculation. The computational complexity is also reduced, as invalid solutions are no longer repaired.
Finally, the NDBC method is applied to four common metaheuristic algorithms. The simulation results demonstrate that NDBC can effectively reduce the computational complexity while substantially improving the quality of the optimization solutions. We also find that the QPSO algorithm performs better than the GA, DE, and PSO algorithms on the JRA problem. We speculate that QPSO is more sensitive to search flexibility.
The remainder of this paper is structured as follows. In Section 2, we present the link budget model and derive a formula for the beam data rate. In Section 3, we present the unused bandwidth allocation method, propose a non-dominated beam-coding method, and suggest a generic framework for applying the NDBC method to the metaheuristic algorithm. Finally, in Sections 4 and 5, we analyze the simulation results and conclude this paper, respectively.

Joint Resource Allocation Problem Model
In this paper, we consider a GEO satellite in the Ka band with M fixed point beams and a multibeam technique in the four-color frequency reuse pattern [1], as depicted in Figure 1. When bandwidth allocation is performed, beams with different polarizations are independent of each other, whereas there is a bandwidth overlap constraint between adjacent beams using the same polarization.
The HTS system can flexibly allocate the bandwidth and power resources of each beam based on the traffic demand of each beam, thereby optimizing QoS and spectrum utilization. The resource allocation (RA) problem, a subset of the JRA problem, has been demonstrated to be an NP hard problem in the literature [10], implying that the JRA problem under consideration in this paper is also an NP hard problem. The objective function of the JRA problem involves minimizing the unmet system capacity (USC), which has been commonly used in the literature [10,20,21]. D i and R i are the traffic demand of the beam and the data rate provided by the system, respectively; BW i and P i are the bandwidth and power resources, respectively, allocated to beam i. The mathematical description of the JRA problem is as follows: where C 1 and C 2 are the power constraints of the model, typically handled by the proportional reduction method [20]; C 3 is the bandwidth maximum constraint; and C 4 indicates that the bandwidths of adjacent beams with the same polarization mode cannot overlap.

Link Budget Model
The carrier-to-noise power ratio of the satellite downlink can be calculated as follows [1,18]: where P t denotes the transmit power of the satellite antenna (assumed to be numerically equal to the power assigned to the beam); G t and G r denote the gain of the transmit and receive antennas, respectively; OBO is the output backoff factor, taken as a fixed value of 5 dB [20]; T sys represents the receiver system noise temperature, usually at 320 K [1]; L FS denotes the free-space path loss; L O denotes the sum of other losses; and BW is the beam bandwidth. We consider four primary interference factors, including carrier-to-adjacent-beam interference (CABI), carrier-to-adjacent-satellite interference (CASI), carrier-to-cross-polarization interferences (CXPI), and carrier-to-third-order intermodulation product interference (CIMI) [20]. Then, taking interference into account, the carrier-to-noise ratio is [27]: where E b denotes the bit energy; C is the carrier power; and R b represents the data rate (i.e., the bit rate, in bps), satisfying the relation R b = 1/T b , where T b denotes the bit duration and the relationship between the signal bit energy and the carrier power (i.e., E b = CT b ). We assume that the noise bandwidth is equal to the beam bandwidth. Therefore, the density ratio of energy per bit to noise plus interference is calculated as follows: The signal is modulated with a 4/8/16/32/64 element code, with the coding rate taking values from 1/4 to 9/10. Hence, the ratio of energy per symbol to noise plus interference power spectral density is [26]: Typically, such a system employs an ACM (adaptive coding and modulation) strategy to maximize spectral efficiency. According to [12], the beam data rate is: where α r denotes the roll-off factor, R s denotes the symbol rate, and Γ is the spectral efficiency (in bit/symbol) of the modulation and coding scheme (MODCOD), satisfying the following conditions [13]: where µ is the desired margin of the link (specified as 1.0 dB), whereas E s /N 0 | MODCOD denotes the threshold value required for the MODCOD scheme employed by the link to obtain the spectral efficiency (Γ) under ideal channel conditions. Accordingly, to calculate the data rate, we assume a certain MODCOD mode, then use Equations (6)-(10) to calculate E s /(N 0 + I 0 ). Then, we check whether condition (11) is satisfied, thus maximizing the spectral efficiency of the link to the greatest extent possible.

Unused Bandwidth Allocation
The metaheuristic algorithm contains operators with randomness, which cause some individuals not to satisfy the constraints and become invalid solutions. Solutions to deal with invalid solutions often include discarding, repairing, penalizing, and transferring; the UBA described below uses the repair method.
As depicted in Figure 2, the unused bandwidth allocation (UBA) method [20] is utilized primarily to address the bandwidth constraint of invalid solutions. When using UBA, all beams are encoded and optimized, such that all neighboring beam pairs have bandwidth overlap constraints.
We assume a sequence of adjacent beams with N b beams; a corresponding individual in the algorithm is X = [XP, XB], where XP and XB denote the power vector and the bandwidth vector, respectively. The bandwidth constraint can be handled as follows: (1) With a probability of 0.5, the starting beam is selected as 1 or N b . The beam pair is expressed as follows: (2) For each pair of beams (beam a , beam b ), if xb a + xb b > 1 is satisfied, the operation xb b = 1 − xb a is performed in order to ensure that constraint C 4 is satisfied; however, some bandwidth may be unused. Using a column of four adjacent beams as an example, Figure 3 illustrates the underlying principle of this step. As bandwidth overlap is eliminated, unused bandwidth is gradually generated and expanded. Consequently, the bandwidth of some beams is altered; this phenomenon has a detrimental effect on the search direction of the algorithm. It is possible that when the beam's bandwidth is increased to improve the objective function, a smaller value is obtained after this step (and vice versa), causing the algorithm to search in the opposite direction. (3) The beams are ordered from largest to smallest based on the demand. The serial number sequence is denoted as SortD.
where xb L and xb R denote the bandwidths of the adjacent beams when beam i has two adjacent beams, whereas xb A is the bandwidth of the adjacent beam when beam i has only one adjacent beam. As shown in Figure 2, for population evaluation, the bandwidth vector must be transformed into a bandwidth allocation vector: The UBA method has two issues: First, in step 2, it needs to repair the positions of a significant number of particles in the algorithm population while only ensuring that this portion of particles satisfies the constraint rather than having performed a valid search. These particles do useless work with greater probability, in addition to increasing the computational complexity. Second, the low-bandwidth flexibility of each beam affects the algorithm's ability to search the solution space, reducing its global optimization capability.

Non-Dominated Beam Coding
We assume that beams with different polarizations are independent; therefore, we focus on the bandwidth overlap between adjacent beams with the same polarization. We assume that beam B dominates beam A when beam A is adjacent to beam B and that the bandwidth of beam A is influenced by beam B. If we treat the bandwidths of the beams sequentially, the dominance phenomenon is almost ubiquitous; however, if we assign values to the bandwidth beams at intervals, there is no influence between them.
As depicted in Figure 4, the adjacent non-dominant beams dominate the bandwidth of the dominant beam, and there is no bandwidth overlap constraint between non-dominant beams. The bandwidth vector (XB ND ) can be encoded as follows: As demonstrated in Figure 5, the NDBC method encodes only the non-dominated beams and generates directly corresponding bandwidth values, whereas the bandwidth of the dominated beams is determined by their respective adjacent dominated beams.
Once the population has evolved during the population evolution session, it must be converted to a bandwidth allocation vector in order to ensure that the bandwidth constraint is met. The conversion method is as follows:

Complexity Analysis and General Application Architecture
For the NDBC method, the computational complexity of the method is primarily attributable to the computation of the dominant beam's bandwidth. We assume M nd as the total number of non-dominated beams. The computational complexity is as follows: Therefore, the computational complexity of NDBC is approximately one-fourth that of UBA. As shown in Table 1, we investigated the degree of beam dominance for UBA and NDBC in order to determine the effect of the bandwidth constraint handling method on the search flexibility of the algorithm. As the order is randomly reversed in step 1 for UBA, the first and last beams are actually dominated by their neighboring beams. The non-dominated beams in NDBC are not dominated by one another, whereas the first and last beams are dominated by only one of their neighboring beams. We quantify the number of neighboring beams that can exert influence on a beam as its degree of dominance; the greater the degree of dominance, the less flexible the beam's bandwidth.
As shown in Table 1, assuming a column of adjacent beams with M beams, the UBA method is dominated by two adjacent beams for almost all beams. On the other hand, NDBC only needs to encode and optimize the bandwidth of the non-dominated beams, thereby practically achieving complete search flexibility. This modification effectively enhances the algorithm's global optimization performance. The general architecture of the NDBC method applied to the metaheuristic algorithm is depicted in Table 2.

Results
The simulation scenario was configured as a GEO multibeam satellite with 37 beams. Figure 6 depicts the beam distribution. The majority of the simulation scenario parameters were obtained from [1,20]. Table 3 details some of the most important parameters.  Standard metaheuristic algorithms, such as GA [20], DE [28], PSO [29], and QPSO [30], were utilized in our simulations. For a fair comparison, each algorithm's population size and termination conditions were set to the maximum number of iterations. Table 4 provides the algorithm parameters. A DVB-S2 [31] and DVB-S2X [32] hybrid MODCOD table was used, which was screened such that as the threshold value increased, the spectral efficiency increases uniformly.
As depicted in Figure 7, the execution time of the constraint handling session varied across iterations due to the random nature of the algorithm's evolution. NDBC was more stable and required less time to execute each algorithm than UBA. The execution time of constraint handling is proportional to the number of iterations. Quantitatively, NDBC required approximately one-fourth to one-fifth of the time required by UBA, achieving an effective reduction, indicating that the computational complexity of NDBC is lower than that of UBA. This result is consistent with the analysis presented in Section 3.3 regarding computational complexity.  The optimal results of each algorithm with UBA and NDBC are depicted in Figure 8. According to the model presented in this paper, the USC was 1.967 Gbps when bandwidth and power are distributed uniformly (denoted as average bandwidth and average power allocation, ABAP). Figure 8a illustrates the execution time of each algorithm for varying population sizes. Except for the GA, the relationship between the algorithm's execution time and population size was approximately linear. The DE, PSO, and QPSO algorithms executed faster than the GA algorithm. In comparison to the UBA method, the NDBC method reduced the total execution time of each algorithm by 9-21%. Figure 8b depicts the mean values of each algorithm's optimized solutions for varying population sizes. The quality of the optimal solution improved proportionally to the size of the population. As depicted in the graph, the NDBC method significantly improved the optimized solution of each algorithm by 9-33%. In terms of algorithm comparison, the strength of algorithm optimization capability is ranked as QPSO, PSO, DE, and GA. However, when using UBA, this ordering is PSO, DE, GA, and QPSO. We speculate that QPSO is more sensitive to the flexibility of the search compared to other algorithms. Increasing the population size can result in a more optimal solution, but it also increases the execution time. In the following simulations, the population size was set to 400 in order to control the execution time of the algorithm without causing it to take too long. The maximum and minimum number of iterations were 500 and 200, respectively. After 50 iterations, the algorithm terminated if the improvement in the optimized solution was less than 5 × 10 −5 . Figure 9 depicts the convergence curves with the best results from 50 executions of each algorithm under the threshold termination condition, reflecting the convergence speed and optimization capability of each algorithm. As depicted in Figure 9, GA and QPSO had faster convergence rates, reaching convergence after approximately 100 iterations. The NDBC method enhanced the quality of each algorithm's optimized solution and accelerated their convergence, and DE and QPSO showed greater potential. In combination with the algorithm execution time, QPSO can achieve the best optimized solution in the shortest time, indicating that QPSO performs better when the search flexibility is high. Figure 9. Convergence for the best execution of the algorithms using different bandwidth constraint handling methods. The dashed and solid lines in the figure represent the UBA and NDBC methods, respectively, whereas different colors represent different algorithms. Table 5, the improvement in GA (UBA) in the mean USC metric (0.958) was 51% compared to UBUP (1.967), as opposed to 55.7% (1.9→0.84) reported in the literature [20]. We modified the link budget model based on [20] and used a different MODCOD table such that the results slightly differed. The best average USC result of the QPSO algorithm utilizing NDBC was 0.641, representing a 67.4% improvement relative to UBUP (1.967) and a 21% improvement relative to the result reported in [20]. The NDBC method improved all USC metrics in comparison to the UBA method, with improvements of 9-36% for the USC mean and 14-38% for the execution time. For an efficient comparative analysis, we selected the ABAP algorithm, the GA(UBA) scheme from [20], the comparison algorithm GA(NDBC), and the QPSO(NDBC) algorithm, which applied NDBC most effectively. Figure 10 depicts the capacity allocation and USC outcomes for each beam based on the four previously mentioned algorithms. Using the GA algorithm, NDBC was able to achieve better allocation results for beams 6, 11, 18, 24, and 36 in comparison to UBA. QPSO was able to improve the USC results in beams 2 and 18 using NDBC in comparison to GA. Overall, NDBC reduced resource allocation for high-demand beams, improved USC, and increased allocation fairness. An important aspect of satellite resource allocation, i.e., traffic demand, depends on the specific geographical location and the associated time of day; therefore, when considering a large footprint composed of multiple beams (as in this paper), with beams experiencing varying capacity requests that also vary in time (slowly, i.e., on the order of hours). In this line, resource allocation in continuous time scenarios has been researched in various studies [12,18,19]. Simulation experiments were subsequently conducted in order to determine whether the application of the NDBC method in the continuous time case deviated from the results in the static scenario.

As shown in
Assuming that the geographical characteristics that influence beam demand remain constant, the demand is primarily influenced by temporal characteristics. The demand ratio of each point beam was considered to be identical to the stationary scenario described previously, with a random deviation of 20% or less added to simulate the actual situation. Figure 11 illustrates the total demand, with a sampling interval of 10 min and a scheduling time limit of 3 s. The maximum and minimum number of iterations was 500 and 200, respectively. After 50 iterations, the algorithm terminated if the improvement in the optimized solution was less than 5 × 10 −5 . Figure 11. Evolution of total demand over time.
As shown in Figure 12a, the USC values of NDBC for all algorithms were superior to those of UBA in the majority of instances, whereas the application of the QPSO and PSO algorithms provided superior results compared to GA and DE. In conjunction with Table 5, it is evident that the stability of PSO and QPSO utilizing NDBC is superior to that of GA and DE (the standard deviations of the average USC were 46, 46, 65, and 64, respectively). We selected the average allocation algorithm (abbreviated as ABAP), the scheme GA(UBA) from [20], the comparison algorithm GA(NDBC), and the QPSO algorithm applying NDBC, with the best results, in order to investigate the service capability of the four aforementioned algorithms in the continuous demand scenario. Figure 12b depicts the following phenomena: (1) the metaheuristic algorithm improved the serviceability of the fixed-average algorithm, (2) NDBC can improve the service capability of UBA when the algorithm is GA, and (3) when using NDBC, QPSO can achieve a better solution than GA. Owing to the limited resources, the system service capacity reaches the upper session limit of its service capacity during the peak demand period of the day. In general, the NDBC method proposed in this paper can adapt to continuoustime scenarios. The red vertical line indicates where the USC of NDBC is better than that with UBA, and vice versa for grey; and (b) the actual offered capacity calculated by subtracting the USC value from the total required capacity at each instant (this value is usually not equal to the total allocated capacity).

Conclusions
In this paper, we proposed a non-dominated beam-coding (NDBC) method to solve the adjacent bandwidth overlap constraint in the HTS joint resource allocation problem, achieving full bandwidth flexibility in the evolution of the algorithm while decreasing computational complexity. By applying the NDBC method to a metaheuristic algorithm, it is possible to reduce the algorithm's execution time while simultaneously enhancing the quality of the resource allocation results. Finally, a general architecture for applying NDBC to metaheuristic algorithms was developed. Simulations demonstrated that NDBC can reduce the computational complexity of the bandwidth-constrained handling session by 75-80% and the overall algorithm execution time by 9-21% while improving the USC results by 9-33%. We also found that when NDBC was applied, the PSO and QPSO algorithms performed better than the GA and DE algorithms. Importantly, the model does not consider all aspects of a multibeam satellite communication system, as doing so would make the problem extremely complex and difficult to describe. Our conclusions remain unaffected by factors not accounted for in the model, such as the antenna pattern, satellite payload characteristics, overall frequency plan (feeder and user links), rain attenuation, and precoding. The results reported herein are instructive for the application of metaheuristic algorithms to multibeam satellite DRM problems.