1. Introduction
In fields such as engineering simulation, intelligent manufacturing, energy systems, and control engineering, the solution of large-scale complex models increasingly relies on high-performance computing platforms. Through virtualization and resource pooling, cloud computing integrates large numbers of heterogeneous physical servers into elastic computing power that can be rented on demand. This provides a flexible foundation for such engineering applications, but it also brings new problems such as rising energy consumption, uneven resource utilization, and high operation and maintenance costs [1]. Computing power scheduling is therefore no longer only about “availability”; it also requires controllable optimization of cost and energy efficiency under service level agreement (SLA) constraints [2]. From the viewpoint of symmetry/asymmetry, cloud scheduling typically contains permutation symmetry when a subset of hosts (or VM types) is interchangeable: renumbering those resources yields alternative schedules with essentially identical physical meaning and cost. Real deployments, however, are dominated by heterogeneity in performance/energy profiles and by time-varying electricity/carbon prices, which break such symmetry and require asymmetric, context-adaptive decisions. Therefore, how to stably obtain high-quality scheduling schemes under multiple constraints, heterogeneous resources, and price fluctuations has become a key capability of cloud platform operation.
The core of cloud server computing resource scheduling is to optimize metrics such as completion time, energy consumption, and economic cost by mapping and adjusting tasks, virtual machines, and physical hosts under capacity, timing, and SLA constraints. When part of the infrastructure is homogeneous, the scheduling formulation is (approximately) invariant to permutations of those identical hosts, which induces a large set of symmetric, cost-equivalent solutions and may create plateaus in the search landscape. The problem also combines discrete decisions such as placement, migration, and power-on/off with continuous variables such as frequency adjustment, and it exhibits strong coupling constraints and dynamic task arrivals; such problems are usually NP-hard combinatorial optimization problems [3]. Therefore, the scheduling algorithm needs to balance global exploration capability, feasibility improvement speed, and convergence stability, and it must be robust to heterogeneity-induced asymmetry such as performance diversity and price/carbon cost fluctuations [4].
Facing multi-tenant, long-time-series industrial scenarios, scheduling objectives and constraints have expanded from single performance metrics to operational costs such as energy consumption, electricity/carbon price, and equipment lifespan, as well as service quality metrics such as response time, violation rate, load balancing, and resource utilization [5]. However, existing research often employs fragmented rather than unified multi-objective optimization frameworks to address these diverse service quality indicators. Weighted-sum aggregation or Pareto-based ranking typically relies on manually specified preferences and inconsistent measurement scales, which breaks the “scale symmetry” (i.e., cross-scenario dimensional comparability) of evaluation and makes objective comparisons across different price systems and load regimes difficult [6]. At the modeling level, existing meta-heuristic algorithms often face an expanded search space because high-dimensional discrete decisions (mapping/migration/power states) are mixed with continuous DVFS variables, while the symmetry redundancy in the assignment space caused by partial resource interchangeability is ignored. This renders the feasible region sparse and the search more challenging [7].
To address these challenges, this paper proposes a Dual-Improved Whale Optimization Algorithm (DI-WOA) and its supporting modeling framework for the cloud server computing resource scheduling problem. The DI-WOA is not merely a single variant for iterative performance enhancement of the WOA, but rather an integrated dual-rollback optimization system comprehensively equipped with the ability to perceive and handle symmetric redundancy and heterogeneous asymmetry.
To make the symmetry/asymmetry notions explicit, we summarize the main symmetry types encountered in the studied cloud scheduling model and the corresponding mitigation mechanisms in the DI-WOA. In particular, the DI-WOA targets permutation symmetry induced by interchangeable homogeneous hosts, scale symmetry caused by fragmented multi-objective aggregation with inconsistent units, and price symmetry plateaus that arise when congestion is not priced. Meanwhile, the algorithm exploits heterogeneity-induced asymmetry (e.g., non-uniform energy/performance curves and time-varying price signals) through dual-guided decoding and closed-form DVFS decisions. Various concepts related to symmetry are presented in Table 1.
The main contributions of the DI-WOA are as follows:
Within the objective-function-based modeling framework, a unified monetization mechanism is adopted to convert heterogeneous cost components—such as energy consumption, electricity/carbon cost, task delay, and migration/boot–shutdown overheads—into the same “bill cost” dimension, yielding a scale-consistent (dimensionally comparable) single-objective evaluation across scenarios.
Within the decision-variable-based modeling framework, a discrete–continuous divide-and-conquer modeling is employed; the population search is conducted only in a real-valued vector space mapped from a compact set of discrete genes (task priority, preferred host, and host–time-slot core activation ratio), while continuous frequency variables are solved analytically by a closed-form frequency step.
Within the constraint-handling and iterative fitness evaluation modeling framework, by combining hard/soft constraint separation and an enhanced Lagrangian dual-rollback mechanism, adaptive congestion pricing is introduced to gradually assign asymmetric prices to hosts with varying congestion levels. This breaks the originally redundant, cost-equivalent allocation schemes arising from price symmetry, forming an empirical convergence trajectory characterized by “gradually increasing feasibility rate and monotonically decreasing objective value,” thereby further improving search efficiency.
In the encoding and decoding mechanism, the three-layer structure of “discrete genes–real-value encoding–decoder” establishes a mapping channel from the high-dimensional discrete search space to a low-dimensional subspace with physical meaning. The DI-WOA’s search is thereby transferred to a subspace that is lower-dimensional, more continuous, and richer in physical semantics. As the dimensionality of the search space is reduced, the number of symmetric redundant solutions decreases, achieving implicit symmetry-based reduction of redundant equivalent scheduling solutions.
Subsequent sections will systematically present the formal definition, implementation details, and experimental results on scalability, robustness, and generalizability of this scheduling model and the DI-WOA, with comparative validation against typical meta-heuristic algorithms.
2. Related Work
Existing research on cloud server scheduling can be categorized into three types: rule/heuristic-based methods, population intelligence and evolutionary computation-based meta-heuristic methods, and multi-strategy integrated hybrid and hyper-heuristic methods. Rule-based methods (e.g., Shortest Job First, Minimum Completion Time, and Minimum Load) are simple to implement with low computational overhead, but struggle to simultaneously balance energy consumption, cost, and SLA under strong constraints and large-scale scenarios [8]. Meta-heuristic methods (PSO, GA, ABC, Firefly Algorithm, etc.) improve solution quality and scalability through global search [9]; further, multi-objective meta-heuristic frameworks incorporate metrics like energy consumption, resource utilization, and SLA violation for joint optimization, enriching the cloud scheduling algorithm system [10]. From a symmetry/asymmetry perspective, many scheduling and placement formulations exhibit permutation symmetry when multiple machines/hosts are identical or interchangeable: renumbering such resources produces alternative schedules that are equivalent in feasibility and objective value. This symmetry enlarges the effective search space and may lead optimization methods to spend evaluations on redundant solutions. In the broader scheduling and combinatorial optimization literature, symmetry breaking has been widely studied to prune symmetric alternatives and accelerate solving, while recent discussions in optimization highlight that recognizing and leveraging symmetry can reduce redundancy and improve convergence behavior. In cloud environments, this issue coexists with heterogeneity-induced asymmetry (e.g., nonuniform performance/energy curves and dynamic price/carbon signals), which makes symmetry-aware yet asymmetry-adaptive algorithm design particularly relevant to practical schedulers.
In terms of virtual machine placement and resource configuration, a large number of studies have regarded energy consumption as one of the core optimization objectives of the scheduling problem. Some studies model VM scheduling as an energy minimization problem, aiming to reduce overall data center power consumption under performance constraints by controlling server power-on/off states, consolidating low-load VMs, and reducing migration overhead [11]. Other research proposes energy-efficiency-aware VM placement frameworks, constructing scheduling strategies based on server energy efficiency curves and utilization intervals, prioritizing high-efficiency operating regions during scheduling to lower energy consumption without significantly worsening task completion time [12]. Furthermore, multi-objective VM placement methods consider SLA violation rate and load balance alongside energy metrics, employing multi-objective population algorithms or techniques like Quantum Particle Swarm Optimization to construct joint objectives, achieving trade-offs among energy consumption, service quality, and resource utilization [13]. In addition, the heat-aware VM placement model reveals the comprehensive impact of server layout and load allocation on energy consumption and hardware reliability by introducing factors such as temperature distribution and cooling costs, and emphasizes the importance of explicitly considering thermal constraints in energy efficiency optimization [14].
For cloud workflows and batch task scheduling with dependencies, researchers generally regard them as multi-objective optimization problems, focusing on the trade-off between execution time and energy consumption or cost. Some works construct multi-objective models for workflows, take total energy consumption and makespan as the main indicators, and use multi-objective evolutionary algorithms to search for a set of non-dominated scheduling schemes to meet deadline and resource constraints [15]. Other research, from a cost–time trade-off perspective, integrates factors like task execution cost, completion time, and SLA violation into workflow scheduling models, constructing near-optimal scheduling strategies by combining heuristics with evolutionary algorithms [16]. Based on this, some methods further introduce various biological heuristic operators and local search mechanisms to improve the convergence speed and solution set diversity of multi-objective workflow scheduling, demonstrating good adaptability in large-scale, highly coupled cloud workflow scenarios [17].
The Whale Optimization Algorithm (WOA), due to its simple structure, few parameters, and ease of integration with specific problem encoding, has found wide application in cloud task scheduling and VM placement. This algorithm simulates the bubble-net feeding behavior of humpback whales, switching between the shrinking encircling mechanism and the spiral position update to balance global exploration and local exploitation [18]. Scheduling research based on this algorithm has proposed various task–VM mapping frameworks, constructing multi-objective models primarily focusing on completion time and execution cost to obtain superior task scheduling schemes in large-scale cloud environments [19]. Subsequent work has proposed various enhancement strategies around algorithm initialization, search operators, and control parameter design, such as improving position update methods and introducing adaptive weights in edge computing scenarios to enhance convergence speed and the ability to escape local optima under complex constraints [20]. Additionally, researchers have constructed different improved Whale Optimization Algorithms for cloud task scheduling, integrating local search, mutation operators, and task-feature-guided mechanisms into the basic framework, validating their advantages in metrics like makespan and cost under various task scales and resource configurations [21]. Review studies have systematically summarized the application of the WOA and its variants in cloud task scheduling, load balancing, and workflow optimization, noting that while such methods possess advantages in solution quality and scalability, there is still room for improvement in aspects like objective modeling unity, complex constraint handling, and parameter adaptation [22].
Besides Whale Optimization Algorithms and their variants, hybrid meta-heuristic and hyper-heuristic methods for cloud task scheduling have also made significant progress in recent years. Some biogeography-based optimization algorithms incorporate migration and mutation mechanisms, combined with the TOPSIS method to calculate distances between nodes and ideal solutions, balancing execution time, energy consumption, and cost in multi-objective optimization for MEC task offloading [23]. Some cutting-edge methods combine the HS-HHO hybrid algorithm with task clustering strategies, effectively balancing latency and energy consumption for task offloading in edge-cloud collaborative scenarios, thereby improving system energy efficiency and convergence precision [24]. Some researchers have adopted enhanced multiverse optimization algorithms in task scheduling. By introducing fitness reordering, neighborhood search, and multi-policy perturbation mechanisms in the search process, they have taken into account task completion time, cost, and resource utilization, and achieved better performance than traditional meta-heuristics on various benchmark datasets [25]. Some works have also introduced elite learning and multi-objective modeling mechanisms on the basis of the Harris Hawks Optimization (HHO) algorithm, so that the scheduling process can achieve a more balanced trade-off between multiple indicators such as load balancing, makespan, and scheduling length [26]. For green data center scenarios, some methods use multi-objective evolutionary algorithms such as fuzzy NSGA-II to closely combine DVFS technology with task scheduling models, and construct non-dominated solution sets between energy consumption, execution time, and resource utilization, so as to better meet the dual requirements of energy saving and performance [27].
Overall, existing studies have yielded rich results in cloud task scheduling and VM placement, but three common shortcomings persist:
At the objective function level, most work employs weighted sums or Pareto ranking, relying on artificial settings for evaluation scales, making objective comparison across different price systems and load scenarios difficult;
At the modeling level, task mapping, DVFS, and power on/off/migration decisions are often mixed within a high-dimensional discrete–continuous space, making the feasible region sparse and search difficulty high;
At the constraint handling level, static penalties or strict hard constraints can weaken exploration capability, leading to slow feasibility rate improvement or unstable convergence.
Based on these common shortcomings of heuristic and hybrid optimization algorithms for solving cloud task scheduling problems, this paper selects the Whale Optimization Algorithm as the core skeleton of the overall optimization framework. The WOA holds the following advantages over other heuristic algorithms for cloud task scheduling problems:
The WOA has few hyperparameters and a lightweight structure among comparable meta-heuristic algorithms. Its algorithmic complexity is far lower than that of hybrid meta-heuristics, and its simple parameterization makes it easy to combine with the improvement mechanisms introduced later;
The WOA possesses a dual search mechanism, capable of balancing global exploration and local exploitation, making it easier to escape local optima compared to many early classical meta-heuristics, while achieving convergence ability comparable to complex hybrid algorithms with the simplest algorithmic structure;
The WOA’s iterative mechanism involves fewer complex operators compared to other meta-heuristics and hybrid heuristics, placing lower demands on computational resources required for iterative solving;
Numerous hybrid algorithms and improved variants of the WOA continue to emerge, all demonstrating steady progress in cloud scheduling problems, indicating stronger extensibility for future research compared to many hybrid heuristic algorithms that have already incorporated multiple algorithmic mechanisms.
Therefore, based on the common shortcomings of existing research and the strengths of the WOA, this paper constructs a single-objective model with a unified monetized bill cost at its core, employing discrete–continuous divide-and-conquer and an enhanced Lagrange dual-rollback mechanism to gradually improve the feasibility rate while maintaining explorability, thereby enhancing solution stability and cross-scenario comparability under complex constraints.
3. Methodology
3.1. Problem Modeling and Variable Definition
This paper considers a set of discrete time slots, a set of physical hosts, and a set of tasks. Each task has an arrival slot, a deadline slot, a total workload, and a memory requirement. Each host has an upper limit on its number of cores, an upper limit on its memory, and a per-unit-frequency single-core productivity coefficient, and it supports DVFS with the frequency restricted to a permissible range. The model uses a unit slot length and an energy/carbon price for each slot. Each host has an idle power coefficient, a dynamic energy consumption coefficient, and a booting cost. Task overdue and migration/boot–shutdown events are charged at their respective unit prices, and a virtualization loss coefficient approximately represents a linear reduction in effective core count per active VM.
Decision variables include binary indicators and continuous allocations: the host power-on indicator, the activated core number, and the frequency for each host and slot, together with the task occupation indicator and the workload allocation for each task, host, and slot. Additionally, one variable denotes the amount of arrears of each task at its deadline, and another represents the intensity of migration/boot–shutdown events during the period (used for billing, not as a hard constraint).
Under the single-objective framework, a unified “bill cost” is used to measure scheduling quality. Idle and dynamic energy consumption originate from the classic DVFS mechanism: under a fixed allocation, the optimal frequency tends to lie at the productivity boundary, so the dynamic energy can be formulated as a convex term proportional to the allocated workload. Therefore, the total cost objective function, Equation (1), sums the billed components: priced idle and dynamic energy, host startup costs, task tardiness charges, and migration/on-off charges.
The quantities in Equation (1) are as follows: the unit energy/carbon price at each slot; the idle power, dynamic energy coefficient, unit-frequency single-core productivity coefficient, and startup cost of each host; the unit slot length; the frequency, on/off state, and power-on event indicator (a derived quantity) of each host at each slot; the workload allocated to each task on each host and slot; the tardiness unit price of each task and its unfinished workload after its deadline slot; the unit migration/on-off cost of each task; and the intensity of migration/on-off events occurring at each slot. It must be emphasized that the price and cost coefficients stem from price and contract parameters rather than arbitrary tunable weights, rendering the entire optimization genuinely single-objective and avoiding the subjectivity and instability of multi-objective weighting.
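As an illustration of how the monetized components add up, the following minimal sketch evaluates a bill cost from array inputs. All names are hypothetical, the dynamic-energy form (cost per unit of work growing with the squared frequency divided by the productivity coefficient) is an assumption based on the classic DVFS model, and indexing the migration intensity by task is a simplification; none of this is taken from Equation (1) itself.

```python
import numpy as np

def bill_cost(price, idle_power, kappa, rho, slot_len, freq, on, work,
              boot_cost, tardy_price, arrears, mig_price, mig_intensity):
    """Hypothetical bill cost. Shapes: price (T,); idle_power, kappa, rho,
    boot_cost (H,); freq, on (H, T); work (J, H, T); per-task vectors (J,)."""
    # Idle energy is billed whenever a host is powered on during a slot.
    idle = (price[None, :] * idle_power[:, None] * slot_len * on).sum()
    # Assumed dynamic term: energy per unit of work scales with f^2 / rho.
    host_work = work.sum(axis=0)  # total work per host and slot, shape (H, T)
    dynamic = (price[None, :] * kappa[:, None] * freq ** 2
               / rho[:, None] * host_work).sum()
    # Power-on event: the host is on now but was off in the previous slot.
    prev = np.concatenate([np.zeros((on.shape[0], 1)), on[:, :-1]], axis=1)
    boots = np.clip(on - prev, 0, None)
    startup = (boot_cost[:, None] * boots).sum()
    tardiness = (tardy_price * arrears).sum()
    migration = (mig_price * mig_intensity).sum()
    return idle + dynamic + startup + tardiness + migration
```

Because every term is expressed in currency units, no weighting coefficients appear in the sum, which is the point of the unified monetization.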
Figure 1 depicts the overall methodological flow of data and parameters entering the “gene → decode → closed-form frequency → cost and dual update” cyclic chain. Population search operates on the real-valued vectors mapped from discrete decisions, while continuous variables are directly provided by the decoder in a physically interpretable closed form.
3.2. Representation Method and Feasible Decoding
Considering the sensitivity of population-based heuristics to hard constraints, to make it easier for meta-heuristic algorithms to search for feasible solutions, the constraint setup in modeling requires a soft–hard separation mechanism. This paper retains only the minimum necessary hard constraints, which must be strictly satisfied during the algorithm’s iterative solution search, including:
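A task is not replicated within the same time slot, i.e., in any slot it occupies at most one host;
Variable domain legality and DVFS range clipping, ensuring that every variable stays within its domain and every frequency stays within the permissible DVFS range.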
Soft constraints are allowed to be slightly violated during iterative solving but are penalized via the enhanced Lagrangian mechanism in the evaluation phase. Only one key capacity constraint is retained as soft, and the virtualization loss is endogenously folded into the effective core count, significantly reducing the number of constraints. Specifically, the concurrent VM count is the number of VMs active on a host in a given slot, and the effective core number is the activated core number reduced linearly by the virtualization loss for each concurrent VM. The “violation amount” of the capacity soft constraint is then the workload allocated to a host in a slot minus the capacity that its effective cores can deliver in that slot.
A non-positive violation amount indicates that the slot’s capacity is sufficient to cover the allocation; a positive value represents an overload amount that needs to be corrected via rollback in subsequent iterations. Memory capacity constraints are enforced as hard constraints during the decoding phase. The decoder, when selecting a host and time slot for a task, first attempts placement according to the order of “preferred host → candidate hosts with lower dual prices”, but each attempt must pass a memory feasibility check. If the current host has insufficient remaining memory in that time slot, it immediately falls back to the next candidate host or a later time slot, disallowing the generation of memory-exceeding solutions via “temporary occupation”. In contrast, computational capacity constraints are treated softly. Their violation amounts are used to construct the enhanced Lagrangian penalty and update dual prices, thereby guiding task workload migration towards less congested slots in subsequent iterations. Migration and boot–shutdown events are accounted for via cost terms in the bill cost rather than being treated as hard constraints.
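The effective core count and the capacity violation described above can be sketched as follows. The function and parameter names are hypothetical, and the capacity expression (productivity coefficient times effective cores times frequency times slot length) is an assumed form consistent with the definitions in Section 3.1 rather than the paper's exact formula.

```python
def effective_cores(active_cores, n_vms, loss_per_vm):
    """Activated cores minus a linear virtualization loss per active VM (assumed form)."""
    return max(0.0, active_cores - loss_per_vm * n_vms)

def capacity_violation(allocated_work, active_cores, n_vms, loss_per_vm,
                       rho, freq, slot_len):
    """Positive result: overload on this host and slot; non-positive: capacity suffices."""
    capacity = rho * effective_cores(active_cores, n_vms, loss_per_vm) * freq * slot_len
    return allocated_work - capacity
```

For example, with 8 activated cores, 4 concurrent VMs, and a loss of 0.5 cores per VM, 6 effective cores remain; at a productivity coefficient of 2 and a frequency of 2 over a unit slot they deliver 24 units of work, so an allocation of 30 units yields a violation amount of 6.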
The coupling of discrete and continuous variables generates a high-dimensional search space, causing the number of symmetric redundant solutions to increase with the rising dimensionality of the search space and rendering the feasible region sparse. Therefore, this study employs a “discrete–continuous divide-and-conquer modeling” mechanism to decouple discrete and continuous variables mixed within the same high-dimensional space, strictly splitting them into two separate solution paths for processing, thereby achieving preliminary dimensionality reduction in the search space. Discrete variables are subsequently mapped to a subspace for further dimensionality reduction. The three specific types of discrete variables include:
Task priority sequence;
Per-task preferred host vector;
Host–time slot core activation ratio matrix.
Continuous variables (e.g., per-slot frequency) are not iteratively updated within the DI-WOA framework. Their feasible optimal solutions are obtained by the decoder during the evaluation phase through closed-form frequency step solving and are independently clipped on each slot, ensuring physical consistency between energy and productivity, effectively preventing blind algorithm search.
The “discrete–continuous divide-and-conquer modeling” mechanism improves decision-level decoupling and reduces the search space dimensionality, transferring the “algorithmic search framework” for cloud resource scheduling into a lower-dimensional subspace. Consequently, the number of symmetric redundant solutions decreases, playing an important role in symmetry reduction.
Figure 2 illustrates the decoder’s working mechanism. The decoder processes tasks sequentially in priority order, starting with the highest-priority task, and first tries to assign each task to its preferred host at its arrival slot. If the preferred host has insufficient memory at that slot, the decoder falls back to candidate hosts sorted by lower congestion prices. If the capacity of the current slot is less than the capacity required to handle the total task load once the current task is added, the current task is split: part of it is placed until the current slot is filled, and the remaining workload is postponed to subsequent time slots for continued filling. The enhanced Lagrangian dual terms then gradually increase, forming closed-loop guidance that steers the search towards satisfying the soft constraints as far as possible.
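The placement order described above can be illustrated with a simplified greedy decoder. This is a sketch under stated assumptions, not the paper's decoder: the data layout and all names are hypothetical, each slot is filled only up to a fixed nominal capacity (so the soft-capacity overload is not reproduced), and work left unplaced at the deadline is reported as arrears.

```python
def decode(order, host_pref, tasks, mem_cap, slot_cap, dual_price):
    """tasks[j] = (arrival, deadline, work, mem). Returns (alloc, arrears)."""
    n_hosts, n_slots = len(mem_cap), len(slot_cap[0])
    free_mem = [[mem_cap[h]] * n_slots for h in range(n_hosts)]
    free_cap = [row[:] for row in slot_cap]
    alloc, arrears = {}, {}
    for j in order:  # highest-priority task first
        arrival, deadline, remaining, mem = tasks[j]
        for t in range(arrival, min(deadline, n_slots - 1) + 1):
            if remaining <= 0:
                break
            # Preferred host first, then the other hosts by ascending dual price.
            others = sorted((h for h in range(n_hosts) if h != host_pref[j]),
                            key=lambda h: dual_price[h][t])
            for h in [host_pref[j]] + others:
                if free_mem[h][t] < mem or free_cap[h][t] <= 0:
                    continue  # hard memory check failed or slot already full
                part = min(remaining, free_cap[h][t])  # split when the slot fills up
                alloc[(j, h, t)] = part
                free_cap[h][t] -= part
                free_mem[h][t] -= mem
                remaining -= part
                break  # a task occupies at most one host per slot
        arrears[j] = remaining
    return alloc, arrears
```

In this form the dual prices only influence the fallback order; raising the price of a congested host and slot pushes later tasks towards cheaper candidates.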
The closed-form frequency step adopted in this study “pushes continuous variable optimization to the boundary” given the total allocation amount and the effective core number of a host in a slot. This avoids blind search on continuous variables, significantly reduces dimensionality, and enhances convergence stability. The optimal frequency for minimizing energy while satisfying the capacity boundary is obtained in two steps: the ideal optimal frequency is first calculated without upper/lower bound constraints, and the clip function then clips it to the permissible DVFS range. This closed-form solution ensures that, given the allocated workload and effective core count, the frequency is set to a physically feasible optimal value that simultaneously meets processing demands and the constraints imposed by hardware performance limits, thereby eliminating the need for iterative search over the continuous frequency variable.
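A minimal sketch of such a clipped closed-form step is given below. It assumes that the ideal frequency is the lowest one at which the effective cores exactly process the allocated workload within the slot; the function name, the parameter names, and the choice of returning the lower bound for an idle or empty slot are illustrative assumptions.

```python
def closed_form_frequency(total_work, eff_cores, rho, slot_len, f_min, f_max):
    """Clip the assumed ideal frequency to the permissible DVFS range."""
    if eff_cores <= 0 or total_work <= 0:
        return f_min  # nothing to process: stay at the lowest frequency (assumption)
    # Lowest frequency whose capacity rho * eff_cores * f * slot_len covers the work.
    ideal = total_work / (rho * eff_cores * slot_len)
    return min(max(ideal, f_min), f_max)
```

When the ideal value exceeds the upper bound, the clipped frequency cannot cover the allocation, and the shortfall appears as a positive capacity violation handled by the dual mechanism of Section 3.3.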
3.3. Enhanced Lagrangian Dual-Rollback Mechanism
To maintain sufficient explorability of solutions during algorithm iterations, violations of the capacity soft constraints are not prohibited during the decoding phase. Instead, in the evaluation phase after decoding is complete, the constraint violations of the solution are tallied as “violation amounts”. To gradually compress the violation degree while maintaining exploration, this paper incorporates the capacity soft constraints into the enhanced Lagrangian framework, which augments the bill cost with dual-price and penalty terms on the violation amounts.
The population algorithm uses the enhanced Lagrangian value as the fitness evaluation criterion within each generation and continuously reduces it through updates to the individuals. Between generations, the dual prices and the penalty coefficient are updated based on the violation pattern of the current best individual, thereby gradually increasing the “price” of congested slots and the cost of violating them. This paper employs a mirrored gradient ascent with a geometric scaling update rule, applied to the best solution of each generation and its corresponding violation amounts. In this rule, a dual step scale sets the size of the dual-price update, and two further factors control the amplification of the penalty coefficient and the decay of the step size, respectively. Intuitively, slots that are more frequently violated acquire higher dual prices. Consequently, in subsequent decoding and task filling processes, they are considered “expensive resources” due to increased penalty terms, and the decoder tends to migrate tasks towards lower-price regions. As iterations proceed, the increase in the penalty coefficient drastically amplifies any residual violation in the objective, prompting the algorithm to focus in later stages on searching for local optima near feasible solutions, presenting the empirical convergence trajectory of “feasibility rate rising, objective monotonically decreasing.”
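The fitness and the between-generation update can be sketched as follows. This is an illustration under assumptions: the fitness uses a standard augmented Lagrangian form on the positive part of the violations, and the dual step is a plain projected gradient ascent swapped in for the mirrored ascent named above. The names and the default values of gamma and beta are hypothetical.

```python
import numpy as np

def augmented_lagrangian(obj, violation, lam, mu):
    """Bill cost plus dual-price and quadratic penalty terms on positive violations."""
    g_plus = np.maximum(violation, 0.0)
    return obj + float((lam * g_plus).sum()) + 0.5 * mu * float((g_plus ** 2).sum())

def dual_update(lam, mu, eta, violation, gamma=1.5, beta=0.9):
    """One between-generation step driven by the best individual's violations."""
    lam_new = np.maximum(lam + eta * violation, 0.0)  # projected ascent, prices stay >= 0
    mu_new = gamma * mu    # geometric amplification of the penalty coefficient
    eta_new = beta * eta   # geometric decay of the dual step size
    return lam_new, mu_new, eta_new
```

With this form, a host and slot that stay overloaded across generations accumulate a growing dual price, while the rising penalty coefficient makes any residual overload increasingly expensive.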
Figure 3 illustrates the complete process by which discrete genes are decoded into a specific scheduling solution and the λ heatmap is then updated. Discrete genes are input into the decoder, which refers to the current λ heatmap, to generate a concrete scheduling solution. Subsequently, the capacity soft constraint violation amount for each host in that solution is tallied. Finally, based on the tallied violation amounts, the λ heatmap is updated by increasing the dual prices for hosts with larger violation amounts.
Overall, the enhanced Lagrangian dual-rollback mechanism is essentially a progressive constraint-handling mechanism of “exploration–guidance–convergence.” In early iterations, penalty charges for soft constraint violations on each host are low, allowing a small number of tasks to exceed deadlines. The iterative trajectory explores mildly infeasible regions, maintaining algorithm friendliness towards early exploration and avoiding premature convergence to local optima due to overly restrictive constraints on the search space. As the number of iterations increases, the dynamic dual pricing mechanism gradually raises penalty charges for soft constraint violations on congested hosts, avoiding “sudden pressure” and guiding the algorithm with sufficient time to transition from “allowing trial and error” to “satisfying constraints.” The feasibility rate gradually improves, strengthening stability in constraint handling. In later iterations, penalty charges for soft constraint violations on severely congested hosts have risen to extremely high values. Violating capacity soft constraints incurs severe penalties, forcing the iterative trajectory to stably converge towards feasible solutions. At this stage, the constraints exert sufficient restrictiveness and stability.
3.4. Dual-Improved Whale Optimization Algorithm
To reduce search variable dimensionality, diminish symmetric redundant solutions, and alleviate feasible region sparsity, this paper selects the technical chain of “discrete decision abstraction → low-dimensional real-value mapping → physical feasible solution restoration” to achieve DI-WOA search space dimensionality reduction. Based on this technical chain, a three-layer structure of “discrete gene–real-value encoding–decoder” is constructed. The specific roles of each layer are as follows:
The discrete gene layer abstracts the core decision elements of cloud scheduling into physical representations, specifically comprising the task priority ordering (order), the task preferred host vector (host_pref), and the host–slot core activation ratio matrix (core_ratio). This layer retains only the key discrete variables affecting scheduling solution quality, enabling preliminary decoupling and dimensionality reduction in the variable space in conjunction with the discrete–continuous separation mechanism, as well as preliminary implicit symmetry reduction for redundant solutions.
The core function of the real-value encoding layer is to map discrete genes into continuous real-valued vectors via a random key mechanism. On the one hand, a random key vector with one key per task is used to sort tasks; on the other hand, the preferred host index is normalized to the [0,1] interval, and the core ratio matrix is flattened in row-major order, ultimately yielding a single concatenated real-valued vector. This layer further reduces the discrete variable space to a lower-dimensional subspace, facilitating algorithm search within a lower-dimensional continuous search space and achieving further implicit symmetry reduction for redundant solutions.
The decoder layer is the key component for transforming abstract search results into concrete, feasible scheduling solutions. Specifically, it first inversely maps the encoding vector back to discrete decisions via a decoding function; then, combined with closed-form frequency calculation and constraint checking, it generates high-quality feasible scheduling solutions that satisfy hard constraints and are adapted to soft constraints.
Overall, this three-layer structure establishes a mapping channel connecting the high-dimensional discrete search space with a physically meaningful low-dimensional subspace, transferring algorithm search into a low-dimensional and physically meaningful subspace. It reduces redundant symmetric representations caused by interchangeable resources, achieving implicit symmetry reduction for redundant equivalent scheduling solutions. Simultaneously, in conjunction with the discrete–continuous separation mechanism, it achieves preliminary decoupling and dimensionality reduction in the variable space. This holds strong practical value for enhancing the DI-WOA’s convergence stability and ability to search for high-quality feasible solutions.
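The gene-to-vector mapping described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `encode` and `decode`, the bucket-centre normalization of host indices, and the rank-based random keys are assumptions made for illustration, and the sketch only recovers the discrete genes (the closed-form frequency step and constraint checking of the decoder layer are omitted).

```python
from typing import List, Tuple


def encode(order: List[int], host_pref: List[int],
           core_ratio: List[List[float]], num_hosts: int) -> List[float]:
    """Map discrete genes to a real vector in [0, 1] of length 2J + H*T."""
    J = len(order)
    keys = [0.0] * J
    # The task at rank r gets key (r + 0.5) / J, so sorting keys recovers the order.
    for rank, task in enumerate(order):
        keys[task] = (rank + 0.5) / J
    # Host index h is mapped to the centre of its bucket in [0, 1] (assumed rule).
    hosts = [(h + 0.5) / num_hosts for h in host_pref]
    # Row-major flattening of the H x T core activation ratio matrix.
    flat = [r for row in core_ratio for r in row]
    return keys + hosts + flat


def decode(x: List[float], J: int, H: int, T: int
           ) -> Tuple[List[int], List[int], List[List[float]]]:
    """Recover (order, host_pref, core_ratio) from an encoded vector."""
    keys, hosts, flat = x[:J], x[J:2 * J], x[2 * J:2 * J + H * T]
    # Random-key decoding: tasks are scheduled in increasing key order.
    order = sorted(range(J), key=lambda j: keys[j])
    # Map each value in [0, 1] back to a host index in {0, ..., H-1}.
    host_pref = [min(int(v * H), H - 1) for v in hosts]
    # Undo the row-major flattening.
    core_ratio = [flat[h * T:(h + 1) * T] for h in range(H)]
    return order, host_pref, core_ratio
```

Because any real vector in $[0,1]^{D}$ decodes to a valid permutation and valid host indices, the continuous search never has to repair the discrete genes themselves; only the resulting schedule needs feasibility handling.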
The DI-WOA search is executed in the real-valued vector space through iterative updates of the Whale Optimization Algorithm. Each fitness evaluation undergoes the complete chain: “real-valued vector → discrete genes → scheduling decoding → closed-form frequency and enhanced Lagrangian calculation”. Specifically, an initial instance is first constructed using a random problem generator. Based on this, a heuristic initial gene is generated and encoded via an encoding function to obtain a relatively good starting point. The remaining individuals are then initialized uniformly at random, forming an initial population possessing both diversity and heuristic characteristics. Subsequently, the main loop begins. In generation $t$, for each individual $x_i^{(t)}$, truncation is applied to ensure that every component lies within $[0,1]$. Then, the decoding function recovers the discrete genes and the complete schedule, calculating OBJ, various cost items, violation degrees, and the enhanced Lagrangian value, thereby selecting the best individual of that generation and its corresponding violation pattern.
In the position update phase, the DI-WOA employs the I-WOA’s two modes, “shrinking encircling prey” and “spiral updating”, and introduces a low-probability Gaussian perturbation on this basis to enhance diversity. Let the global best position vector in generation $t$ be $x_{\mathrm{best}}^{(t)}$, and the current individual position be $x_i^{(t)}$. With the maximum iteration number $G$, the linearly decreasing coefficient is:
$$a_t = 2\left(1 - \frac{t}{G}\right).$$
Then, generate random numbers $r_1, r_2, p \in [0,1]$ and construct:
$$A = 2 a_t r_1 - a_t, \qquad C = 2 r_2.$$
When $p < 0.5$ and $|A| < 1$, execute shrinking encircling around the current global best solution:
$$x_i^{(t+1)} = x_{\mathrm{best}}^{(t)} - A \left| C\, x_{\mathrm{best}}^{(t)} - x_i^{(t)} \right|.$$
When $p < 0.5$ and $|A| \ge 1$, explore around a random individual. When $p \ge 0.5$, execute spiral updating towards the best solution:
$$x_i^{(t+1)} = \left| x_{\mathrm{best}}^{(t)} - x_i^{(t)} \right| e^{bl} \cos(2\pi l) + x_{\mathrm{best}}^{(t)},$$
where the absolute value is taken element-wise, $b$ is the spiral coefficient, and $l$ takes a value in $[-1,1]$. To prevent premature population contraction into a narrow region, the DI-WOA also superimposes a zero-mean Gaussian perturbation with a certain probability after updating and clips the result again to the $[0,1]$ interval. Through this design, a loose coupling relationship is formed between the search in the real-valued vector space and the underlying discrete scheduling structure: the WOA is responsible for balancing exploration and exploitation in the encoded space, while the decoder and the enhanced Lagrangian framework ensure the physical reasonableness and feasibility of solutions.
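The update rules can be sketched for a single individual as follows. This is a minimal illustration under stated assumptions, not the paper's code: it uses the standard WOA formulas with scalar coefficients, and the function name `woa_update` together with the default spiral coefficient `b`, perturbation probability `p_mut`, and perturbation scale `sigma` are hypothetical values, since the text only states that the perturbation is applied "with a certain probability".

```python
import math
import random
from typing import List, Optional


def woa_update(x: List[float], best: List[float], peer: List[float],
               t: int, max_iter: int, b: float = 1.0,
               p_mut: float = 0.05, sigma: float = 0.1,
               rng: Optional[random.Random] = None) -> List[float]:
    """One WOA-style position update in the encoded space [0, 1]^D."""
    rng = rng or random.Random()
    a = 2.0 * (1.0 - t / max_iter)       # linearly decreasing coefficient
    A = 2.0 * a * rng.random() - a       # step coefficient in [-a, a]
    C = 2.0 * rng.random()               # target weighting in [0, 2]
    p = rng.random()                     # mode selector
    if p < 0.5:
        # |A| < 1: shrink towards the best; otherwise explore around a random peer.
        target = best if abs(A) < 1.0 else peer
        new = [td - A * abs(C * td - xd) for td, xd in zip(target, x)]
    else:
        # Spiral update towards the best solution.
        l = rng.uniform(-1.0, 1.0)
        spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
        new = [abs(bd - xd) * spiral + bd for bd, xd in zip(best, x)]
    # Low-probability zero-mean Gaussian perturbation (assumed parameters).
    if rng.random() < p_mut:
        new = [v + rng.gauss(0.0, sigma) for v in new]
    # Clip back to the encoding interval.
    return [min(1.0, max(0.0, v)) for v in new]
```

In a full loop, `peer` would be a randomly chosen member of the current population, and the returned vector would be passed to the decoder for fitness evaluation.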
Figure 4 illustrates the complete process: the three types of continuous vectors obtained from a DI-WOA iteration are fed into the decoder, the capacity constraint violations of each host are tallied to update the dual prices, and the updated prices guide the search direction of the next DI-WOA iteration. Specifically, the decoding function first inversely maps the three continuous real-valued vectors into the three types of discrete genes. These discrete genes are then further decoded to obtain a concrete scheduling solution and the capacity soft-constraint violation amount of each host. Based on the tallied violation amounts, the dual prices of hosts with larger violations are increased. The updated dual prices of all hosts are then aggregated and sorted, and they update the parameters fed into the enhanced Lagrangian for the next iteration, thereby guiding the search direction of the DI-WOA.
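The price feedback loop in Figure 4 can be sketched as below. The exact update rule is not restated in this section, so the sketch assumes a plain dual-ascent step (price plus step size times violation) and one common augmented Lagrangian form; the function names `update_dual_prices` and `augmented_cost`, the parameters `step`, `cap`, and `rho`, and the per-host granularity are all illustrative assumptions rather than the paper's definitions.

```python
from typing import List


def update_dual_prices(prices: List[float], violations: List[float],
                       step: float = 0.5, cap: float = 1e6) -> List[float]:
    """Raise the price of each host in proportion to its capacity violation.

    Hosts without violation keep their price; prices stay in [0, cap].
    """
    return [min(cap, max(0.0, lam + step * max(0.0, v)))
            for lam, v in zip(prices, violations)]


def augmented_cost(obj: float, prices: List[float],
                   violations: List[float], rho: float) -> float:
    """Objective plus linear (priced) and quadratic violation penalties."""
    pos = [max(0.0, v) for v in violations]
    linear = sum(lam * v for lam, v in zip(prices, pos))
    quadratic = 0.5 * rho * sum(v * v for v in pos)
    return obj + linear + quadratic
```

For example, with prices `[1.0, 1.0]` and violations `[0.0, 2.0]`, only the second host becomes more expensive, so candidate schedules that keep loading it are ranked lower in the next generation.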
3.5. Complexity
We analyze the computational complexity of the DI-WOA by decomposing one fitness evaluation into its major modules and then aggregating the per-iteration costs. Let J be the number of tasks, H the number of physical hosts, T the number of discrete time slots, P the population size, and G the number of generations. The DI-WOA performs the population search in a real-valued encoded space whose dimension is
$D = J$ (random keys for task ordering) $+\ J$ (preferred host vector) $+\ HT$ (flattened core activation ratio), i.e., $D = 2J + HT$.
Per-solution (fitness) evaluation. For a candidate vector x, one evaluation consists of:
(1) Gene recovery and task ordering. Recovering the discrete genes from x includes sorting the J random keys to obtain a task order, which costs $O(J \log J)$, plus linear-time mappings for the preferred host vector and the core ratio grid, i.e., $O(J + HT)$.
(2) Feasible decoding on the host–time grid. The decoder schedules tasks sequentially. In the worst case, each task may test multiple host–slot candidates across the grid (with constant-time feasibility checks such as memory checks and capacity bookkeeping). Therefore, the decoding cost is $O(JHT)$ in the worst case.
(3) Closed-form frequency step. Given the decoded allocation and effective core counts, the DVFS frequency is computed with a closed-form expression and clipping for each host–slot cell, which costs $O(HT)$.
(4) Cost and violation statistics. Bill cost components (idle/dynamic energy, boot–shutdown, migration, and tardiness) are aggregated and the soft-constraint violation map is computed over the grid, which costs $O(HT)$ once the decoded schedule is available.
Combining the above, the time complexity of one fitness evaluation can be expressed as
$$O(J \log J + JHT + HT) = O(J \log J + JHT).$$
In typical medium-scale settings, the decoding term $O(JHT)$ dominates.
Per-generation complexity. Each generation evaluates P individuals and updates their positions in the D-dimensional encoded space. The WOA position update (including the optional low-probability perturbation) is composed of element-wise vector operations and random number generation, costing $O(PD)$ per generation. In addition, the DI-WOA updates the dual variables (e.g., the congestion price heatmap and the penalty coefficient) once per generation based on the best individual’s violation map, which costs $O(HT)$. Therefore, the per-generation complexity of the DI-WOA is
$$O\big(P\,(J \log J + JHT) + PD + HT\big).$$
Total complexity. Over G generations, the total time complexity is
$$O\big(G\,[\,P\,(J \log J + JHT) + PD + HT\,]\big).$$
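As a rough numerical illustration of these terms, the following sketch counts unit operations for given problem sizes. It assumes an encoded dimension of $D = 2J + HT$, treats every asymptotic term as having a constant factor of 1, and uses hypothetical function names and instance sizes; it is an order-of-magnitude aid, not a runtime model.

```python
import math
from typing import Dict


def encoded_dim(J: int, H: int, T: int) -> int:
    """Encoded dimension under the assumption D = 2J + H*T."""
    return 2 * J + H * T


def eval_ops(J: int, H: int, T: int) -> Dict[str, float]:
    """Unit-operation counts of one fitness evaluation (constants set to 1)."""
    return {
        "sort": J * math.log2(J),        # random-key sorting
        "decode": J * H * T,             # worst-case decoding on the grid
        "freq_and_stats": 2 * H * T,     # frequency step plus cost statistics
    }


def decoding_share(J: int, H: int, T: int) -> float:
    """Fraction of one evaluation spent in the decoding term."""
    ops = eval_ops(J, H, T)
    return ops["decode"] / sum(ops.values())


def total_ops(J: int, H: int, T: int, P: int, G: int) -> float:
    """Total count: G * (P * evaluation + P * D position update + H*T dual update)."""
    per_eval = sum(eval_ops(J, H, T).values())
    per_gen = P * per_eval + P * encoded_dim(J, H, T) + H * T
    return G * per_gen
```

For a hypothetical instance with J = 50, H = 4, and T = 24, `encoded_dim` returns 196 and the decoding term accounts for more than 90% of one evaluation under these unit-cost assumptions, which is consistent with the statement that decoding dominates.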
We also compare the DI-WOA with other meta-heuristics. In our experiments, all compared algorithms share the same encoding–decoding and fitness-evaluation pipeline; hence, they share the same dominant evaluation cost $O(J \log J + JHT)$ per candidate. Their overall complexities differ mainly in (i) the number of fitness evaluations per iteration and (ii) the per-iteration position-update overhead in the D-dimensional space. Most population-based methods, such as the WOA/HHO/SMA, evaluate approximately $P$ candidates per iteration and perform $O(PD)$ updates, yielding $O\big(P\,(J \log J + JHT) + PD\big)$ per iteration. In contrast, ABC-type methods (including IABC) commonly perform both employed-bee and onlooker-bee phases, which can result in roughly $2P$ fitness evaluations per cycle (implementation-dependent), leading to a higher evaluation-dominated cost. The DI-WOA introduces an extra dual-update step per generation, but this overhead is typically negligible compared with the dominant decoding cost.
5. Conclusions and Future Work
This paper addresses the cost–efficiency trade-off and complex constraint-solving challenges in cloud server computing resource scheduling, proposing a unified monetized single-objective modeling and DI-WOA solution framework. This framework achieves dimensionality reduction via discrete–continuous divide-and-conquer and a closed-form frequency step; simultaneously, the three-layer structure of “discrete genes–real-valued encoding–decoder” better exploits the modeling structure and reduces search variable dimensionality, improving the algorithm’s ability to search for elite feasible solutions.
From a symmetry/asymmetry viewpoint, the unified monetization mechanism provides a scale-consistent objective for cross-scenario comparison, the encoding–decoding structure implicitly reduces redundant symmetric representations induced by interchangeable resources, and the enhanced Lagrangian dual-rollback mechanism introduces beneficial asymmetry via adaptive congestion prices to guide feasibility restoration under soft constraints. The experimental results show that under baseline settings, the DI-WOA reduces the final OBJ by 8.33% compared to the baseline WOA; in multi-load generalization experiments, the DI-WOA’s mean OBJ is the best in all scenarios for H = 3/4, leading the second-best algorithm by up to 13.85%, verifying the effectiveness and stability of the proposed method under the unified bill cost objective. For large-scale H scenarios, hierarchical and symmetry-aware encoding (e.g., canonicalization over homogeneous host groups) and adaptive iteration budget/population size strategies can be utilized to control the search difficulties arising from the high-dimensional discrete space.
While the DI-WOA demonstrates stable performance on static instances and within limited-scale settings, it is imperative to acknowledge the inherent limitations of this framework. These limitations stem from the unpredictability and variability of real-world application scenarios, representing common shortcomings of heuristic algorithms. Specific limitations include:
(1) The unified monetization mechanism necessitates converting physical quantities of different dimensions into a single “bill cost,” which heavily relies on scenario-specific “conversion rates” and lacks universality for arbitrary contexts;
(2) The objective function modeling assumes static/known pricing signals (e.g., energy/carbon prices), making it difficult to adapt to scenarios with time-varying prices;
(3) The modeling presupposes offline instances with full prior knowledge of all task parameters (e.g., arrival times, deadlines, and workloads), rendering it unsuitable for volatile scenarios involving dynamically arriving or cancelled tasks.
Building upon the aforementioned three shortcomings, future work will focus on developing a general-purpose monetization calibration mechanism adaptable to the cost characteristics of diverse scenarios. Furthermore, it will explore extending offline scheduling to online horizons by integrating techniques from the reinforcement learning domain and neural networks for stochastic price prediction and dynamic task handling.