Article

An Evolutionary Algorithm for Multi-Objective Workflow Scheduling with Adaptive Dynamic Grouping

by Guochen Zhang, Aolong Zhang, Chaoli Sun * and Qing Ye
College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(13), 2586; https://doi.org/10.3390/electronics14132586
Submission received: 7 May 2025 / Revised: 19 June 2025 / Accepted: 23 June 2025 / Published: 26 June 2025
(This article belongs to the Section Computer Science & Engineering)

Abstract

For workflow scheduling with complex dependencies in cloud computing environments, existing research predominantly focuses on multi-objective algorithm optimization while neglecting the critical factor of workflow topological structure. The proposed Adaptive Dynamic Grouping (ADG) strategy breaks through this limitation via two innovative mechanisms: first, a dynamic variable grouping model based on task dependencies that effectively compresses the decision space and reduces global search overhead; second, an adaptive resource allocation strategy that dynamically distributes execution opportunities according to each variable group's contribution to optimization, accelerating convergence toward the Pareto frontier. Experimental results on five real-world workflows and virtual machines from three major cloud providers demonstrate ADG's superior performance in simultaneously optimizing execution time, cost, and energy consumption, providing an efficient solution for cloud-based workflow scheduling.

1. Introduction

Cloud computing, as an integration of parallel computing, distributed computing, and other advanced technologies, utilizes internet-based virtualization to deliver elastic computing resources, massive data storage, and efficient processing capabilities [1]. This paradigm enables ubiquitous access to computing resources through on-demand services [2], significantly propelling advancements in data analytics, Internet of Things (IoT), and artificial intelligence (AI) applications. Nevertheless, the escalating complexity of computational tasks on cloud platforms has elevated workflow scheduling to a pivotal research and practical challenge. The optimal scheduling of workflow tasks has emerged as a fundamental requirement for maximizing cloud computing system performance.
A typical workflow comprises interdependent computing tasks with complex topological dependencies. The scheduling challenge extends beyond mere execution time optimization to encompass multiple objectives including operational costs and energy efficiency [3]. Current scheduling methodologies predominantly employ heuristic optimization algorithms to derive approximate solutions [4]. However, these algorithms suffer from inherent limitations in generalization and global search capabilities due to their problem-specific design constraints [4]. While multi-objective scheduling algorithms address multiple constraints, their “black-box” approach fails to leverage the inherent task dependencies and structural knowledge embedded within workflows [5]. This fundamental limitation leads to suboptimal search efficiency, particularly when handling problems with high-dimensional decision spaces [5].
While existing decomposition methods effectively partition large-scale optimization problems, they often fail to fully utilize critical inter-task correlation information. This limitation becomes particularly apparent when handling complex task interdependencies, frequently leading to premature convergence to local optima. To address these challenges, this paper introduces an Adaptive Dynamic Grouping (ADG) algorithm for multi-objective workflow scheduling that innovatively incorporates workflow structural knowledge. The proposed solution features three key components: an intelligent grouping strategy that organizes decision variables into functionally cohesive units based on task parallelism and dependencies, a localized optimization approach that confines perturbations to task-specific operations to reduce computational overhead, and an adaptive resource allocation mechanism that dynamically prioritizes high-value variable groups while replacing underperforming ones to accelerate convergence. Comprehensive experiments were conducted across 20 real-world workflows and 12 VM types from major cloud providers (Amazon EC2, Alibaba Cloud, Microsoft Azure). The results demonstrate ADG’s superior performance in simultaneously optimizing execution time, cost, and energy consumption compared to existing scheduling algorithms.

2. Problem Description

This paper formulates the workflow scheduling problem in cloud computing as a multi-objective optimization problem, focusing on optimizing execution time, cost, and energy consumption while fully accounting for core components such as the workflow's topological structure during scheduling.

2.1. Cloud Computing Workflow Scheduling Model

In cloud computing services, all heterogeneous virtual machines (VMs) are represented by the set $V = \{V_1, V_2, \ldots, V_m\}$, where $V_i$ denotes the $i$-th virtual machine. Each VM is characterized by four basic attributes, $\{Mips, Cpus, Percost, Bandwidth\}$, where $Mips$ and $Cpus$ represent the computing power and the number of CPUs of the VM, respectively; $Percost$ indicates the rental cost per unit time of the VM; and $Bandwidth$ refers to the data transmission rate of the VM. The computing performance of VMs is heterogeneous, meaning that the execution time of the same task may vary across different VMs, and the number of tasks that each VM can handle concurrently also differs. Additionally, a task can only be assigned to one VM at a time [6]. An example of the workflow scheduling process on a cloud platform is shown in Figure 1. The scheduling center randomly distributes 10 cloud tasks uploaded by the user to four virtual machines. After all VMs complete their tasks, the execution results are obtained. Among them, virtual machine $VM_2$ has the longest completion time, so the Makespan of this task scheduling is 10.
A workflow is typically composed of a set of tasks interconnected by constraint dependencies and communication priority relationships, with its task topology usually represented by a directed acyclic graph (DAG) denoted as $G = (T, E)$. Here, $T$ is the set of nodes in the DAG, where each node $t_i \in T$ represents a task ($i$ being the task number, with a total of $n$ tasks), and $E$ represents the set of constraints between tasks. Tasks with dependency relationships are connected by directed edges; for example, $e_{i,j}$ signifies the constraint between task $t_i$ and task $t_j$, meaning task $t_j$ can only execute after receiving data transmitted from task $t_i$. The set of predecessor tasks of node $t_j$ is denoted as $P(t_j)$ (nodes with directed edges pointing to $t_j$), and the set of successor tasks of node $t_j$ is denoted as $S(t_j)$ (nodes pointed to by directed edges from $t_j$). When a task has multiple predecessors, it can only execute after all predecessor tasks are completed and all communication requirements for the task are satisfied. Specifically, a task with no predecessor nodes is called a workflow ingress task, while a task with no successor nodes is called a workflow egress task. In the example shown in Figure 2, when $n = 9$, the predecessor task set of task $t_6$ is $P(t_6) = \{t_1, t_2, t_3\}$, the successor task set of task $t_2$ is $S(t_2) = \{t_6, t_7\}$, the workflow ingress task is $t_1$, and the workflow egress task is $t_9$.
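As a small illustration, the DAG model above can be captured with plain adjacency sets. The edge list below is hypothetical, chosen only to reproduce the Figure 2 examples cited in the text ($P(t_6) = \{t_1, t_2, t_3\}$, $S(t_2) = \{t_6, t_7\}$, entry $t_1$, exit $t_9$); it is not the paper's exact graph.

```python
from collections import defaultdict

# Hypothetical edge list consistent with the Figure 2 examples in the text.
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 6), (2, 7),
         (3, 6), (4, 8), (5, 8), (6, 9), (7, 9), (8, 9)]

pred = defaultdict(set)  # P(t_j): predecessors of each task
succ = defaultdict(set)  # S(t_i): successors of each task
for i, j in edges:
    pred[j].add(i)
    succ[i].add(j)

tasks = sorted(set(pred) | set(succ))
entry = [t for t in tasks if not pred[t]]  # ingress tasks (no predecessors)
exit_ = [t for t in tasks if not succ[t]]  # egress tasks (no successors)

print(sorted(pred[6]))  # [1, 2, 3]
print(sorted(succ[2]))  # [6, 7]
print(entry, exit_)     # [1] [9]
```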

2.2. Optimization Goals

2.2.1. The Calculation of Makespan

In workflow scheduling, the execution time of task $t_i$ on virtual machine $v_j$ is computed as follows:

$$CT(t_i, v_j) = \frac{TL(t_i)}{Mips(v_j)}$$

where $TL(t_i)$ denotes the computational workload of task $t_i$ (measured in instructions), and $Mips(v_j)$ represents the processing capacity of $v_j$'s CPU (in millions of instructions per second). The start time of task $t_i$ is determined by the completion of all its predecessor tasks:

$$ST(t_i, v_j) = \max_{t_k \in pre(t_i)} \{TTF(e_{ik})\}$$

where $TTF(e_{ik})$ is the total data transfer time from predecessor task $t_k$ to $t_i$. The finish time of task $t_i$ is the sum of its start and execution times:

$$FT(t_i, v_j) = ST(t_i, v_j) + CT(t_i, v_j)$$

When tasks $t_i$ and $t_k$ reside on different VMs, communication overhead must be considered:

$$TT_{ik}(e_{ik}) = \begin{cases} 0, & \text{if } v(t_i) = v(t_k) \\ \dfrac{m_{ik}}{\max\{bd(v(t_i)), bd(v(t_k))\}}, & \text{otherwise} \end{cases}$$

Here, $m_{ik}$ denotes the data volume transferred between the tasks, and $bd(v(t_i))$ and $bd(v(t_k))$ represent the network bandwidths of their respective VMs. The communication time is zero if the tasks are collocated; otherwise, it is determined by the data volume and the maximum available bandwidth. The end-to-end communication delay between dependent tasks is as follows:

$$TTF(e_{ik}) = TT_{ik}(e_{ik}) + FT(t_k)$$

where $FT(t_k)$ indicates the execution completion time of $t_k$.

For scheduling algorithms, especially in the context of cloud computing, the maximum completion time (Makespan) is the most fundamental evaluation metric. It is defined as the total duration elapsed from the start of the first task in a workflow to the completion of the last task, serving as a critical indicator of a scheduling algorithm's time efficiency. Mathematically, it is expressed as follows:

$$Makespan = \max_{t_i \in T} \{FT(t_i)\}$$
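To make the recursion concrete, here is a minimal Python sketch (not the authors' implementation) of Equations (1)-(6): finish times are propagated through the DAG in topological order. For simplicity it ignores queueing when several tasks share a VM; `order`, `TL`, `mips`, `bw`, `data`, and the assignment `x` are assumed inputs.

```python
def makespan(order, pred, x, TL, mips, bw, data):
    """order: topologically sorted task ids; x[i]: VM index assigned to task i."""
    FT = {}
    for i in order:
        ct = TL[i] / mips[x[i]]                   # Eq. (1): execution time CT
        st = 0.0
        for k in pred[i]:                         # Eq. (2): earliest start time ST
            if x[i] == x[k]:
                tt = 0.0                          # Eq. (4): collocated, no transfer
            else:
                tt = data[(k, i)] / max(bw[x[i]], bw[x[k]])
            st = max(st, FT[k] + tt)              # Eq. (5): TTF = TT + FT(t_k)
        FT[i] = st + ct                           # Eq. (3): FT = ST + CT
    return max(FT.values())                       # Eq. (6): Makespan
```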

2.2.2. Task Execution Cost

In workflow scheduling, the virtual machine leasing cost is another key metric for users, calculated as

$$C = \sum_{j=1}^{m} per(v_j) \times \left\lceil \frac{ACT(v_j)}{l} \right\rceil$$

where $per(v_j)$ denotes the unit leasing price of virtual machine $v_j$, $ACT(v_j)$ is its actual usage duration in hours, $l$ represents the billing cycle (typically 1 h or 1 min), and $\lceil \cdot \rceil$ indicates that partial cycles are rounded up to the next full unit using the ceiling operator.
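A minimal sketch of this billing rule, assuming `per_cost` holds each VM's price per cycle and `usage` its actual usage time in the same unit as the cycle length `l`:

```python
import math

def leasing_cost(per_cost, usage, l=1.0):
    # Each VM's usage is rounded up to whole billing cycles with the ceiling operator.
    return sum(p * math.ceil(a / l) for p, a in zip(per_cost, usage))

# Example: a VM priced at $0.174 per hour used for 2.3 h under hourly billing
# is charged for ceil(2.3) = 3 cycles, i.e. about $0.522.
print(leasing_cost([0.174], [2.3]))
```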

2.2.3. Energy Consumption for Virtual Machines

The energy consumption of virtual machines (VMs) is modeled based on the framework outlined in Reference [7]. The total power consumption consists of two components, an idle power component and an active power component, as described by Equation (8):

$$ES_j = \int_{st}^{et} \left( A(t) \times P_I + \lambda \times f_v(t)^3 \right) dt$$

where $st$ and $et$ denote the power-on time and power-off time of the VM, respectively; $A(t)$ is a binary state variable representing the operational state of the VM at time $t$ (1 for active, 0 for idle); $P_I$ is the power consumption per unit time in the idle state; $f_v(t)$ is the CPU frequency of the VM at time $t$; and $\lambda$ is a constant reflecting the relationship between the operating frequency and supply voltage of the VM.

Thus, the total energy consumption of servers on the cloud computing platform can be expressed as follows:

$$EnergyCost = \sum_{j=1}^{m} ES_j$$

For a task $t_i$, the decision variable $x_i$ represents the index of the assigned VM (ranging from 1 to $m$). For a workflow with $n$ tasks, the corresponding decision variables are collectively defined as $X = \{x_1, x_2, \ldots, x_n\}$, and the size of the decision space is $m^n$. Based on the above analysis, the multi-objective workflow scheduling model can be formulated as follows:

$$\text{Minimize } f(x) = \{f_1(x), f_2(x), f_3(x)\} \quad \text{s.t. } x \in \{1, 2, \ldots, m\}^n$$
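A sketch of this decision representation: a candidate schedule is an integer vector in $\{1, \ldots, m\}^n$, evaluated on the three objectives. The objective callables passed in are assumed to implement the makespan, cost, and energy formulas above.

```python
import random

def random_solution(n, m):
    """A random task-to-VM assignment x in {1, ..., m}^n."""
    return [random.randint(1, m) for _ in range(n)]

def evaluate(x, objectives):
    """objectives: (makespan, cost, energy) callables, each mapping x to a float."""
    return tuple(f(x) for f in objectives)
```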

3. An Evolutionary Algorithm for Workflow Scheduling with Adaptive Dynamic Grouping

The proposed Adaptive Dynamic Grouping (ADG) strategy deeply analyzes the workflow structure and uses a dynamic decision variable grouping mechanism to cluster large-scale variables based on task dependency relationships. This approach compresses the decision space, reduces problem complexity, and avoids the high cost of global search. The strategy can be embedded into the population selection and evaluation processes of existing multi-objective optimization algorithms to enhance their optimization capabilities. The specific embedding process is described in Algorithm 1.
Algorithm 1 The pseudocode of ADG
1: G ← GroupDecisionVariables()
2: Initialize a population P
3: Calculate HV(P) on the non-dominated solutions of P
4: for g = 1 → |G| do
5:     Generate new values for the decision variables in G_g for P
6:     P′ ← the regenerated population on group G_g based on P
7:     Non-dominated sorting of P′ and calculation of HV(P′)
8:     ΔC_g ← max{HV(P′) − HV(P), 0}
9: end for
10: while the stop condition is not reached do
11:     G_k ← roulette wheel selection based on ΔC
12:     for l = 1 → L do
13:         if ΔC_k^l + ΔC_k^(l−1) = 0 then
14:             G_k′ ← Subdivide(G_k)
15:             Replace G_k with G_k′ in G
16:             TC ← 0
17:         end if
18:         Q ← Reproduction(P_k)
19:         P ← EnvironmentalSelection(P ∪ Q)
20:         Update the non-dominated solutions using P
21:         HV(P′) ← the HV value of the new P
22:         ΔC_k^l ← HV(P′) − HV(P)
23:         HV(P) ← HV(P′)
24:         TC ← ΔC_k^l + TC
25:     end for
26:     ΔC_k ← max{TC/L, 0}
27: end while

3.1. The Encoding Method

Assume that the current task scheduling problem involves six tasks and four virtual machines (VMs). The encoding process is shown in Figure 3, with the following specific mappings: tasks $t_1$ and $t_3$ are allocated to VM $v_4$, tasks $t_2$ and $t_6$ to VM $v_1$, task $t_4$ to VM $v_3$, and task $t_5$ to VM $v_2$. Given the complex dependencies among workflow tasks, traditional scheduling methods that rely on pre-randomized task sequences are ill-suited. Therefore, this study employs a single-layer encoding strategy, which directly encodes task-VM mappings, bypassing redundant task ordering and focusing on the core problem of resource allocation. The proposed task priority scheduling strategy is detailed in Section 3.4.
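The Figure 3 mapping as a worked example: position $i$ of the vector holds the VM assigned to task $t_i$.

```python
from collections import defaultdict

# Single-layer encoding from Figure 3: index = task id, value = assigned VM.
x = [4, 1, 4, 3, 2, 1]  # t1->v4, t2->v1, t3->v4, t4->v3, t5->v2, t6->v1

vm_tasks = defaultdict(list)
for task, vm in enumerate(x, start=1):
    vm_tasks[vm].append(task)

print(dict(vm_tasks))  # {4: [1, 3], 1: [2, 6], 3: [4], 2: [5]}
```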

3.2. The Framework of ADG

In the context of workflow scheduling on cloud platforms, the coupling of large-scale task sets with heterogeneous resources often makes scheduling problems highly complex, which greatly complicates the search and optimization processes of algorithms [8]. However, the inherent topological structure and task dependency relationships of workflows offer an important avenue for problem-solving. A deep analysis of workflow structures reveals that task execution typically exhibits significant locality: adjusting the scheduling allocation of local tasks usually only affects the execution progress of associated tasks, with minimal impact on the overall system. Based on this, complex workflow systems can be decoupled into independent or weakly coupled subregions, thereby transforming the global optimization problem into several local optimization problems over those subregions. This divide-and-conquer strategy enables the contribution of each group of decision variables to the optimization objectives to be analyzed from historical iteration results. The core idea of the ADG algorithm is illustrated in Algorithm 1.
In Algorithm 1, the problem's structural features are first sampled independently to generate the subproblems in G (line 1), with the decision variable selection details provided in Section 3.3. Subsequently, population P is initialized, and the hypervolume (HV) of its non-dominated solutions is computed (lines 2–3). For each subproblem $g$, a new population P′ is regenerated from P by resetting the values of the variables selected for $g$. After the objective values of P′ are evaluated, the non-dominated solution set is updated, and the HV difference between P′ and the initial population determines the initial contribution $\Delta C_g$ of subproblem $g$ (lines 4–8).
During the iterative process, subproblem $G_k$ is selected using roulette wheel selection (line 11). If $G_k$ fails to yield a positive contribution for two consecutive cycles, its historical contributions are cleared and it is replaced by a new subproblem $G_k'$ produced by the dynamic grouping adjustment strategy (lines 13–17; strategy details in Section 3.3). Otherwise, if $G_k$ continues to contribute positively, the algorithm proceeds to offspring generation, producing a mixed population Q for environmental selection to derive a new population P (lines 18–19). The non-dominated solution set is then updated, and the HV value is recalculated to obtain the contribution degree $\Delta C_k^l$ (lines 20–23). Finally, the mean contribution degree is computed and assigned as the contribution degree of subproblem $G_k$ (line 26).
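A minimal sketch of the contribution-based selection step (Algorithm 1, line 11): each group's chance of being chosen is proportional to its HV contribution $\Delta C$. The small floor that keeps zero-contribution groups selectable is an assumption, not the paper's stated rule.

```python
import random

def roulette_select(delta_c, floor=1e-9):
    # Selection probability proportional to each group's HV contribution.
    weights = [max(c, 0.0) + floor for c in delta_c]
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for k, w in enumerate(weights):
        acc += w
        if r <= acc:
            return k
    return len(weights) - 1  # guard against floating-point round-off
```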

3.3. Dynamic Decision Variable Grouping Mechanism Based on Workflow Structure Decomposition

The GroupDecisionVariables method decomposes high-dimensional scheduling problems into low-dimensional subspaces by parsing workflow topologies and adopting independent feature sampling, providing a hierarchical solving framework for iterative optimization. As illustrated in Figure 4, for a typical workflow scenario (13 tasks with $t_1$ as the entry task and $t_{13}$ as the exit task), the method first filters the decision variables corresponding to tasks without successor nodes, incorporating the exit task $t_{13}$ into the initial group Group1. Given that only a single task exists at the current layer depth, the decision variables of the upstream tasks $t_{10}$, $t_{11}$, and $t_{12}$ are recursively aggregated to form the first subproblem SubProblem1, ensuring effective dimensionality for subproblem optimization.
Subsequently, the remaining tasks $T' = \{t_1, t_2, \ldots, t_9\}$ form a new grouping space where the "end-task-first aggregation strategy" is repeated: nodes without successors (e.g., $t_7$, $t_8$, $t_9$) are prioritized, and their decision variables are merged with preceding groups to construct SubProblem2. This process proceeds recursively via depth-first traversal until all task decision variables are integrated into hierarchical subproblems. By backtracking task dependencies from exit to entry tasks, this grouping strategy naturally aligns with the DAG (directed acyclic graph) structure of workflows, maintaining subproblem independence while ensuring weak coupling between adjacent subproblems, thereby providing structured support for subsequent contribution-based dynamic subproblem selection.
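A sketch of this end-task-first grouping under stated assumptions: layers of tasks with no unprocessed successors are peeled repeatedly, and a thin layer absorbs its upstream neighbours so each subproblem keeps an effective dimensionality. The minimum group size is an assumed parameter, not the paper's exact rule.

```python
def group_decision_variables(tasks, succ, min_size=2):
    remaining, groups = set(tasks), []
    while remaining:
        group = set()
        while remaining and len(group) < min_size:
            # Tasks whose successors have all been grouped already.
            layer = {t for t in remaining if not (succ[t] & remaining)}
            group |= layer       # e.g. exit task t13 absorbs upstream t10-t12
            remaining -= layer
        groups.append(group)
    return groups                # subproblems ordered from exit toward entry
```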
The Subdivide method realizes dynamic grouping adjustment through a dual screening mechanism: first, it identifies the core group with the highest contribution degree from the current grouping set, treating it as an independent workflow unit; then, it employs a top-down hierarchical traversal strategy to extract the initial task nodes without predecessor dependencies within this workflow. Through depth-first decomposition, the core group is refined into multiple functional subgroups. The average computational resource requirements (such as CPU cycles, memory usage, etc.) of the tasks in each subgroup are calculated, and the subgroup with the highest resource requirement, termed the "key task cluster", is selected. This cluster corresponds to the decision variable set with the greatest impact on system performance, replacing the current redundant group with the lowest contribution degree to form an optimized subgroup set $G'$. This process iterates collaboratively with the offspring propagation and environmental selection mechanisms in genetic algorithms to continuously optimize the grouping structure.
Figure 5 illustrates the screening process for key subgroups: suppose Group3 is identified as the highest-contribution group in a certain iteration; hierarchical splitting then yields three functional subgroups (Group3.1, Group3.2, and Group3.3). By quantitatively analyzing the resource consumption characteristics of the tasks (e.g., computational complexity, I/O frequency), Group3.3, with the highest resource intensity, is selected as the new decision variable group to replace the current redundant group with the lowest contribution degree.
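A sketch of the Subdivide screening walked through in Figure 5. The `split` callable (e.g., DFS from predecessor-free tasks) and `demand` (per-task resource requirement) are assumed inputs; the rule shown, keeping the subgroup with the highest mean resource demand and using it to replace the lowest-contribution group, follows the description above.

```python
def subdivide(groups, delta_c, split, demand):
    core = max(range(len(groups)), key=lambda g: delta_c[g])  # highest contribution
    subgroups = split(groups[core])          # refine into functional subgroups
    key_cluster = max(subgroups,
                      key=lambda sg: sum(demand[t] for t in sg) / len(sg))
    worst = min(range(len(groups)), key=lambda g: delta_c[g])  # redundant group
    groups[worst] = key_cluster              # replace it with the key task cluster
    return groups
```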
Based on the above grouping mechanism, the preliminary task division can be determined. On this basis, the task priority scheduling policy (detailed in Section 3.4) further determines the execution order of tasks within each group, with the two mechanisms working in close collaboration to achieve efficient workflow scheduling.

3.4. Task Priority Scheduling Policy and Group Task-VM Mapping Reproduction Policy

In cloud workflow task scheduling systems, two primary optimization dimensions exist: determining the execution timing relationships among tasks and mapping tasks efficiently to heterogeneous virtual machine (VM) resources. The task-resource mapping problem can be directly modeled using decision variables, while the task execution order is dynamically optimized through the Priority Scheduling Policy proposed in this section. This policy establishes a partial order among intra-group tasks to ensure that when multiple tasks from the same subgroup are assigned to the same VM, their execution sequence strictly adheres to the workflow's dependency constraints. Specifically, the execution priority $Rank(t_i, v_j)$ of task $t_i$ on VM $v_j$ is computed recursively using Equation (11):

$$Rank(t_i, v_j) = ST(t_i, v_j) + \bar{w}_i + \max_{t_p \in suc(t_i)} Rank(t_p)$$

where $ST(t_i, v_j)$, defined as the maximum data transfer time among all predecessor tasks of $t_i$, characterizes the earliest feasible start time of task $t_i$ on virtual machine $v_j$, and $\bar{w}_i$, the average execution time of task $t_i$ across the entire VM cluster, is computed as shown in Equation (12):

$$\bar{w}_i = \frac{\sum_{j=1}^{M} TL(t_i)/Mips(v_j)}{M}$$
Workflow tasks on each virtual machine are sorted in descending order of the priority values computed by Equation (11). Mathematical derivation shows that the priority value of any task is strictly greater than that of all its successor tasks, a property ensuring that the "predecessor-first, successor-later" dependency constraints are automatically satisfied during task scheduling. This priority-based sorting mechanism rules out invalid solutions at the algorithm design level, eliminating the need for additional constraint-handling overhead.
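A sketch of Equations (11)-(12), assuming per-task values of `ST` (earliest feasible start) and `w_bar` (average execution time over the cluster) are precomputed: ranks are built recursively from the exit tasks backwards, and scheduling tasks in decreasing rank order then respects every precedence constraint.

```python
def compute_ranks(tasks, succ, ST, w_bar):
    rank = {}
    def r(t):
        if t not in rank:
            # Eq. (11): own start + average execution + best successor rank.
            rank[t] = ST[t] + w_bar[t] + max((r(p) for p in succ[t]), default=0.0)
        return rank[t]
    for t in tasks:
        r(t)
    return rank  # schedule tasks on each VM in decreasing rank order

def average_execution_time(TL_i, mips):
    # Eq. (12): mean of TL(t_i)/Mips(v_j) over all M virtual machines.
    return sum(TL_i / m for m in mips) / len(mips)
```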
The scheduling model proposed in this paper centers on balancing three objectives—completion time, execution cost, and energy consumption—adopting a modular design to flexibly integrate with existing multi-objective optimization algorithm frameworks. Take NSGA-III [9] as an example: by embedding the proposed adaptive decision variable grouping mechanism, the algorithm significantly enhances population management efficiency when addressing high-dimensional decision spaces.
To maintain diversity within the evolving population and ensure the correctness of the evolutionary direction, this paper proposes an intra-group task load-balanced mapping strategy integrated with crossover and mutation operators to assist the multi-objective algorithm in efficiently generating offspring during the evolutionary process. Specifically, 50% of the individuals are randomly selected to undergo genetic operations (including binary crossover and polynomial mutation), while the remaining individuals generate offspring through the task load-balanced mapping strategy. This strategy achieves dynamic adaptation of the scheduling scheme to heterogeneous computing environments through a dual mechanism of task load balancing and virtual machine performance adaptation.
The specific execution process of the load-balanced mapping strategy is as follows: first, the virtual machines within the group are sorted in ascending order of their completion times, with their indices denoted as $K_i$ ($i = 1, 2, \ldots, m$); the minimum number of tasks over all virtual machines is extracted and denoted as $Z$; then, the $Z$ tasks with the smallest resource consumption on virtual machine $v_{K_1}$ are swapped one-to-one with the $Z$ tasks with the largest resource consumption on $v_{K_m}$, while ensuring that the task priority sequence remains unchanged during the exchange. This iterative process continues until load-balancing adjustments are completed for all virtual machines.
The left side of Figure 6 presents the task allocation relationships corresponding to a set of decision variables and the completion time sequences of the target virtual machines. After reordering the virtual machines within the group in ascending order of their completion times, the updated sequence on the right side is obtained. It can be observed that virtual machine $v_3$ has the fewest tasks ($Z = 3$), and the following exchange operations are ultimately performed: virtual machines $v_4$, $v_2$, and $v_3$ each select the tasks with the smallest resource consumption to swap with the tasks with the largest resource consumption on virtual machines $v_5$, $v_6$, and $v_1$, respectively. As illustrated in the right figure, the system achieves load balancing through bidirectional task swapping between virtual machines $v_4$ and $v_5$, involving three task pairs ($T_1 \leftrightarrow T_{24}$, $T_{14} \leftrightarrow T_{12}$, $T_{30} \leftrightarrow T_{19}$). The blue and red lines in the diagram indicate the task swaps between the two VMs. This scheduling strategy first exchanges tasks between $v_5$ (with the maximum completion time) and $v_4$ (with the minimum completion time) to reduce the system makespan. Subsequently, it sequentially executes cascade optimization between $v_2$ and $v_6$, and between $v_3$ and $v_1$, establishing a comprehensive load-balancing closed loop. This example intuitively demonstrates the application of the strategy in an actual scheduling scenario.
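A sketch of the swap procedure walked through in Figure 6: VMs are sorted by completion time (indices $K_1 \ldots K_m$), $Z$ is the smallest task count over all VMs, and pairing proceeds inwards ($K_1 \leftrightarrow K_m$, $K_2 \leftrightarrow K_{m-1}$, ...). The `cost` callable is an assumed per-task resource-consumption measure; restoring the intra-VM priority order after each swap is left implicit.

```python
def rebalance(vm_tasks, completion_time, cost):
    K = sorted(vm_tasks, key=lambda v: completion_time[v])  # ascending finish time
    Z = min(len(ts) for ts in vm_tasks.values())
    lo, hi = 0, len(K) - 1
    while lo < hi:
        fast, slow = vm_tasks[K[lo]], vm_tasks[K[hi]]
        cheap = sorted(fast, key=cost)[:Z]                  # Z cheapest on fast VM
        dear = sorted(slow, key=cost, reverse=True)[:Z]     # Z dearest on slow VM
        for c, d in zip(cheap, dear):                       # one-to-one exchange
            fast[fast.index(c)] = d
            slow[slow.index(d)] = c
        lo, hi = lo + 1, hi - 1
    return vm_tasks
```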

4. Experiment

This section outlines the experimental setup, including the workflow configuration and the parameters and metrics of the real virtual machines, followed by ablation experiments on the proposed algorithm. The performance of the proposed algorithm is compared with four representative algorithms, and the results are analyzed and discussed. The experimental environment was configured using WorkflowSim [10], a cloud workflow simulation platform. All experiments were conducted on a Lenovo computer (Lenovo, Beijing, China) running Windows 11, equipped with a 13th Gen Intel(R) Core(TM) i9-13900HX CPU and 32 GB of Samsung DDR5 5600 MHz memory.

4.1. Experimental Setup

In the experiments, five types of real-world workflows are employed: CyberShake [11], Epigenomics [12], Inspiral [13], Montage [14], and SIPHT [15], as illustrated in Figure 7. Each represents a distinct domain: CyberShake analyzes seismic hazard-related data; Epigenomics matches epigenetic states of human cells in biology; Inspiral performs gravitational waveform analysis in physics; Montage constructs customized astronomical sky images in astronomy; and SIPHT is a bioinformatics method for analyzing bacterial small untranslated regulatory RNAs. Figure 7 depicts the architectures of these workflows, which span diverse application domains and serve as widely adopted benchmarks for evaluating workflow scheduling algorithms, with each available at four task-set scales. Table 1 lists the 12 VM types from leading cloud service providers, Amazon EC2, Alibaba Cloud, and Microsoft Azure, used in this study's experiments.

4.2. Ablation Experiment

To validate the effectiveness of each component of the proposed algorithm, four ablation experiments were designed and conducted, focusing on the analysis of decision variable grouping, task priority ranking, and offspring generation mechanisms. The experiments were carried out in four different scales of CyberShake cloud computing workflow scenarios, using the twelve VM types described previously. The specific comparison methods are as follows:
(1) NGD-NSGAIII: a comparison method without the dynamic grouping mechanism;
(2) NRP-NSGAIII: a comparison method without the intra-group task priority ranking;
(3) NVM-NSGAIII: a comparison method without the task mapping strategy for offspring generation;
(4) ADG-NSGAIII: the complete algorithm proposed in this paper.
Figure 8 presents the convergence curves of the above algorithms for the three optimization objectives—completion time, execution cost, and energy consumption—across four scales of CyberShake workflows.
The experimental comparison results demonstrate that the proposed algorithm comprehensively outperforms the other ablation variants. Specifically, NGD-NSGAIII (without the dynamic grouping mechanism) performs comparably to the proposed algorithm at small task scales (30/50 tasks), but its objective performance lags significantly in large-scale scenarios. NRP-NSGAIII (lacking intra-group task prioritization) shows extremely irregular convergence in completion time because it neglects task dependency constraints, allowing successor tasks to be scheduled before their predecessors; the resulting startup waits for predecessor tasks trigger significant execution delays and severe fluctuations in the convergence curve. NVM-NSGAIII (without the task mapping strategy for offspring generation) exhibits stronger randomness in its search due to the absence of heuristic guidance: offspring generated randomly via crossover operators often contain numerous invalid or low-quality solutions, requiring more iterations to converge to optimal solutions. This not only slows convergence considerably but also has an obvious negative impact on the scheduling efficiency of cloud computing tasks.

4.3. Algorithm Comparison Experiment

In the experimental validation phase, the proposed Adaptive Dynamic Grouping-based multi-objective optimization algorithm (ADG-NSGAIII) is benchmarked against four comparison algorithms: (1) Improved Many-Objective Particle Swarm Optimization (I_MaOPSO) [16], (2) Non-dominated Sorting Genetic Algorithm II (NSGAII) [17], (3) Non-dominated Sorting Genetic Algorithm III (NSGAIII) [9], and (4) Reference Vector Guided Evolutionary Algorithm (RVEA) [18]. Table 2 presents the Hypervolume (HV) comparison results of these five algorithms across 20 workflow instances (encompassing three optimization objectives, Makespan, execution cost, and energy consumption, with decision variable dimensions ranging from 24 to 1000). Table 3 further showcases the HV values of NSGAII, NSGAIII, and RVEA before and after integrating the proposed Adaptive Dynamic Grouping mechanism, calculated using the PlatEMO multi-objective optimization platform [19]; the associated efficiency metrics, namely the average execution time over a fixed number of iterations and the cumulative time required to achieve equivalent convergence, are reported in Tables 4 and 5. Algorithm 1 illustrates the implementation details of the grouping mechanism using NSGAIII as a representative example.
To validate the performance difference between the algorithm using the ADG strategy and the algorithm without it, a paired-sample t-test was conducted for statistical analysis. In the experiment, each algorithm was executed 10 times independently for problem scenarios of different scales, and the average values were calculated, generating a total of 12 groups of comparative data. According to Formula (13), where $\bar{d}$ represents the mean of the differences, $s_d$ the standard deviation of the differences, and $n$ the number of pairs, the calculated t-value was 2.6547, with a corresponding p-value of 0.0224.

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
From a statistical perspective, when the significance level is typically set at 0.05, the p-value of 0.0224 is less than 0.05, indicating that the difference between the two groups of data is statistically significant. This result fully demonstrates that the ADG strategy proposed in this paper can significantly enhance the algorithm performance. Its effectiveness has been verified across problem scales of varying sizes, confirming its practical engineering application value.
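A sketch of the paired-sample t-test in Equation (13). The two input lists hold the paired averages (with/without ADG) over the 12 problem scenarios; any data passed in would be placeholders, not the paper's measurements.

```python
import math

def paired_t(with_adg, without_adg):
    d = [a - b for a, b in zip(with_adg, without_adg)]   # per-scenario differences
    n = len(d)
    d_bar = sum(d) / n                                   # mean difference
    s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    return d_bar / (s_d / math.sqrt(n))                  # Eq. (13)
```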
To validate the execution efficiency of the algorithm, this paper conducts a comparative analysis of the execution times between algorithms with and without the ADG strategy, exploring two key dimensions. Firstly, the time consumption for both algorithms to complete 10 iterations is compared, as presented in Table 4. Secondly, taking the cost results of NSGA-II after 100 iterations as the benchmark, the number of iterations and corresponding time required for other algorithms to achieve the same optimization effect are analyzed, as shown in Table 5. The symbol “-” indicates that the algorithm fails to reach the optimization level of NSGA-II within 100 iterations.
The experimental data reveal that in the 10-iteration scenario, algorithms adopting the ADG strategy generally exhibit higher average execution times than those without it, mainly because the ADG strategy incurs additional computational cost from calculating group contributions and performing dynamic grouping during the population selection phase. In terms of optimization effectiveness, however, the ADG strategy shows remarkable advantages. Its reduction in iteration counts is modest for small-scale problems, but as the problem scale grows, for example when n = 100, the number of iterations required by the algorithm with the ADG strategy decreases significantly. When n = 1000 and the single-iteration computation time increases substantially, the ADG strategy markedly shortens the overall optimization time by reducing the number of iterations, demonstrating particularly striking efficiency improvements. Evidently, the ADG strategy can significantly boost the optimization efficiency of algorithms on large-scale problems, effectively balancing computational overhead against optimization performance.
In addition to the comparison with multi-objective workflow scheduling algorithms, this paper also designed a set of experiments comparing the optimization effects against various classical heuristic scheduling algorithms in the CyberShake workflow environment. These algorithms include the First-Come-First-Served (FCFS) algorithm [20], the Round Robin (RR) algorithm [20], the Shortest Job First (SJF) algorithm [20], the Min-Min algorithm [21], and the Min-Max algorithm [22]. Figure 9 shows the experimental results on the three optimization objectives.

5. Conclusions and Future Work

The ADG strategy proposed in this paper addresses the challenges of cloud computing workflow scheduling by deeply mining workflow structure knowledge to achieve efficient grouping of decision variables. The strategy integrates an Adaptive Dynamic Grouping adjustment mechanism to ensure precise allocation of evolutionary opportunities among different subgroups, while adopting a task load-balanced virtual machine mapping strategy for offspring generation, fundamentally enhancing search efficiency and convergence performance. Extensive comparative experiments in real-world workflow scenarios and cloud platform environments demonstrate that the strategy exhibits significant advantages in high-dimensional multi-objective optimization algorithms: compared with algorithms without this strategy, the number of iterations is reduced by nearly 50%, the average execution time is shortened by over 40%, and the t-test verifies its effectiveness and stability.
Despite achieving breakthrough results in workflow scheduling, the algorithm has two main limitations. Firstly, it heavily relies on specific domain knowledge systems; only with a thorough understanding of intra-domain dependencies can task dependencies be leveraged during the design phase for dynamic grouping and optimization, which restricts its direct applicability to large-scale multi-objective optimization problems in other domains, such as financial optimization and logistics scheduling. Secondly, the computational time complexity of the hypervolume (HV) metric increases exponentially with the number of objectives, leading to excessive consumption of computational resources and significantly prolonged processing times. Future research will focus on these two issues by expanding the algorithm’s application scenarios to emerging fields like edge computing, exploring lightweight computational complexity evaluation metrics for integration into the algorithm framework, and continuously enhancing the algorithm’s performance and general adaptability to provide novel solutions for advancing cloud computing task scheduling technologies.

Author Contributions

Conceptualization, G.Z. and C.S.; Software, A.Z.; Data curation, Q.Y.; Writing—original draft, G.Z.; Writing—review & editing, C.S.; Funding acquisition, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China (Grant No. 62372319).

Data Availability Statement

The datasets used in this study, CyberShake, Epigenomics, Inspiral, Montage, and SIPHT, were obtained from publicly available sources. The experimental results have been presented in the manuscript, with no additional data generated.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Deb, M.; Choudhury, A. Hybrid cloud: A new paradigm in cloud computing. In Machine Learning Techniques and Analytics for Cloud Security; Wiley: Hoboken, NJ, USA, 2021; pp. 1–23.
2. Al-Dhuraibi, Y.; Paraiso, F.; Djarallah, N.; Merle, P. Elasticity in cloud computing: State of the art and research challenges. IEEE Trans. Serv. Comput. 2017, 11, 430–447.
3. Baldan, F.J.; Ramirez-Gallego, S.; Bergmeir, C.; Herrera, F.; Benitez, J.M. A forecasting methodology for workload forecasting in cloud systems. IEEE Trans. Cloud Comput. 2016, 6, 929–941.
4. Bindu, G.B.; Ramani, K.; Bindu, C.S. Optimized resource scheduling using the meta heuristic algorithm in cloud computing. IAENG Int. J. Comput. Sci. 2020, 47, 360–366.
5. Adhikari, M.; Amgoth, T.; Srirama, S.N. A survey on scheduling strategies for workflows in cloud environment and emerging trends. ACM Comput. Surv. 2019, 52, 1–36.
6. Song, Y.; Xin, R.; Chen, P.; Zhang, R.; Chen, J.; Zhao, Z. Identifying performance anomalies in fluctuating cloud environments: A robust correlative-GNN-based explainable approach. Future Gener. Comput. Syst. 2023, 145, 77–86.
7. Xia, Y.; Luo, X.; Jin, T.; Li, J.; Xing, L. A tri-chromosome-based evolutionary algorithm for energy-efficient workflow scheduling in clouds. Swarm Evol. Comput. 2024, 91, 101751.
8. Zitzler, E.; Thiele, L.; Laumanns, M.; Fonseca, C.M.; Da Fonseca, V.G. Performance assessment of multiobjective optimizers: An analysis and review. IEEE Trans. Evol. Comput. 2003, 7, 117–132.
9. Deb, K.; Jain, H. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints. IEEE Trans. Evol. Comput. 2013, 18, 577–601.
10. Chen, W.; Deelman, E. WorkflowSim: A toolkit for simulating scientific workflows in distributed environments. In Proceedings of the 2012 IEEE 8th International Conference on E-Science, Chicago, IL, USA, 8–12 October 2012; pp. 1–8.
11. Maechling, P.; Deelman, E.; Zhao, L.; Graves, R.; Mehta, G.; Gupta, N.; Mehringer, J.; Kesselman, C.; Callaghan, S.; Okaya, D.; et al. SCEC CyberShake workflows—Automating probabilistic seismic hazard analysis calculations. In Workflows for e-Science: Scientific Workflows for Grids; Springer: Berlin/Heidelberg, Germany, 2007; pp. 143–163.
12. Li, H.; Ruan, J.; Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18, 1851–1858.
13. Brown, D.A.; Brady, P.R.; Dietz, A.; Cao, J.; Johnson, B.; McNabb, J. A case study on the use of workflow technologies for scientific analysis: Gravitational wave data analysis. In Workflows for e-Science: Scientific Workflows for Grids; Springer: Berlin/Heidelberg, Germany, 2007; pp. 39–59.
14. Berriman, G.B.; Deelman, E.; Good, J.C.; Jacob, J.C.; Katz, D.S.; Kesselman, C.; Laity, A.C.; Prince, T.A.; Singh, G.; Su, M.H. Montage: A grid-enabled engine for delivering custom science-grade mosaics on demand. In Proceedings of the Optimizing Scientific Return for Astronomy Through Information Technologies, Glasgow, UK, 24–25 June 2004; SPIE: Bellingham, WA, USA, 2004; Volume 5493, pp. 221–232.
15. Livny, J.; Teonadi, H.; Livny, M.; Waldor, M.K. High-throughput, kingdom-wide prediction and annotation of bacterial non-coding RNAs. PLoS ONE 2008, 3, e3197.
16. Saeedi, S.; Khorsand, R.; Bidgoli, S.G.; Ramezanpour, M. Improved many-objective particle swarm optimization algorithm for scientific workflow scheduling in cloud computing. Comput. Ind. Eng. 2020, 147, 106649.
17. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T.A. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197.
18. Cheng, R.; Jin, Y.; Olhofer, M.; Sendhoff, B. A reference vector guided evolutionary algorithm for many-objective optimization. IEEE Trans. Evol. Comput. 2016, 20, 773–791.
19. Tian, Y.; Cheng, R.; Zhang, X.; Jin, Y. PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum]. IEEE Comput. Intell. Mag. 2017, 12, 73–87.
20. Siahaan, A.P.U. Comparison analysis of CPU scheduling: FCFS, SJF and Round Robin. Int. J. Eng. Dev. Res. 2016, 4, 124–132.
21. Mustapha, S.M.F.D.S.; Gupta, P. Fault aware task scheduling in cloud using min-min and DBSCAN. Internet Things Cyber-Phys. Syst. 2024, 4, 68–76.
22. Mishra, S.K.; Sahoo, B.; Parida, P.P. Load balancing in cloud computing: A big picture. J. King Saud Univ.-Comput. Inf. Sci. 2020, 32, 149–158.
Figure 1. An example of the workflow scheduling model.
Figure 2. Example of the workflow model.
Figure 3. Example of the encoding method.
Figure 4. Example of grouping workflow tasks.
Figure 5. Example of the adaptive dynamic grouping adjustment strategy.
Figure 6. Example of the task-balancing mapping policy.
Figure 7. Structures of five real-world workflows.
Figure 8. (a) Convergence curves for Makespan in different CyberShake workflows; (b) convergence curves for Cost in the ablation experiment; (c) convergence curves for EnergyCost in the ablation experiment.
Figure 9. (a) Makespan results compared with the heuristic algorithms; (b) Cost results compared with the heuristic algorithms; (c) EnergyCost results compared with the heuristic algorithms.
Table 1. Attribute range of the virtual machines.

| Types | Mips (MB/s) | CPUs | PerCost ($) | Bandwidth (MB) |
|---|---|---|---|---|
| EC2.S | 512 | 1 | 0.043 | 512 |
| EC2.M | 1024 | 1 | 0.086 | 768 |
| EC2.L | 2048 | 2 | 0.174 | 1280 |
| EC2.XL | 2048 | 4 | 0.350 | 2560 |
| Alibaba Cloud.S | 1024 | 2 | 0.047 | 1280 |
| Alibaba Cloud.M | 1024 | 2 | 0.351 | 1280 |
| Alibaba Cloud.L | 2048 | 4 | 0.050 | 2048 |
| Alibaba Cloud.XL | 5120 | 8 | 0.257 | 2048 |
| Azure.S | 768 | 1 | 0.096 | 1024 |
| Azure.M | 1280 | 2 | 0.192 | 2048 |
| Azure.L | 2560 | 4 | 0.383 | 1640 |
| Azure.XL | 3072 | 8 | 0.766 | 3072 |
Table 2. Comparison results concerning metric HV.

| Algorithm | I_MaOPSO +/−/≈ | NSGAII +/−/≈ | NSGAIII +/−/≈ | RVEA +/−/≈ |
|---|---|---|---|---|
| 20 Workflows | 4/9/7 | 2/10/8 | 3/13/4 | 1/11/8 |
Table 3. Comparison results concerning metric HV value.

| CyberShake | n | NSGAII | NSGAIII | RVEA |
|---|---|---|---|---|
| With ADG | 30 | 9.1860 × 10⁻¹ | 9.2081 × 10⁻¹ | 8.2873 × 10⁻¹ |
| | 50 | 9.2981 × 10⁻¹ | 9.3005 × 10⁻¹ | 8.6430 × 10⁻¹ |
| | 100 | 7.8492 × 10⁻¹ | 8.6712 × 10⁻¹ | 8.2984 × 10⁻¹ |
| | 1000 | 7.3192 × 10⁻¹ | 7.7946 × 10⁻¹ | 5.5573 × 10⁻¹ |
| Without ADG | 30 | 8.4348 × 10⁻¹ | 8.4719 × 10⁻¹ | 9.1223 × 10⁻¹ |
| | 50 | 8.7973 × 10⁻¹ | 8.7298 × 10⁻¹ | 9.1858 × 10⁻¹ |
| | 100 | 7.8538 × 10⁻¹ | 7.8348 × 10⁻¹ | 7.6878 × 10⁻¹ |
| | 1000 | 6.3113 × 10⁻¹ | 6.1608 × 10⁻¹ | 4.4648 × 10⁻¹ |
Table 4. The average execution time over 10 iterations.

| CyberShake | n | NSGAII | NSGAIII | RVEA |
|---|---|---|---|---|
| With ADG | 30 | 3.8 s | 4.7 s | 4.3 s |
| | 50 | 6.6 s | 10.0 s | 11.2 s |
| | 100 | 14.3 s | 30.5 s | 23.5 s |
| | 1000 | 386.2 s | 2949.2 s | 1322.6 s |
| Without ADG | 30 | 2.1 s | 3.3 s | 3.2 s |
| | 50 | 3.6 s | 7.5 s | 7.6 s |
| | 100 | 8.6 s | 25.5 s | 21.1 s |
| | 1000 | 326.3 s | 2898.5 s | 1021.3 s |
Table 5. Comparison of the number of iterations when achieving the same optimization effect.

| CyberShake | n | NSGAII | NSGAIII | RVEA |
|---|---|---|---|---|
| Without ADG | 30 | 100 times | - | - |
| | 50 | 100 times | 87 times | 86 times |
| | 100 | 100 times | 76 times | - |
| | 1000 | 100 times | 53 times | 93 times |
| With ADG | 30 | 87 times | 91 times | 88 times |
| | 50 | 73 times | 82 times | 80 times |
| | 100 | 52 times | 46 times | 68 times |
| | 1000 | 43 times | 32 times | 57 times |