An Optimization Framework for Allocating and Scheduling Multiple Tasks of Multiple Logistics Robots

Choi, Byoungho; Kim, Minkyu; Kim, Heungseob

doi:10.3390/math13111770


by Byoungho Choi, Minkyu Kim and Heungseob Kim *

Department of Smart Manufacturing Engineering, Changwon National University, Changwon-si 51140, Republic of Korea

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(11), 1770; https://doi.org/10.3390/math13111770

Submission received: 23 April 2025 / Revised: 19 May 2025 / Accepted: 24 May 2025 / Published: 26 May 2025

(This article belongs to the Special Issue Mathematical Programming, Optimization and Operations Research)


Abstract

This study addresses the multi-robot task allocation (MRTA) problem for logistics robots operating in zone-picking warehouse environments. With the rapid growth of e-commerce and the Fourth Industrial Revolution, logistics robots are increasingly deployed to manage high-volume order fulfillment. However, efficiently assigning tasks to multiple robots is a complex and computationally intensive problem. To address this, we propose a five-step optimization framework that reduces computation time while maintaining practical applicability. The first step calculates and stores distances and paths between product locations using the A* algorithm, enabling reuse in subsequent computations. The second step performs hierarchical clustering of orders based on spatial similarity and capacity constraints to reduce the problem size. In the third step, the traveling salesman problem (TSP) is formulated to determine the optimal execution sequence within each cluster. The fourth step uses a mixed integer linear programming (MILP) model to allocate clusters to robots while minimizing the overall makespan. Finally, the fifth step incorporates battery constraints by optimizing the task sequence and partial charging schedule for each robot. Numerical experiments were conducted using up to 1000 orders and 100 robots, and the results confirmed that the proposed method is scalable and effective for large-scale scenarios.

Keywords: logistics robots; multi-robot task allocation (MRTA); zone-picking warehouse; task allocation and scheduling

MSC: 90B06

1. Introduction

1.1. Logistics Robot Trends

The rapid advancement of information and communication technologies, the widespread use of the Internet and mobile devices, and the impact of the COVID-19 pandemic have significantly increased the volume of e-commerce transactions and deliveries. In response, the logistics industry is evolving through the adoption of Fourth Industrial Revolution technologies such as artificial intelligence (AI), the Internet of Things (IoT), and big data. This transformation, known as Logistics 4.0, represents a shift from offline-centered commerce to online models (O2O, Online to Offline), a transition from mass transportation to customized logistics, and the expansion and digitalization of logistics warehouses and systems [1,2]. The logistics robot industry is expanding in parallel, driven by declining labor availability due to low birth rates and population aging, rising labor costs, and growing concerns over worker safety [3]. Logistics robots automate tasks such as transportation, sorting, and loading by integrating environmental perception, autonomous driving, and IoT-based scheduling technologies in logistics centers and factories. These robots are being actively adopted to improve operational efficiency. Prominent companies like Amazon and Alibaba have already implemented large fleets of robots to automate warehouse operations. In Korea, more than 1000 logistics robots were deployed at Coupang’s fulfillment center in 2023 to handle product transportation and sorting. Large-scale smart factories and logistics hubs now operate multiple logistics robots simultaneously to manage high-volume workloads. Additionally, the increasing adoption of Robot-as-a-Service (RaaS) models is lowering entry barriers for small and medium-sized enterprises (SMEs) by minimizing initial investment costs and offering scalable automation solutions. This trend is expected to accelerate market growth. 
As shown in Figure 1, the global logistics robot market was valued at approximately USD 8.78 billion in 2023 and is projected to grow at a compound annual growth rate (CAGR) of approximately 16.24%, reaching around USD 39.55 billion by 2033 [4]. In the Asia–Pacific region, automation driven by urbanization and industrialization accounts for the largest market share. Countries such as China, Korea, Japan, and Singapore are leading advancements in logistics robotics, with China receiving global attention for its expansive e-commerce sector and modernized logistics infrastructure. Furthermore, the use of logistics robots is expanding beyond traditional warehouses to include applications in hospitals, restaurants, and hotels.

Order-picking, which involves retrieving inventory items to fulfill customer orders, remains the most labor-intensive and costly task in logistics centers. During this process, workers spend considerable time moving between items, prompting the implementation of automation systems in which robots handle transportation tasks while workers remain stationary. This reduces unnecessary movement, increases picking efficiency, and enhances working conditions. To support such advancements, it is crucial not only to improve core robot technologies such as mobility, payload, and control systems, but also to develop an integrated control system for managing multiple logistics robots. This system must coordinate task allocation and scheduling while interfacing with upper-level systems such as enterprise resource planning (ERP), manufacturing execution systems (MES), and warehouse management systems (WMS). Importantly, control system technology can be applied regardless of robot type or operating environment, including factories and logistics centers. Accordingly, this study focuses on control system technologies related to path planning, task allocation, and charging operations for logistics robots.

1.2. Introduction to Logistics Robots and Control Systems

Logistics robots can be broadly classified into automated guided vehicles (AGVs) and autonomous mobile robots (AMRs) depending on their method of navigation. AGVs move along predefined routes marked by QR codes or barcodes installed on the floors or ceilings of logistics centers. These guide markers form a grid map of the facility, and AGVs follow designated paths between the grid points. Because the routes are preassigned, AGVs can travel at high speeds. However, they are unable to deviate from the specified paths. AGVs generally offer higher payload capacity than AMRs and exhibit excellent positioning accuracy due to their guided operation. For these reasons, AGVs are commonly used in large-scale factories and distribution centers that process high volumes of diverse goods. In contrast, AMRs use sensors and cameras to perceive their surrounding environment and navigate autonomously without relying on predefined guide paths. This allows them to avoid obstacles and respond flexibly to environmental changes. Although their navigation takes more time compared to AGVs, and their speed and payload capacity have traditionally been lower, recent technological advances have reduced this gap. AMRs are increasingly adopted because they do not require infrastructure installation and offer high flexibility. They are especially suitable for work environments where robot travel paths change frequently or where installing physical guidance infrastructure is difficult, such as in small-scale warehouses. Additionally, AMRs are used in wide-area workspaces. As intelligent logistics systems become more complex—requiring customization and inventory optimization—automation using flexible AMRs is considered a cost-effective mid- to long-term solution.

To efficiently operate multiple logistics robots, an integrated AGV/AMR control system (ACS) is employed. The ACS interfaces with upper-level systems such as enterprise resource planning (ERP), manufacturing execution systems (MES), and warehouse management systems (WMS). It receives work orders (e.g., picking and delivery), reports the outcomes (e.g., picking and packing completion), and monitors the operational status of each robot and the control system itself. In our proposed framework, the ACS receives order-related data—such as item types, quantities, and storage locations—from the WMS, and broader operational information—such as inventory levels and order priorities—from the ERP system. These upper-level systems do not determine task sequences or assign robots; rather, they serve as data sources for the optimization process. The proposed framework independently performs clustering, sequencing, robot assignment, and charging-aware scheduling based on this input. Execution results, such as task completion times and battery usage, are reported back to the ACS, enabling real-time monitoring and feedback integration with WMS and ERP systems. This architecture allows seamless integration with existing logistics infrastructures while maintaining autonomy over robot-level decision making. The ACS also includes a robot management module that monitors robot health and charging status, as well as a process management module that tracks the progress of assigned tasks. Its core functionalities include:

Path planning: Computing efficient travel routes between a robot’s current location and its destination.
Task allocation: Assigning tasks from upper-level systems to individual robots based on their current state.
Traffic control: Predicting potential collisions or congested zones and preventing jams or bottlenecks to minimize robot downtime.
Charging management: Issuing charging commands based on the robot’s battery status and operational schedule.

The concept and main functions of the ACS are illustrated in Table 1. In this study, traffic control is excluded to simplify the problem, as the onboard collision-avoidance capabilities of modern robots are expected to continue improving with technological advances. Accordingly, this research focuses on designing a control system that manages picking-related path planning, task allocation, and charging schedules, while integrating with upper systems to reflect customer orders and product delivery information.

1.3. Introduction to Order Picking and Its Methods

Order picking refers to the process of retrieving products from storage locations in response to customer orders. This involves identifying and locating items based on order details such as quantity, type, and specific customer requirements. Order picking is the most labor-intensive operation in logistics centers and accounts for more than 50% of their total operating costs. As such, improving the productivity of picking operations has been considered a key priority in reducing logistics costs and enhancing overall efficiency [5].

Order picking productivity can be improved in several ways: by shortening lead times through faster processing and reduced travel distances, minimizing errors to improve order accuracy, optimizing inventory levels to prevent shortages and overstocks via integration with warehouse management systems and demand forecasting, and adopting flexible operating strategies to handle fluctuations in demand. Selecting an appropriate picking method based on warehouse layout, product variety and volume, and order frequency is also essential.

In general, picking methods are categorized into goods-to-person (GTP) and picker-to-parts (PTP) types. In the GTP approach—also called parts-to-person—products are delivered to stationary workers. In modern automated warehouses, items are stored on mobile racks, which are transported to the picking station. A representative example is the robotic mobile fulfillment system (RMFS) used by companies such as Amazon and Coupang (Figure 2). The RMFS offers full automation using robots and is highly scalable, allowing for flexible warehouse expansion or reconfiguration. It also adapts easily to changing customer orders and product placement patterns, such as those driven by seasonality or trends. However, RMFS requires additional infrastructure investment to install storage rack movement paths. Moreover, because items for a single order may be distributed across multiple shelves, follow-up tasks such as sorting and packing are typically required after picking.

The PTP method—also referred to as person-to-parts—is the most traditional form of order picking. Workers physically move to storage locations to collect items. Manual picking offers high operational flexibility and is effective even without digital systems, allowing workers to handle complex scenarios intuitively. However, with rising e-commerce demand and a shrinking labor force, this method struggles to meet the speed and responsiveness needed in modern logistics. Accordingly, robot-assisted picking methods have emerged [7,8]. In such systems, AMRs or AGVs work alongside human pickers, helping transport heavy carts and reducing unnecessary walking. This improves both efficiency and worker ergonomics.

In zone-picking, the warehouse is divided into distinct zones, each managed by a dedicated worker. As shown in Figure 3, robots transport goods to the appropriate zones based on order information, and the assigned worker performs picking only within their designated area. This method eliminates the need for centralized shelf-moving infrastructure, lowering costs. Workers in specific zones become familiar with local inventory, increasing picking speed and reducing errors. It also minimizes worker movement, and because items are picked sequentially per order, post-picking sorting is often unnecessary.

Other picking strategies include:

Individual picking: Picking one order at a time; simple but inefficient due to high travel time. Used in industries with specialized items.
Batch picking: Grouping multiple orders to collect shared items together, often used in the RMFS.
Cluster picking: Grouping orders with overlapping item locations to improve efficiency; one worker processes multiple orders at once.
Wave picking: Grouping and processing orders in scheduled time intervals, useful for handling real-time e-commerce orders.

In practice, logistics centers often adopt hybrid strategies, tailoring picking methods to warehouse conditions and operational goals. The overarching objective of all picking methods is to reduce unnecessary movement and maximize picking efficiency.

1.4. Research Distinctiveness and Purpose

As discussed in Section 1.3, the zone-picking method is widely used in practice due to its operational efficiency and minimal infrastructure requirements. Despite these practical advantages, zone-picking has received relatively little attention in academic research on multi-robot task allocation (MRTA). Most existing studies have focused on robotic mobile fulfillment systems (RMFS), which are based on the goods-to-person (GTP) model. In RMFS, products are stored on mobile shelves, and robots deliver them to workers at designated picking stations. Since multiple items for a single order are often distributed across separate shelves, orders are decomposed into individual tasks, which classifies the problem as a single-task robot, single-robot task (ST-SR) type according to the taxonomy proposed by Gerkey and Matarić [9]. In contrast, this study considers a logistics center that uses the zone-picking method. The storage area is divided into several zones, and the worker assigned to each zone picks only the products in that zone. Robots move sequentially to the locations of the products in an order, while workers remain in their zones and perform only picking tasks. Because one robot handles the multiple products belonging to one order, the problem is classified as a multi-task robot, single-robot task (MT-SR) problem. The relevant categories of the taxonomy are as follows:

Single-task robot (ST): Each robot performs only one task at a time.
Multi-task robot (MT): Each robot can perform multiple tasks at a time.
Single-robot task (SR): Each task requires exactly one robot.
Multi-robot task (MR): Each task requires two or more robots.

To process a large volume of goods, efficient solutions must be obtained within a reasonable time. However, the MRTA problem is NP-hard, and deriving an optimal solution takes considerable time. Existing studies therefore mainly rely on heuristic methods or reinforcement learning. These methods are prone to local optima, and the quality of their results may be unstable. Reinforcement learning approaches define a Markov decision process (MDP), a probabilistic model, and assign tasks in real time to maximize rewards, which suits dynamic settings such as e-commerce, where customer orders arrive continuously. Such approaches require prior information about the environment and orders of a specific warehouse, as well as a large amount of training data. As a result, they can be difficult to apply to a new environment, building the data is time-consuming, and the results are hard to interpret. In this study, we therefore propose a five-step framework for the multi-robot task allocation problem that reduces the problem size and search space, shortens computation time, and derives efficient results. The remainder of the paper is organized as follows: Section 2 reviews the related literature; Section 3 introduces the problem definition, logistics center environment, and proposed framework; Section 4 explains the results of the numerical experiments; and Section 5 presents the conclusion and future research directions.

2. Literature Review

2.1. Multi-Robot Task Allocation in Distribution Centers

As logistics centers scale up, the multi-robot task allocation (MRTA) problem has gained increasing importance and has become a widely studied topic. Yuan et al. [6] formulated a model that evaluates time and cost in order handling, considering task correlation within robotic mobile fulfillment systems (RMFS). They ensured that tasks located on the same shelf were assigned to a single robot and minimized the makespan by balancing workloads across picking stations. Agrawal et al. [10] proposed a Markov decision process (MDP)-based model to minimize travel delays in warehouse environments. They solved the task allocation problem using a novel deep multi-agent reinforcement learning architecture inspired by attention mechanisms. Sarkar et al. [11] introduced a nearest-neighbor-based clustering approach for MRTA. Seo [2] developed a task assignment method based on a multi-depot traveling salesman problem (TSP), using the A* algorithm to evaluate travel cost between waypoints and applying an Independent Deep Q-Network (IDQN) to perform collision-free path planning in dynamic, multi-robot warehouse environments. Yuan et al. [12] enhanced DQN-based task allocation by proposing an improved algorithm using shared utilitarian selection and prioritized sampling to accelerate convergence. Oh [13] presented a path-planning method that accounts for robot size and environmental uncertainty, aiming to generate collision-free paths in RMFS environments. Chen et al. [14] proposed a windowed hierarchical cooperative A* algorithm that improves multi-robot path planning efficiency by reusing path data and considering rotation factors in intelligent warehouses. Yang et al. [15] employed a cooperative MRTA approach using a genetic algorithm based on max–min fitness and individual-relative evaluation to efficiently schedule multiple robots for distributed tasks. Shetty et al. [16] formulated a vehicle routing problem (VRP) for optimizing order-picking routes by selecting orders to minimize total travel time and distance. They validated their approach through simulation and statistical analysis, comparing it to traditional heuristics such as S-shape and return methods, focusing on travel efficiency and workload balancing. Žulj et al. [17] developed an AMR-assisted order batching and sequencing method for picker-to-parts warehouse environments. They applied a two-stage heuristic combining adaptive large neighborhood search (ALNS) and the Nawaz–Enscore–Ham (NEH) heuristic, evaluating its performance through simulation-based experiments. Table 2 summarizes representative MRTA studies in warehouse environments, highlighting the predominance of RMFS-based ST-SR approaches. In contrast, this study focuses on MT-SR problems within zone-picking systems, an area that has received relatively limited attention.

2.2. Multi-Robot Task Allocation in Various Environments

Beyond warehouse environments, MRTA problems have also been extensively studied in dynamic, uncertain, and high-risk settings. Martin et al. [18] proposed a cooperative game theory-based approach to cluster tasks and robots by considering parameters such as the distance between randomly placed tasks and robots, battery levels, task priorities, and time windows. The objective was to enhance thermal power plant performance through optimized irradiance measurement. Tihanyi et al. [19] formulated an MDP for task allocation under probabilistically evolving risks, such as obstacles, fires, and toxic contamination, in uncertain environments, addressing the diffusion of risks over time. Paul et al. [20] introduced a graph-based reinforcement learning framework using a capsule attention mechanism (CapAM) for MRTA involving fixed-deadline tasks and limited robot capacities. Shibata et al. [21] developed an MDP-based reinforcement learning method that selects the most suitable task among several options, considering differing task weights, and executes it within the shortest time. Park et al. [22] proposed a novel MDP formulation to address scalability challenges when the number of robots or tasks increases. They introduced a deep reinforcement learning algorithm with a cross-attention mechanism to model robot interactions and compute task preferences. Cai [23] addressed human-in-the-loop MRTA by allocating operator support in multi-robot systems. In this framework, each robot independently performs its own tasks but can complete them more efficiently with operator assistance. The study proposed an algorithm to identify blocking tasks in greedy schedules, using both deterministic and probabilistic models of worker support to minimize makespan. Hussein and Khamis [24] proposed an auction-based algorithm to optimally assign heterogeneous robots to tasks, focusing on a utility function that measures agent-task suitability. Choi et al. [25] extended consensus-based auction algorithms by introducing a consensus-based bundle algorithm to solve multi-allocation problems. Their approach ensures convergence to conflict-free solutions through local communication and a conflict resolution mechanism, assuming reasonable evaluation criteria. Dimming et al. [26] proposed the use of dynamically weighted topology graphs to guide multi-robot teams in complex environments to achieve collaborative goals. Hong et al. [27] introduced a Q-learning-based path assignment algorithm for vehicle scheduling in overhead hoist transport systems used in semiconductor wafer manufacturing. Chung et al. [28] developed a task sequencing method to efficiently assign tasks to multiple robots in chassis assembly processes, accounting for task locations and potential collisions. Ghassemi and Chowdhury [29] proposed an online scheduling algorithm based on bipartite graph construction, addressing multi-tour MRTA problems with task, range, and payload constraints. Shelkamy et al. [30] compared genetic algorithm (GA) and ant colony optimization (ACO) approaches to identify suitable solvers for MRTA problems depending on problem constraints and solution space characteristics. Liu et al. [31] proposed a time-extended MRTA approach using ACO for large-scale cooperative tasks with pre-existing constraints. Kong et al. [32] introduced an improved PSO–Greedy hybrid algorithm (IPSO-G), which identifies task-robot pairs using particle swarm optimization (PSO) and determines execution sequences via a greedy heuristic to minimize total cost. While prior studies primarily focus on ST-SR scenarios or are limited to highly specialized domains, few have addressed the challenge of real-time multi-task execution in logistics environments. This study aims to fill that gap by formulating the MT-SR problem in zone-picking systems and proposing a scalable optimization framework tailored to these settings. Table 3 presents previous MRTA research conducted in various environments.

3. Task Scheduling Framework

3.1. Problem Definition and Framework

This study is based on a zone-picking logistics center environment. The logistics center is modeled as a 60 × 60 grid, where logistics robots determine their positions in real time using QR codes embedded in each grid cell. Each grid cell is sufficiently large to accommodate multiple robots, so collision and congestion are not considered. Robots move at a constant speed, excluding acceleration or deceleration due to starting or stopping. Battery consumption is assumed to be proportional to the distance traveled, and each robot begins with a full battery before task assignment. The model excludes worker quantity and efficiency variations in storage areas. It assumes all tasks in a given area can be processed simultaneously, regardless of concentration. Due to the absence of publicly available benchmark data for MRTA in zone-picking settings, experimental orders were generated randomly. Each order contains between 1 and 8 subtasks, representing individual product units defined by location. Each task also includes associated picking time and capacity constraints, and all tasks are generated around storage locations. In this section, we demonstrate the proposed framework using 50 orders and 3 robots as examples.

The proposed scheduling framework consists of five key steps:

Step 1: Distance and path calculation

In the static environment of a logistics warehouse, the A* algorithm was used to efficiently calculate distances and movement paths between products. The search method was designed to store computed paths incrementally in an array, allowing reuse when the same pair of locations reappears, thereby reducing computation time.

Step 2: Order clustering

A hierarchical clustering algorithm was applied, considering both the distance between order centers and robot capacity constraints. This reduced the complexity of the subsequent task assignment step by grouping similar orders and limiting cluster sizes.

Step 3: Intra-cluster task sequencing

Within each cluster, a mixed integer linear programming (MILP) model was formulated based on the traveling salesman problem (TSP). This model optimized the task execution order and movement paths to minimize both travel cost and execution time.

Step 4: Robot-level task allocation

Tasks were allocated to individual robots based on the execution time of each cluster (from the TSP result). The goal was to minimize the makespan—i.e., the time required for all robots to complete their assigned tasks—by ensuring balanced task distribution.

Step 5: Charging-aware task scheduling

Finally, for each robot, the task execution sequence was adjusted considering battery levels. The framework computed when charging would be needed and how much charge would be required to minimize makespan while ensuring continuous operation.

These five steps are summarized and visualized in Figure 4, which illustrates the overall task scheduling framework for multiple logistics robots.
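To make the objective of Step 4 concrete, the following is a minimal sketch of makespan-oriented allocation. It deliberately swaps in a greedy longest-processing-time (LPT) heuristic rather than the MILP model used in the framework, so it only illustrates the goal of balancing cluster execution times across robots. The function name `assign_clusters_lpt` and the example execution times are hypothetical.

```python
import heapq

def assign_clusters_lpt(cluster_times, n_robots):
    """Greedy LPT heuristic (an illustrative stand-in, not the paper's MILP).

    cluster_times: dict mapping cluster id -> execution time (e.g., from Step 3).
    Returns (assignment, makespan), where assignment maps robot id -> cluster ids.
    """
    # Min-heap of (current load, robot id): the least-loaded robot is popped first.
    heap = [(0.0, r) for r in range(n_robots)]
    heapq.heapify(heap)
    assignment = {r: [] for r in range(n_robots)}

    # Place the longest clusters first, each on the currently least-loaded robot.
    for cid, t in sorted(cluster_times.items(), key=lambda kv: -kv[1]):
        load, r = heapq.heappop(heap)
        assignment[r].append(cid)
        heapq.heappush(heap, (load + t, r))

    # The makespan is the largest total load over all robots.
    makespan = max(load for load, _ in heap)
    return assignment, makespan

# Hypothetical execution times for six clusters, shared among three robots.
times = {0: 7, 1: 5, 2: 4, 3: 3, 4: 3, 5: 2}
assignment, makespan = assign_clusters_lpt(times, 3)
```

LPT gives no optimality guarantee in general, which is one reason an exact model such as the MILP of Step 4 is preferable when the number of clusters has already been reduced by clustering.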

3.2. Path Planning Algorithm

Path planning algorithms have been widely studied across various domains, including robotics, unmanned aerial vehicles (UAVs), and drones, in both static and dynamic environments [33,34]. These algorithms aim to compute an optimal travel path from a given starting point to a destination while avoiding collisions and minimizing both travel cost and time. Representative approaches include graph-based search algorithms such as Dijkstra, A*, and D*, and heuristic-based intelligent search algorithms such as genetic algorithm (GA), ant colony optimization (ACO), and particle swarm optimization (PSO) [35]. In this study, we adopt the A* algorithm for path planning in static environments such as logistics centers. A* is a graph-based method similar to Dijkstra’s algorithm, but it achieves faster pathfinding by using a heuristic to guide the search toward the goal, and it guarantees optimality when an admissible heuristic is used. To support early-stage implementation, we designed a structure that allows incremental filling of the distance array, so the system remains functional even when no precomputed distances are available. Once the distance between two locations is calculated, it is stored to prevent redundant computations, significantly improving overall framework efficiency.

The A* algorithm determines the cost of reaching a node $n$ using the evaluation function $f(n) = g(n) + h(n)$, where $g(n)$ is the actual path cost from the starting node to the current node $n$, and $h(n)$ is the heuristic estimate of the cost from $n$ to the goal node. The search begins by adding the starting node to the open list and computing its $f(n)$. The node with the smallest $f(n)$ in the open list is selected as the current node and moved to the closed list. For each neighboring node that is not in the closed list, $f(n)$ is updated if a lower cost is found, the current node is recorded as its parent, and the neighbor is added to the open list. The node with the smallest $f(n)$ in the open list is then selected again, and the process repeats until the goal node is reached; the shortest path is recovered by following the parent links back from the goal. In this study, the A* algorithm is used not only to compute the shortest distance but also to generate detailed movement trajectories for logistics robots. These trajectories represent complete travel paths between subtasks, enabling the robots to follow step-by-step instructions without requiring further path calculation during execution. Algorithm 1 lists the steps of the A* algorithm. Figure 5 illustrates an example of a movement path generated by this algorithm for navigating between products. Combined with the TSP-based sequencing in Section 3.4, the framework provides each robot with an optimized, continuous trajectory from start to finish, ensuring direct applicability to robot-level motion planning and control.

Algorithm 1 A* algorithm.
1.	Define the start node $s$ and target node $t$.
2.	Add $s$ to the open list and compute its $f(n)$.
3.	Select the node with the smallest $f(n)$ from the open list as the current node and move it to the closed list.
4.	For each neighboring node not in the closed list, update its $f(n)$ if a lower cost is found, set the current node as its parent, and add it to the open list.
5.	Repeat steps 3–4 until the target node $t$ is selected as the current node.
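The steps of Algorithm 1, together with the stored-distance reuse described in Step 1, can be sketched as follows. This is a minimal illustration under stated assumptions: a 4-connected grid with unit move cost, blocked cells marked by 1, and the Manhattan distance as the admissible heuristic $h(n)$. The names `astar`, `cached_path`, and `_path_cache` are hypothetical and are not taken from the paper.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid (assumed movement model); grid[r][c] == 1 is blocked.

    Returns (distance, path), or (None, []) if the goal cannot be reached.
    """
    def h(cell):  # Manhattan distance: admissible for unit-cost 4-connected moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current in closed:
            continue  # stale heap entry for an already expanded cell
        if current == goal:
            path = []
            while current is not None:  # follow parent links back to the start
                path.append(current)
                current = parent[current]
            return g, path[::-1]
        closed.add(current)
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nxt = (nr, nc)
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):  # found a cheaper route to nxt
                g_cost[nxt] = new_g
                parent[nxt] = current
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None, []

_path_cache = {}  # filled incrementally; a dict stands in for the distance array

def cached_path(grid, a, b):
    """Compute the path from a to b once and reuse it (and its reverse) afterwards."""
    if (a, b) not in _path_cache:
        dist, path = astar(grid, a, b)
        _path_cache[(a, b)] = (dist, path)
        _path_cache[(b, a)] = (dist, path[::-1])
    return _path_cache[(a, b)]
```

Because the cache is filled on demand, the first query for a pair of locations pays the full search cost and later queries for the same pair, in either direction, are dictionary lookups.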

3.3. Clustering

Multi-robot task allocation (MRTA) is an NP-hard problem, and obtaining an optimal solution typically requires significant computation time. In logistics centers, where the volume of tasks is large, it becomes impractical to guarantee optimality within acceptable time constraints. Thus, a trade-off between solution quality and computation time is necessary to enable efficient task allocation. To reduce the problem size and search space, this study applies clustering to group customer orders. Doing so significantly reduces the computational burden, allowing task assignment to be completed within a reasonable timeframe. We use hierarchical clustering, which does not require specifying the number of clusters in advance. Each order comprises multiple subtasks, corresponding to individual products, and all subtasks within an order must be performed by a single robot. To group orders with spatial proximity, we compute the Euclidean distance between the center points of the subtasks in each order and apply the single linkage method. In addition to spatial proximity, we incorporate a capacity constraint to ensure that the total volume of tasks in a cluster does not exceed the maximum payload of a robot. Algorithm 2 presents the pseudocode of the clustering procedure. Table 4 lists all 50 orders, including their subtasks, the computed center point coordinates, and total capacities. For example, Order ID 0 includes Subtask IDs 1034 and 2827, has a Center Point at (32, 10), and a total Capacity of 0.09. Table 5 summarizes the clustering results, showing which orders are grouped together in each cluster and the corresponding total capacity. For instance, Cluster ID 0 includes Order IDs 2, 32, and 40, with a combined Total Capacity of 0.91. Figure 6 visualizes the clustering performance, where each color denotes an order and each star indicates the center point of an order.

Algorithm 2 Capacity-Constrained Hierarchical Clustering
1.	Input:
2.	  WorksetNum: Order
3.	  Time: Working time
4.	  Capacity: Order capacity
5.	  Distance: Distance between order center points
6.	Output: Clusters

7.	While min(Capacity) + nextMin(Capacity) ≤ 1 do
8.	  P ← pairs with combined capacity ≤ 1
9.	  minDistance ← ∞
10.	  For all pairs (a, b) in P do
11.	    If Distance[a][b] < minDistance then
12.	      minDistance ← Distance[a][b]
13.	      A, B ← a, b
14.	    End if
15.	  End for
16.	  Merge clusters A and B
17.	  Update cluster list, delete merged entries
18.	  Recalculate distances using single linkage
19.	End while
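
A minimal Python sketch of Algorithm 2 follows. It is an illustration under stated assumptions rather than the authors' code: capacities are normalized so that a robot's maximum payload is 1, order center points are given as (x, y) tuples, and the function name `cluster_orders` and its arguments are hypothetical.

```python
import math


def cluster_orders(centers, capacities, max_capacity=1.0):
    """Capacity-constrained single-linkage clustering of orders.

    centers: list of (x, y) order center points.
    capacities: list of order capacities (same length as centers).
    Returns (clusters, loads), where clusters is a list of lists of order indices.
    """
    clusters = [[i] for i in range(len(centers))]
    loads = list(capacities)

    def linkage(a, b):
        # Single linkage: smallest Euclidean distance between any two members.
        return min(math.dist(centers[i], centers[j])
                   for i in clusters[a] for j in clusters[b])

    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if loads[a] + loads[b] > max_capacity:
                    continue              # merging would exceed the payload
                d = linkage(a, b)
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:                  # no feasible pair left
            break
        _, a, b = best
        clusters[a] += clusters[b]        # merge cluster b into cluster a
        loads[a] += loads[b]
        del clusters[b], loads[b]
    return clusters, loads


centers = [(0, 0), (1, 0), (10, 0), (11, 0)]
capacities = [0.5, 0.5, 0.5, 0.25]
clusters, loads = cluster_orders(centers, capacities)
```

In this toy instance the two nearby pairs are merged first, and the capacity limit then prevents any further merge. Recomputing the linkage on every pass keeps the sketch short; caching the pairwise distances would be the obvious optimization for large order sets.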

3.4. Search for Optimal Work Order Within a Cluster

Once customer orders have been clustered, each cluster is assigned to a robot such that the robot can operate within its maximum payload capacity. Each robot departs from the packing station, performs picking tasks for all products in the assigned cluster, and then returns to the packing station to deliver the collected items. All tasks are defined by product locations, requiring the robot to visit each corresponding location. This process can be modeled as a graph, where each node represents a task, and each arc denotes the travel distance between tasks. To determine the optimal visiting order within a cluster, the problem is formulated as a traveling salesman problem (TSP). The TSP is the classical optimization problem of finding the shortest tour that starts from a given point, visits each node exactly once, and returns to the starting point. In this study, the packing station serves as both the starting and ending node, so the optimal task sequence is obtained as a closed TSP circuit through the subtasks of the assigned cluster. Table 6 summarizes the parameters and decision variables used in the TSP formulation.

$$\text{Minimize} \quad \sum_{i \in V} \sum_{j \in V} (c_{ij} + S_{j})\, x_{ij} \tag{1}$$

$$\text{s.t.} \quad \sum_{j \in V,\, j \neq i} x_{ij} = 1, \quad \forall i \in V \tag{2}$$

$$\sum_{i \in V,\, i \neq j} x_{ij} = 1, \quad \forall j \in V \tag{3}$$

$$y_{j} \geq y_{i} + n\, x_{ij} - (n - 1), \quad \forall i, j \in V \; (i, j \neq 1) \tag{4}$$

The objective function, Equation (1), minimizes the sum of travel time and work time. Equations (2) and (3) ensure that each task is visited exactly once, and Equation (4) gives the subtour elimination constraints (SECs). The optimal node visiting order is determined by solving the TSP model. The travel paths between nodes are the same task-level paths calculated using the A* algorithm described in Section 3.2. Table 7 lists the Order IDs and corresponding Subtask IDs that are grouped into each Cluster ID. For example, Cluster ID 0 contains, among others, Subtask IDs 2187, 1485, and 2531 from Order IDs 2, 32, and 40. Table 8 provides the optimal execution sequence of those Subtask IDs for each cluster, as well as the Total Time required. For instance, Cluster ID 0 has a Subtask ID Sequence of 2531 → 2894 → … → 1034 → 1026, with a Total Time of 63 min. Figure 7 illustrates the optimal path for Cluster ID 0 based on Table 8, where different colors represent each order and its respective subtasks, and the coral-colored paths and arrows visualize the robot’s movement direction along the optimal execution sequence.
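
To make the objective in Equation (1) concrete, the sketch below evaluates it by exhaustive enumeration on a toy instance. This is not the MILP formulation solved in the study: enumeration replaces the solver and is only practical for a handful of nodes. The function name `best_tour`, the choice of index 0 for the packing station, and the travel and work times are hypothetical.

```python
from itertools import permutations


def best_tour(travel, work, depot=0):
    """Return (cost, tour) minimizing the sum of c_ij + S_j over a closed tour.

    travel[i][j]: travel time from node i to node j.
    work[j]: work time at node j (0 for the packing station).
    Every candidate is a single closed circuit through the depot, so no
    subtour elimination constraint is needed in this enumeration.
    """
    nodes = [v for v in range(len(travel)) if v != depot]
    best_cost, best_seq = float("inf"), None
    for perm in permutations(nodes):
        tour = (depot, *perm, depot)
        cost = sum(travel[i][j] + work[j] for i, j in zip(tour, tour[1:]))
        if cost < best_cost:
            best_cost, best_seq = cost, tour
    return best_cost, list(best_seq)


# Toy instance: node 0 is the packing station, nodes 1-3 are subtasks.
travel = [[0, 2, 9, 10],
          [2, 0, 6, 4],
          [9, 6, 0, 3],
          [10, 4, 3, 0]]
work = [0, 1, 1, 1]
cost, tour = best_tour(travel, work)
```

Because every node is visited exactly once, the work-time term contributes the same total to every tour; it shifts the objective value without changing which sequence is optimal.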

3.5. Task Allocation for Each Robot

Various objective functions can be used to assign tasks optimally to multiple robots, such as minimizing total travel distance, reducing energy consumption, or minimizing robot idle time. In this study, the objective is to minimize the makespan, which refers to the time at which all robots have completed their assigned tasks. This approach ensures that tasks are distributed in a balanced manner across all robots. Clusters are assigned to robots based on the optimal execution time of each cluster, as determined from the traveling salesman problem (TSP) described in Section 3.4. The notation used in this task allocation model is listed in Table 9.

$$\text{Minimize} \quad \max_{k \in K} \sum_{v \in V} c_{v}\, x_{kv} \tag{5}$$

$$\text{s.t.} \quad \sum_{k \in K} x_{kv} = 1, \quad \forall v \in V \tag{6}$$

$$\sum_{v \in V} c_{v}\, x_{kv} \leq \bar{T}, \quad \forall k \in K \tag{7}$$

The objective function, Equation (5), minimizes the longest completion time among the robots, so that the entire workload is finished as early as possible. A shorter makespan translates directly into higher productivity. Equation (6) ensures that each cluster is assigned to exactly one robot, and Equation (7) ensures that all robots complete their tasks within the specified time limit, $\bar{T}$.
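
The allocation model can be illustrated with a small exact search. The sketch below is a depth-first branch-and-bound over identical robots, used here in place of the MILP solver; it omits the time limit in Equation (7), and the function name `min_makespan` is hypothetical. The second data set reuses the cluster execution times reported in Table 11.

```python
import math


def min_makespan(times, n_robots):
    """Assign clusters to identical robots so that the largest load is minimal.

    times[v]: execution time of cluster v. Returns (makespan, assignment),
    where assignment[v] is the robot index of cluster v.
    """
    order = sorted(range(len(times)), key=lambda v: -times[v])  # longest first
    lower = max(max(times), math.ceil(sum(times) / n_robots))   # trivial bound
    best = {"span": float("inf"), "assign": None}
    loads = [0] * n_robots
    assign = [None] * len(times)

    def dfs(pos):
        if best["span"] == lower:
            return                        # lower bound reached: provably optimal
        if pos == len(order):
            span = max(loads)
            if span < best["span"]:
                best["span"], best["assign"] = span, assign[:]
            return
        v = order[pos]
        tried = set()
        for k in sorted(range(n_robots), key=lambda r: loads[r]):
            if loads[k] in tried:
                continue                  # equal loads are symmetric branches
            tried.add(loads[k])
            if loads[k] + times[v] >= best["span"]:
                continue                  # cannot improve the incumbent
            loads[k] += times[v]
            assign[v] = k
            dfs(pos + 1)
            loads[k] -= times[v]

    dfs(0)
    return best["span"], best["assign"]


small_span, small_assign = min_makespan([3, 3, 2, 2, 2], 2)

# Cluster execution times (min) for Cluster IDs 0-13, as listed in Table 11.
table11_times = [63, 64, 37, 73, 49, 61, 70, 59, 54, 31, 63, 53, 34, 59]
span, assignment = min_makespan(table11_times, 3)
```

The Table 11 times sum to 770 min, so no assignment to three robots can finish before ⌈770/3⌉ = 257 min of pure execution time. The per-robot execution totals implied by Table 11 (257, 257, and 256 min, excluding charging) attain that bound.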

3.6. Scheduling for Each Robot

Each robot has a finite battery capacity, and when the remaining battery level is insufficient, it cannot execute the picking tasks required for a given cluster. To address this, we define a mixed-integer linear programming (MILP) model that determines both the execution sequence of clusters and the timing and amount of battery charging, based on the robot’s assigned tasks and battery constraints. Following the robot-specific task assignment described in the previous section, tasks are balanced among robots. The objective of this model is to minimize the robot’s total task completion time, which includes both task execution and charging durations. The model allows for partial battery charging, meaning the robot can charge only the necessary amount rather than reach full capacity. All charging is assumed to occur at the packing station. Table 10 presents the parameters and decision variables used in the proposed scheduling model.

$$\text{Minimize} \quad \max_{n \in N} t_{n} \tag{8}$$

$$\text{s.t.} \quad t_{1} = c_{1} \tag{9}$$

$$t_{n} \geq c_{n} + g_{n-1} + t_{n-1}, \quad \forall n \in N \setminus \{1\} \tag{10}$$

$$b_{1} \leq F - d_{1} \tag{11}$$

$$b_{n} \leq F, \quad \forall n \in N \tag{12}$$

$$b_{n} \leq b_{n-1} - d_{n} + g_{n-1}, \quad \forall n \in N \setminus \{1\} \tag{13}$$

$$g_{n} \leq F - b_{n}, \quad \forall n \in N \tag{14}$$

The objective function, Equation (8), minimizes the time at which the robot completes its last task. Equations (9) and (10) define the time progression after executing a cluster or performing a charging operation. The time required for charging is assumed to be equal to the amount of energy charged; this assumption was adopted to simplify the MILP formulation and ensure computational tractability. In real-world applications, however, this linear relationship may not accurately reflect the actual charging behavior of lithium-ion batteries. Most commercial robots follow a non-linear charging curve where the rate of energy intake slows significantly after reaching approximately 80% capacity. Additionally, chargers may impose minimum charging thresholds, operate at variable power depending on state-of-charge and thermal conditions, or enforce constant-voltage phases. These factors can result in a longer total charging time than the model predicts. In future extensions, the framework may be enhanced by incorporating piecewise linear approximations or empirically derived charging profiles to improve the accuracy and realism of the scheduling outcomes.

Equations (11)–(13) define the battery level after each cluster execution and charging operation, ensuring that the battery does not exceed its maximum capacity and decreases in proportion to the travel distance. Equation (14) bounds the battery charging amount and allows partial charging; that is, the robot can charge only as much as is necessary to execute the clusters that follow cluster $n$. Table 11 presents the optimal scheduling results for each robot, and Figure 8 provides a visualization of the results.
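
The recursion in Equations (9)–(14) can be traced with a short sketch for a fixed cluster order. It is a simplification of the model, which also optimizes the order. The sketch assumes that the robot starts fully charged, that the battery level may not become negative, and that the robot charges just enough before each cluster; the function name `schedule_with_charging` and the example values are hypothetical.

```python
def schedule_with_charging(exec_time, energy, capacity):
    """Trace t_n, b_n, g_n for a fixed cluster order with partial charging.

    exec_time[n]: execution time c_n; energy[n]: battery use d_n;
    capacity: maximum battery level F. As in the model, charging time is
    taken to equal the amount of energy charged.
    Assumption (not stated in the model): battery level must stay >= 0.
    """
    t, b, g = [], [], []
    level = capacity          # assumed full battery before the first cluster
    clock = 0
    charge = 0                # g_{n-1}: charge taken before the current cluster
    for n, (c_n, d_n) in enumerate(zip(exec_time, energy)):
        if d_n > capacity:
            raise ValueError("a single cluster exceeds the battery capacity")
        level = level + charge - d_n          # Equation (13) with equality
        clock = clock + charge + c_n          # Equation (10) with equality
        upcoming = energy[n + 1] if n + 1 < len(energy) else 0
        charge = max(0, upcoming - level)     # partial charge, within Eq. (14)
        t.append(clock)
        b.append(level)
        g.append(charge)
    return t, b, g


# Hypothetical values: three clusters, battery capacity of 100 units.
t, b, g = schedule_with_charging([63, 70, 59], [40, 50, 30], 100)
```

In this example the robot needs no charge before the second cluster and charges 20 units before the third, finishing at time 212. Charging only the shortfall is what distinguishes partial charging from a policy that always recharges to $F$.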

4. Numerical Experiment and Results

4.1. Experimental Configuration

To evaluate the proposed framework, experiments were conducted by varying the number of robots and orders. Each combination of robot and order configurations was executed five times, and the average time required for each step of the framework was recorded. For Step 1 (path and distance search between products), the time required was initially high due to the absence of pre-stored data. However, as more product-to-product distances and paths were computed and cached, the search time decreased significantly.

By the final iteration, when all required paths and distances were already stored, the search time became negligible. Because of this, Step 1 was evaluated separately from the rest of the framework. It is also worth noting that the algorithm searches only for products relevant to the current orders, not for the entire product set, making it efficient even in early-stage use. In this experiment, 100 orders were randomly generated, and Step 1 (distance/path search) was repeated five times to simulate data accumulation and reuse. The entire framework was implemented in Python (version 3.11.5), and MILP models were solved using IBM CPLEX Solver (version 22.1.0.0).

4.2. Numerical Experiment 1—Distance and Path Search Time

In the first iteration, when no distance or path data had been stored, Step 1 required approximately 9 min. In the second iteration, previously computed distances and paths were reused, and only new ones were searched, reducing the time to around 7 min. By the fifth iteration, with most paths already stored, the search time dropped to about 2 min and 30 s. For comparison, a full search of all possible grid locations (without any reuse) required approximately 35 min. Table 12 summarizes the time required per iteration. Figure 9 and Figure 10 visualize the results after the 1st and 5th iterations, respectively. In these figures, green dots indicate tasks with completed search results, where the paths and distances were successfully cached.
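
The reuse of stored results can be pictured as a lookup table in front of the path search. The sketch below is an assumption about how such a cache might look, not the authors' data structure: it treats distances as symmetric, stores each pair once, and reverses the stored path when the pair is requested in the opposite direction. The class name `PathCache` and the stand-in search function are hypothetical.

```python
class PathCache:
    """Memoize (distance, path) results of a pairwise path search."""

    def __init__(self, search):
        self._search = search     # callable (a, b) -> (distance, path)
        self._store = {}
        self.misses = 0           # number of searches actually executed

    def get(self, a, b):
        # Assumption: travel is symmetric, so each unordered pair is stored once.
        key = (a, b) if a <= b else (b, a)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._search(*key)
        distance, path = self._store[key]
        return (distance, path) if key == (a, b) else (distance, path[::-1])


def manhattan_search(a, b):
    # Stand-in for the A* search; returns a distance and a two-point "path".
    return abs(a[0] - b[0]) + abs(a[1] - b[1]), [a, b]


cache = PathCache(manhattan_search)
first = cache.get((0, 0), (2, 3))
second = cache.get((2, 3), (0, 0))    # served from the cache, path reversed
```

Only pairs that appear in the current orders are ever searched, which matches the behavior described above: the number of cache misses shrinks from one iteration to the next.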

4.3. Numerical Experiment 2—Framework Execution Time and Scalability

Table 13 presents the execution time required for each stage of the proposed framework. Each row corresponds to an independent experiment conducted with a different number of robots under the same order quantity. As the number of robots increases, the MILP-based task allocation and scheduling steps become more computationally intensive due to the growth in decision variables and constraints. Therefore, higher robot counts may lead to longer computation times despite offering greater flexibility. The distance and path search between products (Step 1) was executed only once and reused throughout the experiment, as described in Experiment 1. Among all steps, the TSP-based path planning within clusters required the longest computation time. This step was executed sequentially in the experiment; however, if performed in parallel, the total time can be significantly reduced. For instance, in the case involving 1000 orders, the TSP step takes only about 7 s when parallelized, which is the longest computation time for a single cluster. Similarly, the scheduling step can also be parallelized. As a result, the framework is scalable: for large-scale scenarios such as 100 robots and 1000 orders (approximately 4000 subtasks), the entire process, from clustering to scheduling, can be completed in approximately 5 min when the TSP step is parallelized (309.2 s in Table 13, compared with 699.2 s for sequential TSP computation).

While numerous MRTA approaches have been proposed in the literature, most assume robotic mobile fulfillment systems (RMFS), dynamic environments, or reinforcement learning-based frameworks, which differ substantially from the static, zone-picking warehouse setting addressed in this study. These methods often handle single-subtask orders and rely on dynamic re-planning under uncertainty, whereas our model targets a structured multi-subtask assignment under spatial and battery constraints. In contrast to prior MRTA methods that are typically designed for one-time task allocation or reactive policies, our framework is designed for sustained use in real warehouse environments. By leveraging a batch-based optimization structure and reusable distance data, it supports repeated scheduling cycles with long-term efficiency and operational continuity in mind. Due to the lack of publicly available benchmark datasets that match this problem configuration, direct numerical comparisons with existing methods were not included. Nevertheless, the proposed framework scaled to 1000 orders and 100 robots, completing clustering, routing, task allocation, and scheduling within approximately five minutes when the TSP step is parallelized. This level of performance highlights the framework’s potential for deployment in real-world, large-scale logistics systems.
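
As a check on the largest configuration in Table 13 (1000 orders, 100 robots), the totals can be recomputed from the per-step times for clustering, TSP, task allocation, and scheduling. The labels $T_{\text{seq}}$ and $T_{\text{par}}$ are introduced here only for this illustration.

```latex
\begin{aligned}
T_{\text{seq}} &= 142 + 397 + 156 + 4.2 = 699.2\ \text{s} \approx 11.7\ \text{min},\\
T_{\text{par}} &= 142 + 7 + 156 + 4.2 = 309.2\ \text{s} \approx 5.2\ \text{min}.
\end{aligned}
```

Once the TSP step is parallelized, clustering (142 s) and task allocation (156 s) account for most of the remaining time.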

5. Conclusions and Future Studies

In this study, we proposed a five-step optimization framework to address the task planning problem of multiple logistics robots operating in a zone-picking logistics center. In Step 1, the subtasks corresponding to products in each order were defined by their location, and the distances and paths between tasks were computed using the A* algorithm. The computed distances and paths were stored and reused to progressively reduce the computation time. In Step 2, clustering was performed based on the center location and capacity of products in each order to narrow the problem’s search space and further reduce computation time. In Step 3, the optimal path for executing tasks within each cluster was derived by formulating the traveling salesman problem (TSP). In Step 4, a mixed integer linear programming (MILP) model was constructed to assign clusters to robots in a way that minimizes the makespan (i.e., the completion time of all tasks across robots). Each robot’s assigned clusters were selected based on this objective. In Step 5, we incorporated battery constraints by optimizing the execution order, charging time, and charging amount for each robot, with the goal of minimizing individual task completion time. We validated the proposed framework through numerical experiments with varying scales, from 50 to 1000 orders and 3 to 100 robots, and demonstrated that the task assignment process can be completed within approximately 5 min, even for large-scale scenarios. Furthermore, since the makespan is used as the objective function, the model can be extended to determine the number of robots required to meet a specific deadline.

The framework is also highly flexible and adaptable, as it is not dependent on a fixed logistics center structure or environment, and can be easily modified to suit various settings. While the framework is developed based on a zone-picking warehouse environment where manual picking is prevalent, it is inherently designed to support robotic systems. Rather than merely assigning tasks, the framework generates complete execution trajectories—including detailed paths and optimized sequences—which can be directly interpreted by robotic control modules. As robotic picking technologies continue to evolve and gain adoption, this trajectory-level framework is well-positioned for integration into fully autonomous logistics operations without structural modifications. In summary, this study presents a practically relevant and scalable framework that reflects real-world logistics operations more closely than traditional ST-SR models. While many existing approaches focus on one-time optimization of incoming orders, our framework is designed for ongoing warehouse operations, enabling dynamic and repeated application based on accumulated path data and system integration. This highlights its novelty and potential for real deployment in intelligent logistics environments.

Limitations and Future Work

While the proposed framework demonstrates strong applicability, several simplifying assumptions limit its direct transfer to complex real-world settings. First, the environment is considered static, without dynamic obstacles or layout changes. Second, variability in worker behavior and productivity was not considered; all zones were assumed to operate with uniform capacity and speed. Third, orders and item locations were generated randomly rather than derived from real operational data. In future research, these limitations can be addressed by incorporating dynamic elements such as real-time order inflow, zone congestion, or stochastic travel times. Additionally, empirical datasets from actual warehouse operations may be integrated to validate the framework’s performance under realistic scenarios. Finally, the framework can be extended to accommodate hybrid systems that combine both manual and robotic picking processes. In addition, because the proposed framework is composed of five modular stages—from distance computation to scheduling under battery constraints—each component can be independently improved or replaced. Future work may incorporate more advanced algorithms for clustering, routing, or scheduling, as long as they align with the static MT-SR structure. This modularity ensures the framework’s adaptability to evolving technologies and operational demands over time.

Author Contributions

Conceptualization, H.K. and B.C.; methodology, H.K.; software, B.C. and M.K.; validation, H.K., B.C. and M.K.; formal analysis, H.K. and B.C.; investigation, H.K. and B.C.; resources, H.K. and B.C.; data curation, B.C. and M.K.; writing—original draft preparation, H.K. and B.C.; writing—review and editing, H.K. and B.C.; visualization, B.C. and M.K.; supervision, H.K.; project administration, H.K.; funding acquisition, H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Financial Program of Basic Protection Science Fields at Changwon National University in 2023 and the 5th Educational Training Program for the Shipping, Port and Logistics from the Ministry of Oceans and Fisheries.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1.	Kim, J.-Y. A Study on Implement of QRcode Recognition System for Localization and Work Efficiency of Logistics Robot. Master’s Thesis, Hankyong National University, Anseong, Republic of Korea, 2019.
2.	Seo, J.-H. Multi-Robot Path Planning for Dynamic Environments Including Multiple Waypoints Based on Genetic Algorithms and Reinforcement Learning. Master’s Thesis, Kyungpook National University, Daegu, Republic of Korea, 2023.
3.	Hana Securities Co., Ltd. Logistics Robots (AMR) Overweight: Why Invest in Logistics Robots Now? Available online: https://www.hanaw.com/download/research/FileServer/WEB/industry/industry/2023/05/12/230512_AMR_industry.pdf (accessed on 26 May 2024).
4.	Precedence Research. Logistics Robotics Market Size, Share, and Trends 2024 to 2033. Available online: https://www.precedenceresearch.com/logistics-robotics-market (accessed on 2 July 2024).
5.	De Koster, R.; Le-Duc, T.; Roodbergen, K.J. Design and control of warehouse order-picking: A literature review. Eur. J. Oper. Res. 2007, 182, 481–501.
6.	Yuan, R.; Li, J.; Wang, X.; He, L. Multirobot Task Allocation in e-Commerce Robotic Mobile Fulfillment Systems. Math. Probl. Eng. 2021, 2021, 6308950.
7.	Azadeh, K.; Roy, D.; de Koster, R.; Khalilabadi, S.M.G. Zoning strategies for human–robot collaborative picking. Decis. Sci. 2023, 56, 50–70.
8.	Azadeh, K.; De Koster, R.; Roy, D. Robotized Warehouse Systems: Developments and Research Opportunities; ERIM report series research in management, ERS-2017-009-LIS; Erasmus Research Institute of Management: Rotterdam, The Netherlands, 2017.
9.	Gerkey, B.P.; Matarić, M.J. A formal analysis and taxonomy of task allocation in multi-robot systems. Int. J. Robot. Res. 2004, 23, 939–954.
10.	Agrawal, A.; Bedi, A.S.; Manocha, D. Rtaw: An attention inspired reinforcement learning method for multi-robot task allocation in warehouse environments. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 1393–1399.
11.	Sarkar, C.; Paul, H.S.; Pal, A. A scalable multi-robot task allocation algorithm. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 5022–5027.
12.	Yuan, R.; Dou, J.; Li, J.; Wang, W.; Jiang, Y. Multi-robot task allocation in e-commerce RMFS based on deep reinforcement learning. Math. Biosci. Eng. MBE 2022, 20, 1903–1918.
13.	Oh, S. Multi-Agent Route Optimization for Robotic Mobile Fulfillment Systems. Master’s Thesis, Seoul National University, Seoul, Republic of Korea, 2020.
14.	Chen, X.; Li, Y.; Liu, L. A coordinated path planning algorithm for multi-robot in intelligent warehouse. In Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), Dali, China, 6–8 December 2019; pp. 2945–2950.
15.	Yang, S.; Zhang, Y.; Ma, L.; Song, Y.; Zhou, P.; Shi, G.; Chen, H. A novel maximin-based multi-objective evolutionary algorithm using one-by-one update scheme for multi-robot scheduling optimization. IEEE Access 2021, 9, 121316–121328.
16.	Shetty, N.; Sah, B.; Chung, S.H. Route optimization for warehouse order-picking operations via vehicle routing and simulation. SN Appl. Sci. 2020, 2, 311.
17.	Žulj, I.; Salewski, H.; Goeke, D.; Schneider, M. Order batching and batch sequencing in an AMR-assisted picker-to-parts system. Eur. J. Oper. Res. 2022, 298, 182–201.
18.	Martin, J.G.; Muros, F.J.; Maestre, J.M.; Camacho, E.F. Multi-robot task allocation clustering based on game theory. Robot. Auton. Syst. 2023, 161, 104314.
19.	Tihanyi, D.; Lu, Y.; Karaca, O.; Kamgarpour, M. Multi-robot task allocation for safe planning against stochastic hazard dynamics. In Proceedings of the 2023 European Control Conference (ECC), Bucharest, Romania, 13–16 June 2023; pp. 1–6.
20.	Paul, S.; Ghassemi, P.; Chowdhury, S. Learning scalable policies over graphs for multi-robot task allocation using capsule attention networks. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 8815–8822.
21.	Shibata, K.; Jimbo, T.; Odashima, T.; Takeshita, K.; Matsubara, T. Learning locally, communicating globally: Reinforcement learning of multi-robot task allocation for cooperative transport. IFAC-PapersOnLine 2023, 56, 11436–11443.
22.	Park, B.; Kang, C.; Choi, J. Cooperative multi-robot task allocation with reinforcement learning. Appl. Sci. 2021, 12, 272.
23.	Cai, Y. Online Scheduling of Operator Assistance for Multi-Robot Teams with Uncertain Robot Capabilities and Environments. Master’s Thesis, University of Waterloo, Waterloo, ON, Canada, 2023.
24.	Hussein, A.; Khamis, A. Market-based approach to multi-robot task allocation. In Proceedings of the 2013 International Conference on Individual and Collective Behaviors in Robotics (ICBR), Sousse, Tunisia, 15–17 December 2013; pp. 69–74.
25.	Choi, H.L.; Brunet, L.; How, J.P. Consensus-based decentralized auctions for robust task allocation. IEEE Trans. Robot. 2009, 25, 912–926.
26.	Dimmig, C.A.; Wolfe, K.C.; Moore, J. Multi-robot planning on dynamic topological graphs using mixed-integer programming. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 5394–5401.
27.	Hong, S.; Hwang, I.; Jang, Y.J. Practical q-learning-based route-guidance and vehicle assignment for oht systems in semiconductor fabs. IEEE Trans. Semicond. Manuf. 2022, 35, 385–396.
28.	Chung, S.Y.; Hwang, M.J.; Yoon, H.J. Task Allocation Method for Multiple Manipulators to Insert Pem Nuts in Press Forming Process; Institute of Control, Robotics and Systems (ICROS): Seoul, Republic of Korea, 2016; pp. 102–103.
29.	Ghassemi, P.; Chowdhury, S. Multi-robot task allocation in disaster response: Addressing dynamic tasks with deadlines and robots with range and payload constraints. Robot. Auton. Syst. 2022, 147, 103905.
30.	Shelkamy, M.; Elias, C.M.; Mahfouz, D.M.; Shehata, O.M. Comparative analysis of various optimization techniques for solving multi-robot task allocation problem. In Proceedings of the 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES), Giza, Egypt, 24–26 October 2020; pp. 538–543.
31.	Liu, X.F.; Lin, B.C.; Zhan, Z.H.; Jeon, S.W.; Zhang, J. An efficient ant colony system for multi-robot task allocation with large-scale cooperative tasks and precedence constraints. In Proceedings of the 2021 IEEE Symposium Series on Computational Intelligence (SSCI), Orlando, FL, USA, 5–7 December 2021; pp. 1–8.
32.	Kong, X.; Gao, Y.; Wang, T.; Liu, J.; Xu, W. Multi-robot task allocation strategy based on particle swarm optimization and greedy algorithm. In Proceedings of the 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 24–26 May 2019; pp. 1643–1646.
33.	Patle, B.K.; Babu, G.; Pandey, A.; Parhi, D.R.K.; Jagadeesh, A. A review: On path planning strategies for navigation of mobile robot. Def. Technol. 2019, 15, 582–606.
34.	Zhang, H.Y.; Lin, W.M.; Chen, A.X. Path planning for the mobile robot: A review. Symmetry 2018, 10, 450.
35.	Qin, H.; Shao, S.; Wang, T.; Yu, X.; Jiang, Y.; Cao, Z. Review of autonomous path planning algorithms for mobile robots. Drones 2023, 7, 211.

Figure 1. Logistics robot market size, 2023 to 2033 (USD billion) [4].

Figure 2. Example of logistics processing in RMFS [6]. The numbers in the figure represent the sequential steps in the robot’s logistics processing workflow.

Figure 3. Example of the zone-picking method.

Figure 4. Task scheduling framework for multiple logistics robots.

Figure 5. Example of an A* algorithm-based movement path between products.

Figure 6. Visualization of clustering results: (a) Cluster 0, (b) Cluster 1. The different colors represent individual customer orders, with dots indicating the subtasks within each order. Each star symbol shows the center point of an order (used for clustering).

Figure 7. Example of optimal task execution path for Cluster 0 based on TSP solution.

Figure 8. Visualization of task execution and charging sequence for each robot.

Figure 9. Visualization of results from 1 iteration. Green circles indicate tasks with completed search results.

Figure 10. Visualization of results from 5 iterations. Green circles indicate tasks with completed search results.

Table 1. Concept diagram and main functions of ACS.

Higher System Interface	Interfaces with upper systems (e.g., ERP, MES, WMS) to receive work orders, report results, and transmit status of ACS and logistics robots.
Path finding	Calculates optimal routes based on departure/arrival nodes to ensure efficient task execution.
Task allocation	Assigns and adjusts work orders based on robot status and workload.
Traffic control	Prevents congestion by predicting interference zones and rerouting robots to avoid delays.
Charging	Issues charging commands based on battery level and robot activity state (e.g., moving, standby) to maintain continuous operation.
Logistics robot control system operation concept

Table 2. Previous research on multi-robot task allocation in a distribution center.

Ref.	Method	Approach	Techniques	MRTA Type
[2]	GTP	Task allocation and path planning	Meta-heuristic, Reinforcement learning	ST-SR
[6]	GTP	Task scheduling	Meta-heuristic
[10]	GTP	Task allocation	Reinforcement learning
[11]	GTP	Task clustering	Clustering
[12]	GTP	Task allocation	Reinforcement learning
[13]	GTP	Path planning	MILP
[14]	GTP	Path planning	Path planning
[15]	GTP	Multi-robot scheduling	Meta-heuristic
[16]	PTP	Route optimization	MILP
[17]	PTP	Batching and sequencing	Heuristic

Table 3. Previous research on multi-robot task allocation in various environments.

Ref.	Environment	Approach	Techniques	MRTA Type
[18]	Thermal power plant	Task clustering and assignment	Game theory	ST-SR
[19]	Dynamic risky environment	Risk-aware task allocation	MDP	ST-SR
[20]	Fixed deadline tasks	Reinforcement learning	Graph RL, CapAM	ST-SR
[21]	Multi-robot coordination	Task selection	MDP + RL	ST-SR
[22]	Scalable robot/task space	Cross-attention RL	Deep RL	ST-SR
[23]	Human-robot system	Assisted task scheduling	Operator support model	MT-SR
[24]	Heterogeneous robot system	Auction-based allocation	Utility function	ST-SR
[25]	Multi-tasking system	Multi-allocation auction	Consensus bundle algorithm	MT-SR
[26]	Cooperative group planning	Task planning	Dynamic topology graph	ST-SR
[27]	Semiconductor transport	Vehicle-path assignment	Q-learning	ST-SR
[28]	Robotized assembly	Task sequencing	Collision-aware scheduling	ST-SR
[29]	Task-range-payload constraints	Multi-tour scheduling	Bipartite graph matching	ST-SR
[30]	General MRTA	Solver comparison	GA, ACO	ST-SR
[31]	Time-extended cooperation	MRTA with constraints	ACO	ST-SR
[32]	Large-scale MRTA	Task-robot pairing	PSO + Greedy	ST-SR

Table 4. Example of generated order and product unit subtasks.

Order ID	Subtask IDs	Center Point (x, y)	Capacity
0	1034, 2827	(32, 10)	0.09
1	1387, 1764, 1112, 1029	(22, 18)	0.21
2	2187, 1485, 2856, 1035, 744	(27, 29)	0.32
⋮	⋮	⋮	⋮
47	670, 2199, 1747, 2567, 2546, 2907, 1841	(34, 28)	0.40
48	755, 1064, 2144	(21, 41)	0.19
49	1395, 2917	(36, 26)	0.17

Table 5. Result of clustering.

Cluster ID	Order IDs	Total Capacity
0	2, 32, 40	0.91
1	4, 18, 47, 20	0.96
2	5, 14	0.62
3	6, 41, 48, 38	0.98
⋮	⋮	⋮
10	17, 45, 46, 28, 3	0.97
11	19, 21, 1, 37, 27	0.82
12	24	0.54
13	25, 43, 31, 22	0.89

Table 6. Notation of parameters and decision variables for the proposed TSP.

Parameters and Decision Variables
$V$	Set of subtask nodes
$n$	Number of subtask nodes
$c_{ij}$	Travel time from node $i$ to node $j$
$S_{j}$	Work time at node $j$
$x_{ij}$	1 if the robot moves from node $i$ to node $j$; 0 otherwise
$y_{i}$	Artificial variable used to prevent subtours

Table 7. Subtasks within a cluster.

Cluster ID	Order IDs	Subtask IDs
0	2, 32, 40	2187, 1485, 2856, 1035, 744, 2557, 1773, 1118, 1026, 1418, 2894, 2138, 679, 2509, 1058, 1061, 1034, 2531
1	4, 18, 47, 20	2888, 1126, 3206, 1094, 2135, 1086, 2925, 670, 2199, 1747, 2567, 2546, 2907, 1841, 2885, 1072
2	5, 14	3217, 2123, 1114, 1852, 2895, 2905, 2488, 2212, 2574, 1752
⋮	⋮	⋮
11	19, 21, 1, 37, 27	378, 745, 733, 1387, 1764, 1112, 1029, 1452, 1774, 725, 2857, 370, 1029, 1462, 1819
12	24	738, 1065, 2512, 1059, 2481, 2846, 2830, 2493
13	25, 43, 31, 22	3218, 1043, 379, 1814, 2183, 1085, 1477, 1777, 2474, 1752, 2472, 2893, 1386, 1425, 705, 1760

Table 8. Work order and total time by cluster.

Cluster ID	Subtask ID Sequence	Total Time (min)
0	2531 → 2894 → 2856 → 2557 → 2509 → 2138 → 2187 → 1773 → 1418 → 1485 → 1061 → 1118 → 1058 → 744 → 679 → 1035 → 1034 → 1026	63
1	2885 → 2888 → 3206 → 2907 → 2546 → 2135 → 2199 → 2925 → 2567 → 1841 → 1072 → 1126 → 1094 → 670 → 1086 → 1747	64
2	1752 → 1114 → 1852 → 2212 → 2574 → 3217 → 2905 → 2895 → 2488 → 2123	37
⋮	⋮	⋮
11	1029 → 1029 → 725 → 370 → 378 → 733 → 745 → 1112 → 1774 → 2857 → 1764 → 1819 → 1462 → 1452 → 1387	53
12	738 → 1059 → 1065 → 2512 → 2493 → 2846 → 2481 → 2830	34
13	1752 → 1814 → 1760 → 2183 → 2474 → 2472 → 2893 → 3218 → 1777 → 1477 → 1425 → 705 → 379 → 1043 → 1085 → 1386	59

Table 9. Notation of parameters and decision variables for the proposed task allocation.

Symbol	Description
$K$	Set of robots
$V$	Set of clusters
$c_{v}$	Time taken to perform cluster $v$
$\bar{T}$	Upper bound on the time at which all robots have finished their work (makespan)
$x_{kv}$	1 if robot $k$ performs cluster $v$; 0 otherwise
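
With the symbols of Table 9, a standard makespan-minimizing assignment model can be written as below. This is a sketch of the textbook formulation that this notation suggests, under the assumption that every cluster is assigned to exactly one robot; the constraints of the model in Section 3.5 may differ in detail.

```latex
\begin{align}
\min \quad & \bar{T} \\
\text{s.t.} \quad & \sum_{k \in K} x_{kv} = 1, && \forall v \in V \\
& \sum_{v \in V} c_{v}\, x_{kv} \le \bar{T}, && \forall k \in K \\
& x_{kv} \in \{0, 1\}, && \forall k \in K,\ v \in V
\end{align}
```

The second constraint forces $\bar{T}$ to be at least the workload of the busiest robot, so minimizing $\bar{T}$ balances the total cluster time across robots.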

Table 10. Notation of parameters and decision variables for proposed scheduling.

Symbol	Description
$N$	Set of clusters assigned to a robot
$c_{n}$	Execution time for cluster $n$
$d_{n}$	Distance traveled for cluster $n$
$F$	Maximum battery capacity of the robot
$b_{n}$	Battery level after completing cluster $n$
$t_{n}$	Time after completing cluster $n$
$g_{n}$	Battery charged after cluster $n$
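
The quantities in Table 10 can be read as the state of a simple forward simulation of one robot. The sketch below rolls $b_{n}$ and $t_{n}$ forward over a fixed cluster order for given charging amounts $g_{n}$. It rests on assumptions that are not taken from the paper: energy use proportional to distance (`use_per_m`), a constant charging rate (`charge_per_min`), and charging that takes place immediately after a cluster. The function name, parameter names, and numbers are all hypothetical, and the code checks a given schedule rather than optimizing one.

```python
def simulate(clusters, charges, capacity, use_per_m, charge_per_min):
    """Roll battery level b_n and clock t_n forward over one robot's clusters.

    clusters: list of (c_n, d_n) = (execution time in minutes, distance in metres)
    charges:  list of g_n, battery units added after each cluster
    capacity: F, the maximum battery level (the robot starts full)
    Returns a list of (b_n, t_n) recorded right after each cluster.
    Raises ValueError if the battery would drop below zero.
    """
    b, t = capacity, 0.0
    history = []
    for (c_n, d_n), g_n in zip(clusters, charges):
        b -= use_per_m * d_n            # assumed linear consumption
        if b < 0:
            raise ValueError("battery exhausted during a cluster")
        t += c_n
        history.append((b, t))          # state after completing cluster n
        g_n = min(g_n, capacity - b)    # cannot charge beyond F
        b += g_n
        t += g_n / charge_per_min       # assumed constant charging rate
    return history


# Hypothetical robot: F = 1000 units, 1 unit per metre, 20 units per minute.
clusters = [(60, 600), (70, 500), (59, 400)]
charges = [200, 300, 0]
history = simulate(clusters, charges, capacity=1000,
                   use_per_m=1, charge_per_min=20)
print(history)
```

Partial charging shows up here as `charges` values smaller than the amount needed to refill the battery: the robot tops up only enough to complete the remaining clusters.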

Table 11. Task performance sequence and end time for each robot.

Robot	Cluster Execution Order (Execution Time, min)	End Time (min)
1	0(63) → 6(70) → Charging (19) → 7(59) → Charging (32) → 9(31) → 12(34)	308
2	1(64) → 2(37) → Charging (43) → 4(49) → 8(54) → Charging (15) → 11(53)	315
3	3(73) → 5(61) → Charging (44) → 10(63) → 13(59)	300
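
The end times in Table 11 can be reproduced by adding up the durations in each robot's sequence. The short script below does this arithmetic on the rows of the table, separating cluster work from charging. It assumes that steps run back to back with no idle time; the helper name `summarize` is illustrative.

```python
import re

# Rows of Table 11, copied as printed (durations in minutes).
schedules = {
    1: "0(63) → 6(70) → Charging (19) → 7(59) → Charging (32) → 9(31) → 12(34)",
    2: "1(64) → 2(37) → Charging (43) → 4(49) → 8(54) → Charging (15) → 11(53)",
    3: "3(73) → 5(61) → Charging (44) → 10(63) → 13(59)",
}


def summarize(schedule):
    """Return (work minutes, charging minutes, end time) for one robot."""
    work = charge = 0
    for step in schedule.split("→"):
        # The duration is the number inside the parentheses of each step.
        minutes = int(re.search(r"\((\d+)\)", step).group(1))
        if step.strip().startswith("Charging"):
            charge += minutes
        else:
            work += minutes
    return work, charge, work + charge


summary = {robot: summarize(s) for robot, s in schedules.items()}
for robot, (work, charge, end) in summary.items():
    print(f"Robot {robot}: work {work}, charging {charge}, end {end}")
```

The sums match the End Time column (308, 315, and 300). The cluster work alone is 257, 257, and 256 minutes, so the 770 minutes of total work are split almost evenly and the differences in end time come from charging.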

Table 12. Distance and path-finding time per iteration.

Iteration	1	2	3	4	5	Full Work
Time required	8 m 57 s	6 m 56 s	5 m 15 s	4 m 3 s	2 m 37 s	About 35 m

Table 13. Numerical experiment 2: Step-by-step time (seconds) results.

Number of Orders (Subtasks)	Number of Robots	Clustering	TSP	Task Allocation	Scheduling	Total
50 (203)	3	0.02 s	5 s	0.1 s	0.1 s	5.22 s
50 (203)	5	0.02 s	5 s	0.1 s	0.2 s	5.32 s
100 (409)	3	0.14 s	8 s	0.1 s	0.2 s	8.44 s
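100 (409)	5	0.14 s	8 s	0.1 s	0.3 s	8.54 s
100 (409)	10	0.14 s	8 s	0.2 s	0.5 s	8.84 s
300 (1213)	5	4 s	65 s (5 s)	0.3 s	0.6 s	69.9 s (9.9 s)
300 (1213)	10	4 s	65 s (5 s)	1.2 s	1.0 s	71.2 s (11.2 s)
300 (1213)	30	4 s	65 s (5 s)	4.2 s	2.5 s	75.7 s (15.7 s)
1000 (4007)	30	142 s	397 s (7 s)	19 s	2.4 s	560.4 s (170.4 s)
1000 (4007)	50	142 s	397 s (7 s)	40 s	3.0 s	582.0 s (192 s)
1000 (4007)	100	142 s	397 s (7 s)	156 s	4.2 s	699.2 s (309.2 s)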

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

