1. Introduction
In recent years, with the rapid development and popularization of digitization, informatization, electronics, and networking technologies, e-commerce has experienced significant growth. This has led to the emergence of new characteristics in customer orders such as small sizes, large assortments, large quantities, and time sensitivity [
1,
2]. Against this background, distribution centers often face immense pressure and challenges in efficiently processing a great number of orders within a limited time. Achieving rapid order picking is key to relieving the operational pressure of distribution centers. The intelligence of warehousing systems enables pickers to quickly locate the storage locations of the products to be picked using handheld electronic devices [
3]. In this context, the traditional continuous storage mode gradually loses its advantages, and the scattered storage mode emerges accordingly [
4]. Under the scattered storage mode [
2], items of the same SKU are no longer necessarily stored together but can be stored in different storage locations. Scattered storage increases the chance that a required product is stored near the picker, thereby reducing the picking distance. Based on different methods of product movement, there exist two types of order picking systems under the scattered storage mode: picker-to-parts and parts-to-picker [
5]. Although the parts-to-picker system has attracted the attention of researchers in recent years, so far most small- and medium-sized enterprises lack sufficient economic strength or are not able to take the necessary risks for its practical implementation [
6]. Therefore, studying the order batching and picking problems of distribution centers under the scattered storage mode in the picker-to-parts system (manual picking system) has important practical significance.
Storage location assignment and order picking are still the key problems affecting the operational efficiency of distribution centers under the scattered storage mode. Many scholars have conducted research on storage location assignment problems under the scattered storage mode. Weidinger and Boysen [
3] first introduced the concept of the scattered storage mode in their paper and established a mathematical model for the scattered storage location assignment problem (SSLAP) with the objective of minimizing the maximum distances of SKUs to some measuring point. On this basis, Jiang et al. [
7] established different 0–1 integer programming models considering the product correlation for the SSLAP and designed effective intelligent algorithms to solve large-scale problems. A comparison between the scattered storage mode and the traditional continuous storage mode was conducted through numerical experiments. The results show that the scattered storage mode has greater advantages under the current order characteristics. The above-mentioned papers all assume that the quantity of products stored at each location is sufficient. Liu and Poh [
8] further considered the storage capacity constraint at each location and introduced “SKU-specific bully locations” to handle the SSLAP when there are abnormal fluctuations in order quantities, which can enhance the flexibility of the scattered storage strategies. Albán et al. [
9] developed a mathematical model for the SSLAP to minimize the sum of pairwise distances between product locations (including a depot) for the same orders for a warehouse layout that included multiple depots, and an MIP solver was used to solve the model. An efficient variable neighborhood search metaheuristic was developed by Albán et al. [
10] to solve large-scale problems. The experimental results show that their proposed policy outperforms both the volume-based policy and the random scatter policy. To sum up, the scattered storage mode is suitable for B2C distribution centers in which there is a large assortment and quantity of small-sized, time-sensitive orders.
Order picking is one of the most time-consuming and labor-intensive operations among the various operations in B2C distribution centers, accounting for 50% to 55% of the entire warehouse’s operating costs [
11,
12]. To enhance the efficiency of manual order picking in B2C distribution centers and complete order picking within the time windows required by customers, it is essential to minimize the walking distance for pickers as much as possible. Although adopting the scattered storage mode can reduce picking times to a certain extent, the operations of B2C distribution centers often combine different picking methods based on actual conditions to further improve the picking efficiency. Depending on the number of orders to be picked each time, order picking methods can be categorized into single order picking and batch order picking. Single order picking has a lower error rate and is easier to implement, while batch order picking is more efficient [
13]. With the rise and continuous development of online shopping, distribution centers often need to process hundreds of orders within a short time, and each order contains a small number of products. In such cases, single order picking may cause pickers to walk the same routes or access the same storage locations repeatedly. Therefore, batch order picking is commonly used to improve the order picking efficiency in practice [
14]. Obtaining high-quality batching results is key to enhancing the efficiency of batch picking.
At present, there is a scarcity of studies on the order batching problem under the scattered storage mode. However, the objective function expressions of the order batching models under the scattered storage mode are essentially similar to those of traditional order batching models. Traditional order batching models can be roughly classified into three types based on their objective functions. The first type aims to minimize the distance of order picking. Since the length of picking routes is involved, there is usually a joint optimization of order batching and routing problems [
15,
16]. The second type focuses on minimizing the order fulfillment time. This type can be further divided into two main research directions: (1) reducing the total order picking time to improve the order fulfillment efficiency [
17,
18,
19]; and (2) reducing the total tardiness of all customer orders to improve customer satisfaction [
20,
21]. The third type aims at maximizing the similarity between orders by grouping together orders with a high similarity within the same batch. Order similarity is measured by the quantity of the same products in two orders [
22,
23].
For the order batching problems under the scattered storage mode, Yang et al. [
24] developed an order batching model with the objective of minimizing the total picking distance for a given set of orders, corresponding to the first type of the traditional batching model mentioned above. Rasmi et al. [
25] developed an order batching model with the objective of minimizing the total order picking time, corresponding to the second type of the traditional batching model mentioned above. In this paper, we will establish a model for order batching to maximize the sum of pair-to-pair order correlations in all batches, corresponding to the third type of the traditional batching model mentioned above. Different from the third type of the traditional batching model mentioned above, this paper uses the order correlations to cluster orders. In the clustering, the same products are considered as well as the internal relationships among different products.
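As a minimal sketch of the objective described above, the following Python snippet evaluates the sum of pair-to-pair order correlations over all batches. The correlation matrix `corr`, its values, and the function name `batch_correlation_score` are illustrative assumptions; the definition of order correlation used in this paper is given in Section 2 and is not reproduced here.

```python
from itertools import combinations


def batch_correlation_score(batches, corr):
    """Sum corr[i][j] over all unordered pairs of orders that share a batch."""
    total = 0.0
    for batch in batches:
        for i, j in combinations(sorted(batch), 2):
            total += corr[i][j]
    return total


# Hypothetical symmetric correlation values for four orders (indexed 0..3).
corr = [
    [0.0, 0.5, 0.25, 0.0],
    [0.5, 0.0, 0.0, 0.25],
    [0.25, 0.0, 0.0, 0.75],
    [0.0, 0.25, 0.75, 0.0],
]

# Two candidate batchings of the same four orders.
score_a = batch_correlation_score([[0, 1], [2, 3]], corr)
score_b = batch_correlation_score([[0, 3], [1, 2]], corr)
```

Under these assumed values, the first batching groups strongly correlated orders together and therefore obtains the higher score, which is the behavior the objective is meant to reward.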
It is widely acknowledged that the traditional order batching problem is NP-hard [
26], and the order batching problem under the scattered storage mode is also NP-hard. Therefore, it is necessary to design effective intelligent optimization algorithms for the order batching problem under the scattered storage mode. Yang et al. [
24] studied the order batch picking optimization problem for a storage system in a situation in which there are multiple locations and developed a greedy seed batching (GS) algorithm to solve the order batching problem. Rasmi et al. [
25] studied the problem of wave picking systems under the mixed-shelves storage strategy and proposed a method for order batching, deploying an approach similar to 
k-means clustering. Both of the order batching algorithms mentioned above use picking distances to cluster orders, so they can only be applied if the storage locations of each product in the warehouse are known.
Although there are few studies on order batching problems and solution algorithms under the scattered storage mode, algorithms for the order batching problem under the traditional storage mode have been widely studied. Examples include the seed batching algorithm [19,27,28,29,30], genetic algorithms [31,32], and variable neighborhood search [33,34,35]. The seed batching algorithm, one of the most common order batching algorithms, has been shown to significantly improve order picking efficiency [28,29]. Moreover, the seed-order selection rule, the accompanying-order selection rule, and the storage location assignment strategy all have significant impacts on the performance of the seed batching algorithm [30]. Jiang et al. [
7] studied the storage location assignment problem considering the correlation between products under the scattered storage mode, so that products with a higher correlation are stored closer together. Under such storage location assignment schemes, if products with higher correlations are assigned to the same batch for picking, the length of the picking routes can be effectively reduced. Thus, fully considering the correlation between products in the process of order batching can effectively improve the batching quality. Based on this, we propose new seed batching algorithms that consider the correlation between products, including a new seed-order selection rule and a new accompanying-order selection rule, both based on the correlation between products. To further improve the quality of the batching results obtained by the seed batching algorithms, intelligent optimization algorithms can be applied. Tabu search (TS) is a common heuristic search algorithm that is often combined with other algorithms to solve combinatorial optimization problems. Lu et al. [
36] proposed a bilevel schedule risk control model using a hierarchical decision-making structure to optimize project scheduling and mitigate risks in IT outsourcing. They introduced a two-level tabu-predatory search algorithm, which integrates tabu search and predatory search techniques to navigate the solution space effectively, and validated the approach through numerical experiments. The TS algorithm is also widely used in order batching problems [
24,
37,
38]. Therefore, the TS algorithm will be used to improve the seed batching algorithms to obtain better batching results.
In addition to order batching, reasonable route planning is also crucial for enhancing the order picking efficiency of distribution centers. The walking time accounts for over 50% of the total order picking time in picker-to-parts order picking systems [
39]. Under the traditional storage mode, each type of product is stored in a continuous area and the storage location for each type of products is fixed. Thus, it is easy to find the shortest route to connect all the locations of the products to be picked [
40]. However, under the scattered storage mode, the same product may be stored in multiple different locations. Unlike route planning under the traditional storage mode, a location must first be selected from the multiple storage locations of each product to be picked before the route can be planned.
Currently, order picking algorithms under the scattered storage mode are usually executed in two layers [
24,
41,
42]. The first layer is the location selection algorithm, which is used to select the storage location for each product to be picked. The second layer is the routing algorithm, and it finds the shortest picking route to ensure that all storage locations selected in the first layer are visited. For the location selection in the first layer, Weidinger et al. [
41] proposed three different precedence rules to select the storage locations to be visited, combining storage location distances and inventory. Subsequently, Weidinger et al. [
42] further studied the routing problem at scattered storage warehouses with multiple depots, and the above three precedence rules were combined to select storage locations. Yang et al. [
24] developed seven heuristic algorithms for the location selection and carried out simulation experiments using them. The experimental results show that the performance of the minimum increment–local greedy algorithm is the best. For route planning in the second layer, since the storage location of each product to be picked has already been determined by the first-layer algorithm, the route planning under the scattered storage mode is no different from that under the traditional storage mode. Meta-heuristic algorithms are very popular in dealing with various route planning problems, such as genetic algorithms [
40,
43], ant colony algorithms [
44,
45], and so on.
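The two-layer structure described above can be illustrated with a minimal Python sketch. Both rules used here are deliberately simple stand-ins chosen for illustration: the first layer picks, for each product, the stocked location closest to the depot (this is not one of the precedence rules in [41] or the heuristics in [24]), and the second layer builds a route with a nearest-neighbor heuristic instead of a metaheuristic. The one-dimensional layout, the `stock` mapping, and all function names are hypothetical.

```python
def select_locations(products, stock, dist, depot=0):
    """Layer 1: for each product, choose the stocked location nearest the depot."""
    return [min(stock[p], key=lambda loc: (dist[depot][loc], loc)) for p in products]


def nearest_neighbor_route(locations, dist, depot=0):
    """Layer 2: visit the selected locations greedily, then return to the depot."""
    route, current = [depot], depot
    remaining = set(locations)
    while remaining:
        nxt = min(remaining, key=lambda loc: (dist[current][loc], loc))
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    route.append(depot)
    length = sum(dist[a][b] for a, b in zip(route, route[1:]))
    return route, length


# Hypothetical layout: depot 0 and locations 1..4 on a line, distance |a - b|.
dist = [[abs(a - b) for b in range(5)] for a in range(5)]
# Hypothetical scattered stock: product -> locations that hold it.
stock = {"A": [1, 4], "B": [3], "C": [2, 4]}

selected = select_locations(["A", "B", "C"], stock, dist)
route, length = nearest_neighbor_route(selected, dist)
```

Separating the two functions mirrors the layered design: once the first layer has fixed one location per product, the second layer faces an ordinary routing problem, as in the traditional storage mode.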
In summary, this paper studies the order batching and picking problems in distribution centers under the scattered storage mode and designs effective order batching and picking algorithms to improve the operational efficiency of distribution centers. Unlike the order batching problem under the traditional storage mode, the new order batching problem studied here improves the batching quality by considering the correlation between products. To solve large-scale problems, we propose two new seed batching algorithms based on the correlation between products. Tabu search (TS) is then used to improve these two algorithms. Finally, an improved two-stage order picking algorithm is proposed to evaluate the batching results obtained by different batching algorithms, including the four batching algorithms proposed in this paper and two existing algorithms [
24,
27]. The processes of order batching and order picking under the scattered storage mode are shown in 
Figure 1. Within a certain period, multiple orders arrive at the distribution center and wait to be picked. Batch picking is often employed to deal with these orders efficiently. First, the orders are grouped into several batches according to certain clustering rules. Subsequently, the specific products to be picked for each batch can be further determined. After the order batching is completed, the orders in each batch need to be picked. As mentioned above, order picking under the scattered storage mode can be divided into two layers: location selection and route planning. First, select suitable locations for each product to be picked. Then, find the shortest route to ensure that each selected location can be visited.
The warehouse layout and the definition of storage distances in this article are similar to those defined by Jiang et al. [
7], as shown in 
Figure 1c. The number in the upper right corner of each square represents the index of the storage location, and the number in the center of each square represents the product stored at the location. Since order picking will be conducted in this paper, the position of the picking depot is shown here. The zero in 
Figure 1c indicates the index of the depot; the distance between any two adjacent storage locations in the same aisle is one (such as 1–2); the distance between opposing storage locations on either side of the same aisle is one (such as 4–5); the distance between adjacent storage locations in different aisles is two (such as 8–9); and the distance between the depot and each aisle is equal to the distance between the depot and the storage location with the smallest index in that aisle (such as 0–1 and 0–9). Storage locations in the same color hold the same product. For instance, locations 1, 5, and 9 are marked green, and product 1 is stored in these locations. 
Figure 1a provides information on seven arriving orders within a certain period, such as Order 1 containing products 1, 2, and 5; 
Figure 1b shows the batching results of these seven orders. For example, Order1 and Order2 are assigned to Batch1, and Batch1 contains products 1, 2, 4, and 5; 
Figure 1c illustrates the order picking process of Batch1. The storage locations (1, 3, 6, and 2) shaded in green represent the locations to be visited when picking products 1, 2, 4, and 5 of Batch1; the line with arrows represents the picking route of Batch1, which starts from the depot, sequentially passes through storage positions 1, 2, 3, and 6, and returns to the depot.
The remainder of this paper is organized as follows. Section 2 gives a more detailed description of the order batching problem considering the correlation between products under the scattered storage mode and formulates it as a 0–1 integer programming model that maximizes the sum of pair-to-pair order correlations in all batches. In Section 3, two seed batching algorithms considering the correlation between products are proposed, and a tabu search (TS) algorithm is then used to further improve them. An improved two-stage order picking algorithm is proposed in Section 4. A new seed batching algorithm for situations in which the storage locations of products are known is presented in Section 5. Section 6 gives the results of the numerical experiments. Finally, the conclusions are presented in Section 7.
5. A New Seed Batching Algorithm for Situations in Which the Storage Locations of Products Are Known
In 
Section 3, different batching algorithms based on the correlation between products are proposed, which can obtain batch results quickly without accurate information on the storage locations of each product. These algorithms have a wide range of application scenarios and can adapt to more complex picking situations. However, if the storage locations of each product are known at the time of order batching, we can use specific storage information to obtain more accurate batching results. For example, the greedy seed (GS) batching algorithm [
24] is a common batching algorithm that forms order batches according to the picking distance increment when the storage locations of each product are known. In this algorithm, the picking results are fed back into the order batching process, with the goal of reducing the total picking distance. Batches are formed based on the similarity of order picking routes: the picking distance increment of each order is calculated, and orders with smaller increments are assigned to the same batch so as to minimize the order picking distance. However, under the scattered storage mode, using picking distance increments to measure order similarity requires repeating the two processes of storage location selection and route planning many times. As a result, the computational burden is heavy and the computation time is long.
Based on this, a new seed batching algorithm is proposed, in which the information from the first stage of order picking (storage location selection) will be fed back into the order batching process. The similarity degree of picking locations is used to divide order batches in this paper. Compared with GS, the computation time can be significantly reduced. For the convenience of description and computation, picking nodes are defined as the blue nodes in 
Figure 6. The uncolored circles indicate the storage locations in the warehouse, and the blue circles indicate the picking nodes corresponding to the storage locations. As can be seen from 
Figure 6, in the warehouse layout, each pair of rows of storage locations shares a picking aisle. As a result, the storage locations on either side of the same aisle share the same picking node. For example, location 1 and location 20 share picking node 1. Based on the above definition, the steps of the new seed batching algorithm are as follows. First, the location selection algorithm MIG* is employed to determine the optimal picking location set for each order. Then, the corresponding picking node set of each order is obtained from its set of picking locations. Finally, a seed algorithm is applied for order batching. The algorithm is named the node-based seed (NBS) batching algorithm. The proportion of the same picking nodes between the seed order and order i (i ∈ O) is denoted Pni and is used to measure the order similarity. Formula (11) shows how to calculate Pni, in which Ni represents the number of picking nodes of order i, and Nα represents the number of picking nodes of seed order α.
Seed-order selection rule: Select the order with the largest number of picking nodes as the seed order in the given order pool.
Accompanying-order selection rule: Select the order i with the largest Pni in the order pool as the accompanying order.
The specific steps of the NBS batching algorithm are essentially consistent with those of the CBS batching algorithm in 
Section 3.1.
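The two selection rules above can be sketched in Python as follows. This is only an illustration under several assumptions. Formula (11) is not reproduced in this section, so the similarity used here (shared picking nodes divided by the size of the union of both node sets) is an assumed stand-in for Pni. The batch limit `capacity` (a maximum number of orders per batch), the choice to keep the seed order's node set fixed while a batch is filled, the tie-breaking by smallest order index, and the names `node_similarity` and `nbs_batching` are all hypothetical. The picking node set of each order is taken as given, that is, as the output of the location selection step.

```python
def node_similarity(nodes_i, seed_nodes):
    """Assumed stand-in for Pni: shared picking nodes over the union of both sets."""
    union = nodes_i | seed_nodes
    return len(nodes_i & seed_nodes) / len(union) if union else 0.0


def nbs_batching(order_nodes, capacity):
    """Group orders into batches of at most `capacity` orders (hypothetical limit)."""
    pool = dict(order_nodes)
    batches = []
    while pool:
        # Seed-order rule: the order with the most picking nodes (ties: smallest index).
        seed = min(pool, key=lambda o: (-len(pool[o]), o))
        seed_nodes = pool.pop(seed)
        batch = [seed]
        while pool and len(batch) < capacity:
            # Accompanying-order rule: the order most similar to the seed order.
            best = min(pool, key=lambda o: (-node_similarity(pool[o], seed_nodes), o))
            batch.append(best)
            del pool[best]
        batches.append(batch)
    return batches


# Hypothetical picking node sets for five orders.
order_nodes = {
    1: {1, 2, 3},
    2: {2, 3},
    3: {7, 8, 9},
    4: {1, 2, 3, 4},
    5: {8, 9},
}
batches = nbs_batching(order_nodes, capacity=2)
```

Because the similarity depends only on set operations over picking nodes, no route has to be planned during batching, which is the source of the time savings relative to GS described earlier in this section.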