Multi-Robot Routing Problem with Min–Max Objective

: In this paper, we study the “Multi-Robot Routing problem” with min–max objective (MRR-MM) in detail. It involves the assignment of sequentially ordered tasks to robots such that the maximum cost of the slowest robot is minimized. The problem description, the different types of formulations, and the methods used across various research communities are discussed in this paper. We propose a new problem formulation by treating this problem as a permutation matrix. A comparative study is done between three methods: Stochastic simulated annealing, deterministic mean-ﬁeld annealing, and a heuristic-based graph search method. Each method is investigated in detail with several data sets (simulation and real-world), and the results are analysed and compared with respect to scalability, computational complexity, optimality, and its application to real-world scenarios. The paper shows that the heuristic method produces results very quickly with good scalability. However, the solution quality is sub-optimal. On the other hand, when optimal or near-optimal results are required with considerable computational resources, the simulated annealing method proves to be more efﬁcient. However, the results show that the optimal choice of algorithm depends on the dataset size and the available computational budget. The contribution of the paper is three-fold: We study the MRR-MM problem in detail across various research communities. This study also shows the lack of inter-research terminology that has led to different names for the same problem. Secondly, formulating the task allocation problem as a permutation matrix formulation has opened up new approaches to solve this problem. Thirdly, we applied our problem formulation to three different methods and conducted a detailed comparative study using real-world and simulation data.


Introduction
In Multi-Robot Systems (MRSs), a group of robots aim to perform one or more tasks together or individually. These systems are widely used in real world applications like warehouse automation [1], search and rescue [2], exploration [3], mapping and surveying [4], etc. The problem of assigning several robots to one or more tasks, provided that some specific constraints are met, is the Multi-Robot Task Allocation (MRTA) problem. The problem comes in different variants depending on the assumptions, constraints, and objective function. Each variant of the problem is an NP-hard problem and requires different approach to address. The difference depends on the assumptions of the problem, the objective function of the problem, and the constraints. The assumptions depends on the requirements of the problem and the resources available. The objective could be to minimize or maximize the distance or time or any cost associated with the task allocation. The objective function can be linear or non-linear or even with multiple objectives. The constraints of the problem depend on the nature of the robots, tasks, and the problem itself. Some examples of robot constraints are-carrying capacity of the robots (capacity constraints) [5] and the navigational space or locomotion of the robots like differential robots, drones, etc. (spatial constraints) [6]. Some examples of task/problem constraints are-a specific task cannot be done until another task is completed (precedence constraints) [7], tasks should be done on a specific schedule (scheduling problem) [8], some tasks require multiple robots (multi-robot tasks) [9], the robots should come back to their initial position after completing all the tasks (depot problem) [10], tasks should be done within a given time frame (timed tasks) [11], the number of robots required to do all the tasks should be minimized (minimum robot utilisation) [12], and robots should adapt to human intervention (human-robot collaboration) [13], etc.
There are many approaches for solving an MRTA problem. The brute force approach is exhaustive search through all possible solutions, which is highly time consuming and infeasible. Instead of exploring all of the search space, the Branch and Bound (and its variants) [14] can be used, which reduces the search space considerably. However, even with such methods, finding the minimum solution is intractable for smaller size problem sets and the computation time grows in factorial with the size of the problem. Meta-heuristic approaches adapted from the Operations Research (OR) community are mostly used to find approximate solutions to these problems. Some of the related works on MRTA using meta-heuristic approaches are: Evolutionary approaches like genetic algorithm [15], swarm algorithms like ant-colony optimization [16], particle swarm optimization [17]; stochastic approaches like simulated annealing [18], etc. The most common approach is to formulate as an integer programming problem and use commercial solvers like CPLEX, Gurobi, or any of the meta-heuristic approaches. Recent works on hybridization and parallelization of these approaches [19] have also produced good results [20]. Other approaches include behavior-based approach [9], market [21] and negotiation-based methods [11,22], etc. Moreover, the MRTA problems are also dealt by the Mathematics community as a combinatorial problem and they devise other approximation methods for the same.
In the last decade, the robotics community has extensively started to work on a specific MRTA problem which is quite useful to real-world applications. In this problem, a number of robots are required to visit a certain number of goals in random order for various applications like indoor warehouses [23], automated container terminals [24], search and rescue (coverage problems) [2], surveillance [4], etc. These problems are generally termed as Multi Robot Routing Problems (MRR). The general objective is to minimize the makespan of all the robots. However, this type of objective can yield solutions where one or more robots end up taking the major share of the load and other robots finish the task early. So, instead, a new objective is defined which is to minimize the maximum of the make-span of the slowest robot. Such an objective, termed as the "min-max objective", helps in reducing the load on a single robot and distributes it evenly among all robots. Thus, it reduces the breakdown of robots due to unbalanced task distribution. In our paper, this MRTA problem termed the MRR problem with min-max objective (MRR-MM) is studied in detail.
This paper is organised as follows. A detailed definition and description of the problem and the different approaches across various research communities is discussed in Section 2. An in-depth study on the literature review is detailed in Section 3. The permutation matrix approach is explained in Section 4 and the different methods used are briefly discussed in Section 5. Section 6 enumerates the results from the comparative study of the three main methods with different types of data sets using simulations and experiments. Section 7 gives a comprehensive discussion on the permutation matrix approach, and concluding remarks are given in Section 8.
The contribution of this paper is threefold. The paper provides a detailed study of the MRR-MM problem. It highlights the importance of the problem formulation and the different ways of approaching it across various research communities. Secondly, formulating the task allocation problem as a matrix formulation has opened up new ways to approach this problem. Thirdly, the paper provides a detailed comparative study of three different methods for solving this MRR problem with min-max objective using simulations and real-world experiments.

Problem Definition
The MRR-MM problem is defined as: To allocate a set of ordered tasks to each robot so as to minimize the maximum cost incurred by any one of the robots to sequentially execute its allocated group of ordered tasks. Our problem is illustrated in Figure 1 for a case of 3 robots and 4 tasks. The blue lines indicate the utility costs of the robots to each of the tasks. The red dashed lines are the utility costs between the tasks. Based on the utility costs, the tasks are allocated to each of the three robots in a such a way that this set and sequence of tasks minimizes the highest cost incurred by any one of the robot.
The problem can be described as: Given: • M is the number of robots • N is the number of tasks (N ≥ M) • Utility cost for doing each task • Utility costs between robots-robots, tasks-tasks, and robots-tasks The MRR with min-max objective is described pictorially at the top, where the blue lines are utility costs between robots-tasks and tasks-robots and the red dashed lines are utility costs between tasks. The expected solution is the assignment of tasks in a sequential order such that the maximum cost incurred by any one of the robots is minimized.
Objective: Minimize the maximum cost incurred by any one of the robots (when ordering the tasks in sequence and assigned to a robot) such that, Constraints: • Each robot does at least one task. • Each robot can have different start locations (multiple depots). • The robots are not required to return to their starting locations (depots) after the task completion. • The utility costs are asymmetric (the cost between two points A → B is not the same as between B → A). • The utility costs need not obey triangle inequality.

Problem Name
The MRR-MM problem has over the years been given different names by different researchers. For example, Parker [25], who was the first to address this problem, named it the Alliance Efficiency Problem (AEP), and Lagoudakis [21] named it the Multi-Robot Routing problem. Most recently, Turpin [26] named it the Minimum Maximum Multiple Depot Vehicle Routing Problem (MMMDVRP). The problem can also be called as Min-Max Multiple Depot multiple Asymmetric Traveling Salesman Problem (MMMD-mATSP), or Asymmetric Bottleneck multiple Traveling Salesman Problem (ABmTSP) [27]. However, throughout this paper, the term "Multi-Robot Routing problem with min-max objective" (MRR-MM) will be used for this problem.

Assumptions
In practical case, the number of tasks N should be much larger than the number of robots M. The task allocation is done based on two costs. First, the "utility cost", which can be considered as the path length or the time taken by the robot (transit cost) to move between two consecutive tasks. Secondly, the "utility cost" for doing a single task. This utility cost is very important for robots involved in warehouse or mobile manipulator robots and hence it is considered here. All robots have identical capabilities and can do any of the available tasks. The problem allows asymmetric utility costs. So, the utility costs need not obey the triangle inequality. The transit costs between tasks depend on the order. The starting points of the robots may or may not be different for the robots (multiple depots). It is also not required for a robot to return to its start position after completing the tasks (open tours). It should be noted that the problem does not focus on the scheduling of tasks to each robot, i.e., the time at which the task is done. Instead, it focuses on the sequential ordering of tasks and their assignment to each robot based on the utility costs between the tasks. The temporal constraint as waiting times, arrival times, etc. can also be added to the utility costs but the problem formulation does not deal with the time schedule for the robots.
In case of real-world robotics scenario, we consider the environment similar to that of an automated warehouse. The communication bandwidth issues or map uncertainties are ignored and, hence, it is assumed that all information is available to all robots at all times. The acceleration-deceleration of the robots are also not considered in our study assuming the robots move with a constant velocity. This work's main focus is on finding initial optimal paths for all the robots. The optimal paths are subjected to change when the robots' paths conflict with each other causing collision (this work is a part of an upcoming paper). In our paper, we assume the robots follow traffic rules [28] or prioritization [29] to avoid this collision. The number of robots and tasks used in the experiments are taken from a typical warehouse data [30].

Problem Classification
There are several taxonomies to classify variants of MRTA problems [31][32][33]. The taxonomies do not contradict each other, but provide additional information on the problem category. According to Gerkey [31,34], MRR-MM problem can be classified as Single robot Tasks-Single task Robots-Time-extended Assignment (ST-SR-TA) problem. Though most ST-SR-TA problems can be treated as Single robot Tasks-Single task Robots-Instantaneous Assignment (ST-SR-IA) problems with a time-extended schedule, our problem cannot be treated as one. The MRR-MM problem is harder than ST-SR-TA as the objective deals with allocating a group of tasks sequentially to the robots. However, our problem can also be treated as an unweighted scheduling problem at every time instant according to Brucker's terminology [35].
According to Korsah et al.'s iTax taxonomy [32], our problem can be classified as In-Schedule Dependencies ID (ST-SR-TA). This is because the robot-task assignment depends on other robots and tasks (intra-schedule dependencies) when the main objective of the problem is to minimize the maximum of the slowest robot of all the given robots.
By Nunes et al.'s taxonomy [33], this problem can be considered under ST-SR-TA:TW-Time Windows with soft temporal constraints and deterministic allocations. There is no time window constraint in this problem. However, it consists of the other two sub-problems in this category: Assignment and ordering. Moreover, when the precedence constraints are enforced in this problem, it may be categorized under ST-SR-TA:SP-Synchronization and Precedence with soft temporal constraints and deterministic allocations.

Computational Complexity
Even a simplified version of the MRR-MM problem is NP-hard. This was proved by Parker [36], who compared it to the famous Partition problem [37], which is NP-complete, and showed that it is a special case of this problem. Hence, this problem is NP-hard with the number of tasks; an optimal solution cannot be found in a reasonable time when the problem size is large and heuristic methods are required to find approximate solutions for this problem. So, Arkin et al. [38] developed a 7-approximation algorithm for this problem.

Related Work
In this section we review the "MRR-MM problem", as dealt with by different research communities and more specifically the approaches by robotics community. The works listed below are also closely related to our problem but also differ from it in some factors, like symmetric costs or single depot (one start/end point for all the robots or the robots are are required to come back to this point), or the objective to minimize the sum of the total costs (not min-max), or some additional constraints like neighborhoods visitation or obeying Dubin's path.

Research Communities
The MRR-MM problem is very important and an appealing problem to a huge number of researchers across different research communities and domains. Due to a lack of inter-disciplinary research terminology, researchers have used names in-line with their community vocabulary.

Operations Research
For example, some OR researchers have dealt with the variants of this problem under the umbrella terms as Multiple Depot Multiple Traveling Salesman Problem [39], Multi-Vehicle Routing Problem [40], Pick Up and Delivery Problem with open depots [41], Min-Max Multi-Depot Vehicle Routing Problem [27], Multiple Traveling Salesman Problem with min-max [10,42], etc. These problems differ with ours in terms of special constraints adapted specifically for its application. The multiple Asymmetric Traveling Salesman Problem (m-ATSP) with min-max objective [27] is similar to our problem but allows multiple depots (equally many depots as salesmen), and "nomadic" robots (i.e., the robots can end in other places than they started). Other OR researchers use a generic name and do not explicitly mention the unique constraints, using terms as Scheduling of multiple trucks [24,43], Nurse Location problem [44], Communication/Location routing problem [45], Vehicle routing problem [46], Orienteering problem [47], etc.

Mathematics
The Mathematics community has contributed with fast and better approximation algorithms for NP-hard and NP-complete problems of this type [48,49]. They have also worked on our problem and its variants but use different names for it. Combinatorial problems like the classical set cover problem [50], the N path cover problem [38], Generalized Assignment Problem [51], Shortest Common Super String Problem [52], Dial-a-Ride problem [53], etc., are some examples that have been widely explored. Arkin et al. [38] devised a number of approximation algorithms for different variants of the vehicle routing problems. One such problem variant was the min-max N path problem, which is similar to this problem, with the difference that the robots do not have multiple start locations. The proposed approximation algorithm had a sub-optimality bound, which guarantees that the solution will not be more than four times the optimal solution.
3.1.3. Robotics/Artificial Intelligence MRR problems have been widely researched by the robotics community in the last decade. The terms "scheduling", "assignment", and "allocation" are used interchangeably. "Assignment" is the term used for assigning robots to tasks (usually a one-to-one mapping or linear assignment). "Allocation" is the term used for grouping tasks together, and then assign them to one or more robots (one-to-many assignment). "Scheduling" considers the time instant at which the task allocation or assignment should be done by each robot. This leads to many name variants. Another reason is that the AI community addresses this problem from the Multi-Agent perspective and the robotics community approaches this problem from the Multi-Robot perspective. For example, Multi-robot (task) Assignment Problem for Grouped Tasks (MAP-GT) [54], Target Assignment and Path Finding (TAPF) [23], and Multi-Vehicle Multi-Goal Planning [55]. All these problems include assignment or allocation of tasks/goals to robots. Similarly, the Set-cover problems are termed as coverage planning problem of drones/robots and also termed as surveillance planning problems. For example, Faigl et al. [55] treated our problem as a Multi-Vehicle Dubin's Traveling Salesman Problem with Neighborhoods. The authors studied the MRR-MM problem with two additional requirements: One to ensure that the task assignment respects Dubin's curvature and one that allows the robots to visit a small neighbourhood around the tasks/goals instead of a specific task location. Another related work is by Lagoudakis et al. [21] who studied three different objectives, "min-max", "min-sum", and "min-avg", and solve the problem using an auction algorithm. The main difference from our problem is that Lagoudakis et al. [21] require symmetric utility costs that obey the triangle inequality between robots and tasks. Other methods try to reduce or transform the problem into a single TSP and apply the existing exact TSP-algorithms for the same and then combine the solutions [10].

Approaches
In this section, we review all the problem formulations and methodologies that have been used for solving this particular problem.

Behavior-Based Models
In 1995, Parker [25] suggested the MRTA distributed architecture L-ALLIANCE, where robots learn from their experiences to schedule and estimate their utility cost. The method focused on fault-tolerance and adaptability rather than optimality or computational resources, since the major goal was a fault tolerant system that could work with fallible robots with uncertain sensors in dynamic environments. Parker designed a fully distributed, behavior-based architecture and tried to solve this problem by giving each of the robot capabilities to be fully autonomous and work in cooperation even if one or more robots failed in the team. Each robot had a behavioral set and the use of motivational behavior mechanism helped a robot choose tasks and accordingly adapt the environment and other robots. Later, a learning mechanism was implemented for this architecture [36], whereby the robots could learn from experiences. It was claimed that if "L-ALLIANCE" was well-trained, it could outperform the greedy approach but it was not guaranteed to find the optimal solution.

Combinatorial Optimization
As mentioned before, Faigl et al. [55] treated this as a multiple Dubin's Traveling Salesman Problem with Neighbourhoods (m-DTSPN). The problem was specifically designed for the [56] MBZIR competition, where three UAVs (Unmanned Aerial Vehicles) had to visit 22 targets so that the cost of the slowest UAV was minimized. The UAVs generally require smooth trajectories so that the low-level trajectory controller can follow the path, and so the Dubin's model was introduced. Moreover, the UAVs were required to only reach the neighbourhood of the targets to capture the target, so a Variable Neighbourhood Search method along with Self-Organizing Maps were used. The objective was to quickly get satisfiable solutions and did not, therefore, focus on the computationally demanding optimal solutions. A simple implementation of this method for our problem case is to introduce "k-means clustering" of the group of tasks and use a TSP solver for each cluster.

Auction-Bidding Algorithm
The auction-based coordination method by Lagoudakis et al. [21] is a system where all the robots are bidders and the targets/goals are goods. In each round of bidding, the robots exchange their bids with each other and each one of them compute the lowest bidder locally and in parallel. By doing so, there is a decentralized task allocation for the robots. Moreover, the selection of bids and the cost computation determine the quality of the solution. The algorithm then used insertion heuristics for each local group to quickly find solutions, which also meant non-optimal solutions. The method requires symmetric costs between robots and tasks, which cuts the search space in half. The paper presents the theoretical analysis of the auction method and its application to small size problems but no real-world experiments were conducted.

Graph Theory
In this formulation, an undirected graph G = (V, E) is constructed (either in a stochastic or deterministic manner), where the robots and tasks are vertices on the graph and the edges are the utility costs connecting these vertices. This becomes a multi-digraph (or directed multi-graph) with asymmetric costs. The problem is considered to be solved if one can find M discrete edges (one for each robot), starting with the robot vertices and connecting some tasks' vertices in sequential order such that the cost of the slowest robot edge is minimized. So, there are M N possible assignments, and for each assignment the minimum cost Hamiltonian path with N! permutations needs to be found. Turpin et al. [26] approached this problem using a very fast approximation algorithm that produced nearoptimal and complete solutions. The algorithm first found the minimum spanning forest for all the task vertices and sought to find an approximate Hamiltonian path, which was then divided into M sequence paths. A set of refinements and heuristics were applied to reorder the tasks as well as to make sure the cost was not more than five times the optimal solution obtained from [38]. This is the only work that directly approaches the MRR-MM problem as us and so we will use it for comparison in Section 6.
As seen from this section, the same problem has been formulated differently by various research communities and also been studied differently. However, there is a lack of common terminology between these communities and the problem is given different names. This indicates that any NP-hard problem should be studied across different research communities and a concise interdisciplinary terminology is needed to study and track the literature, which is part of an upcoming paper. We now introduce our unique problem formulation of the MRR-MM problem in the next section.

Permutation Matrix Approach
The Hungarian algorithm [57] is the fastest method for finding optimal solutions in a linear assignment (one-to-one) problem [58]. The algorithm seeks to find the assignment of tasks to the robots such that: (a) The total cost of the assignment is minimized, (b) each task is assigned to one robot and (c) each robot is assigned to one task. The algorithm expresses the problem in the form of a matrix, say, a non-negative N × N cost matrix C (the matrix element C ij is the cost of assigning j to i). The minimum cost is then found by permuting the rows and columns so that the trace of C is minimized.
where X and Y are permutation matrices. The permutation matrix approach for the MRR problem can be viewed as a generalization of these ideas to our problem, i.e., when the assignment becomes one-to-many, and the ordering of many tasks is also considered (as in case of our problem), how can the matrix be formulated for this type of problem?
Given, M robots and N tasks with a utility cost matrix ∆ (cost between tasks) and task vector T (cost for doing a task), we define a solution matrix S, of size (2M + N) × (2M + N), with binary variables: For an example with two robots (A and B) and three tasks (a, b, and c), the complete set of tasks is {SA, SB, a, b, c, EA, EB}, where S and E followed by the robot letter are the start and end positions of the corresponding robot respectively. The matrix S is then of size 7 × 7: The element s a,b is read as "the task a is followed by task b". In this matrix, there are many undesired transfers and first order loops (doing the same tasks again i.e., the diagonal elements) that can be avoided by fixing their corresponding matrix elements to zero, and the effective transfer assignment matrix looks like this: This effective matrix can have 120 permutation matrices (the permutation matrices are the extreme points of the set of doubly stochastic matrices). Which is the solution space of our problem. For example, one solution is, This corresponds to SA → b → EA (first robot task allocation) and SB → a → c → EB (second robot task allocation), which is read as "the robot A goes from start position, does task b and then goes to end position A", and "the robot B goes from start position, does task a followed by task c, and then goes to end position B". Our problem is now encoded in a matrix form and the search space is the list of all permutation matrices of the given size (for our example, it is 120). Then, the optimal solution to our problem is the valid permutation matrix with the minimum cost. Valid matrices do not have any loops. Loops route the robots to an already done task. In our example above, with a 5 × 5 sub-matrix, there are 120 permutation matrices, of which 24 are valid matrices without any loops. The validity of a solution matrix can be found by calculating the propagator [59] of the permutation matrix. Minimum cost matrix is the permutation cost with the minimum transport cost. The cost of the solution matrix can be found by using the values of the solution matrix applied on the utility cost matrix ∆ and task vector T, similar to the manner used in the Hungarian algorithm [57]. Or, we can also compute the transport cost using a design variable (explained later).
One can use any search method to find this "valid and minimum cost" permutation matrix, i.e., the optimal solution matrix. In our paper, we use two methods-Deterministic Annealing (DA) method [60] and Simulated Annealing (SA) [61] to search for this solution matrix. We later compare the results with the heuristic method in Section 6.

Methods
In this section, we briefly explain the three methods-SA, DA and Heuristic used to solve this problem before conducting experiments.

Simulated Annealing
SA is an iterative method that begins the search with a random valid solution, say S. This solution is then compared with another random solution S new in the neighbourhood and the system probabilistically decides about moving between the two solutions. This step is repeated until a given computation budget is exhausted. The SA method gives optimal solutions within few seconds for small problem sizes (n lesser than 20). However, for n > 10, the solution quality converges to global minima only as the time goes to infinity, i.e., asymptotically optimal. If the computational time is higher, the solutions get better.

Deterministic Annealing
Deterministic annealing [60] is based on applying the mean-field concept on the "Potts-spin model" with an annealing period [62]. To begin with, a Potts-spin doubly stochastic matrix V (size same as S) is initialized with small random values. Every element of the matrix is then updated according to an objective function termed as the local field U local . This local field consists of: U cost term which enforces a minimum make-span, the U loop term which imposes a penalty for solutions with loops, and the U stable term which dampens oscillations in the system. After every update of the V matrix, the matrix is normalized using the Sinkhorn normalisation [63] to maintain the matrix to be doubly stochastic. Depending upon the computational budget and the parameters of the objective function, the system eventually converges to a global minimum. Every element in the V matrix is updated according to the equation where, kT is the annealing temperature and the local field U local ij are the values of the matrix, where, P is the Propagator matrix given by, L max is the maximum value of the L vector given by and R max is the maximum value of the R vector given by, γ and η are the weighing factors for the loop and stabilisation functions respectively (the details of this method is given in an upcoming paper).

Heuristic Graph Search
A simpler method, inspired by Turpin et al.'s work, is devised for this work. In a graph G, the robot vertices are ignored and the shortest path connecting all the task vertices (shortest Hamiltonian path) is found. The problem of finding whether a given graph has a Hamiltonian path or not is a NP-hard problem. In case of a non-existent edge between certain tasks, then an infinite cost value can be assigned between those tasks. The process is also called edge evaluation. The problem of finding the shortest Hamiltonian path is similar to the TSP. There are many algorithms available for solving instances of tens and thousands of vertices in a TSP and hence, this can be computed. After obtaining the shortest Hamiltonian path (connecting the tasks), it is then cut into M number of chunks, based on the average chunk length or the largest edge costs present in this path. Now, the problem is reduced to a simple linear assignment problem between the M robot vertices and M chunks (sequences of tasks). Each sequence chunk's first and the last task vertex is assigned to one of the robots by using the Hungarian algorithm. The solution quality highly depends on how the overall Hamiltonian path is cut in the right places. Turpin et al. [26] in their paper did a number of pruning steps to get the right "cut" for obtaining very fast and good quality solutions.

Datasets
In this section, we solve the MRR-MM problem using the three different methods and study their results. However, before setting up the experiments, the data for the utility cost matrix ∆ and the T task vector that will be used for the experiments must be generated. We create four different datasets for the ∆ matrix. The T vector is generally assumed to be 1, i.e., the cost for doing a task is always 1 (for all tasks). Data are generated for varying problem sizes, from (M = 5, N = 5) to (M = 10, N = 50).
In dataset 1, random values for ∆ ij are drawn from a uniform distribution. The dataset 2 consists of random positions (x, y) that are considered to be the pose of robots and tasks. The Euclidean distances between these poses form the ∆ matrix for the dataset 2 which is a normal distribution. For datasets 3 and 4, we make use of 12 occupancy grid maps. Ignoring the outliers, the Figure 2 shows the distributions of the utility cost values in the four datasets. Experiments were conducted on the four different datasets to solve the MRR-MM problem using SA, DA and the heuristic search methods. All the three algorithms (https://github.com/jenniferdavid/taskallocation/tree/master/potts, accessed on 9 November 2021) were implemented using C++ with Eigen library and Matlab running on Ubuntu 16.04 with octa-core Intel i7 processor and 15.6 GB memory. Figure 3 shows the maps used for dataset 3 and 4 that are generated using Gmapping [64] from ROS packages, simulated worlds and in real world. For dataset 3, the utility costs ∆ ij are the spatial Euclidean distances between the robots and tasks (similar to dataset 2) but on the given static map. In each of these maps, random obstacle-free positions (x, y, θ) are chosen as the pose of tasks and robots. Obstacles in the map are ignored and each map has a different resolution. This dataset is a normal distribution and this simulation is only done for task allocation purpose as the robots' motion cannot be executed due to the presence of obstacles in the map. The dataset 4 considers the utility costs to be the actual path length cost between the robots and tasks in the same set of 12 occupancy grid maps as used for the third dataset. Random positions (x, y, θ) are chosen as the pose of tasks and robots and the path length between them is computed using an A* path planner [65]. This dataset is also a normal distribution. After the task allocation is calculated, the robots execute the motion in the given map environment, navigating with obstacles.

Scalability Comparison-Optimality and Computing Time
A comparison study was conducted on the scalability of these approaches in terms of best cost and computation time for each of the four datasets. Three experiments were conducted for different numbers of robots and tasks. In each experiment, the number of robots (  The results on dataset 1, with uniformly distributed values in the ∆ matrix, are shown in Figure 4. From the figure, it can be observed that the SA at 0.999 yields very good solutions and SA at 0.9 gave poor results. This is because of the longer annealing time which gives enough time for better solution convergence. However, the computation time is large for SA at 0.999 compared to SA at 0.9 due to the long running time. As expected, the heuristic method has the shortest computation time but the solution costs are not satisfactory. Similar results are obtained for dataset 2 as well which is shown in Figure 5 for both SA and heuristic. The DA method is able to produce comparatively good quality solutions in a considerable amount of time for dataset 1. However, for dataset 2, the solution quality of DA method deteriorates and performs poorer than SA at 0.99. The only difference between both the datasets is that the dataset 2 is generated from a normal distribution compared to the dataset 1 generated from uniform distribution. Experiments conducted for a size of M = 10 and N = 50 for all the four datasets is shown in Figure 6. In this case, SA is run at 0.9, 0.99 and 0.999 and DA at 0.9 and 0.99 respectively. These plots clearly show that the heuristic method is faster and produces good quality solutions in most of the cases (not all). The SA method produces good quality solutions at 0.999 but with higher computing time. The performance of DA method is good as compared to other methods but only with dataset 1. For the other datasets, the solution quality is poor for the DA method. In this case, SA at 0.99 is a better choice that has a proper trade off between both the parameters. These results show that the optimal choice of algorithm depends on the dataset size and the computational resources.

Motion Execution
For dataset 4 the ∆ values are the actual path lengths computed from the occupancy grid maps. In this case, the robots' motions can be executed to visualize the assignment and ordering of different tasks with respect to the robots. The simulation was conducted on differential-drive robots on a 2.5 simulator Stage (https://youtu.be/gXSRuJjHJDM, accessed on 9 November 2021) and ROS. One such visualization is shown in Figure 7 for M = 6 and N = 7. The robots are shown in blue, the tasks are marked orange, and the final drop off points are green. A "quick and simple" routing solution is to group each tasks to the nearest robot. This is shown in Figure 7a and gives a cost of 63. However, in this case, though the heuristic method produces faster solution cost of 80.3 in 5.2 s compared to other methods, the DA gives a better solution of 59.78 in 20.2 s. As expected, SA at 0.999 gives the optimal solution of 55.97 at 1033.57 s. From this experiment, it could also be seen that the expected solution need not be the optimal solution always.

Algorithmic Efficiency and Completeness
It should also be noted that SA is an anytime algorithm. One can stop the algorithm anytime during the annealing and have a valid solution. In contrast, both the Heuristic method and DA require a time frame to converge to a solution. The quality of the SA solution highly depends on the annealing rate. Both DA and heuristic algorithms run to completion, i.e., the solutions are obtained only after the algorithm comes to an end.
The heuristic method finds the Hamiltonian path connecting all the tasks, which is an NP-hard problem. The DA method needs to perform a Sinkhorn normalisation to keep the updated matrices to be doubly stochastic which is the most time-consuming part of the algorithm. Moreover, it also needs an annealing period for the solutions to converge to the global minimum.
SA always begins with an existent solution. The annealing period plays a vital role to find whether a solution exists or not for both SA and DA. The heuristic method, however, always returns a non-existent solution if no possible solution exists.

Solution Space
The solution space of this NP-hard problem exhibits factorial growth with the number of robots M and the number of tasks N, as shown in Table 1.
This can be explained as follows. For any pair of N and M the total number of solution matrices is given by, The total number of valid solutions (matrices without loops) is, The number of matrices with loops is

Discussion
In this section, we discuss the permutation matrix formulation in detail and how to extend this formulation to other variants of the problem.

Distributed Variant
This problem formulation is a centralized task allocation method that focuses on the global optimal solution. It assumes that all the robots and tasks are available, ∆ and T are known at all times and there is no communication issues in the system. The heuristic method also has the same assumptions. However, the centralized approach is not feasible for practical purposes. Hence, a distributed variant of this problem is required for a number of reasons. In real world scenarios, due to uncertainties, it is not possible to have an accurate ∆ utility cost matrix between the robots and tasks, especially for big problem sizes. More than the global optimal solution, one has to focus on the computing time, replanning and communication issues in such cases. So, the distributed variant of the problem is constructed by breaking down the problem into smaller sub-problems. The robots and tasks are first grouped based on a heuristic or k-means clustering and then the ∆ matrix and T are updated accordingly. This variant is useful in a scenario when one of the robots breaks down or is not available for task execution after the allocation is completed. If the problem is divided into sub-problems, then it is easy to address the robot breakdown by solving that specific sub-problem again. The distributed variant will also be useful in a case, when all the resources are not required to be used, i.e., the minimum number of robots could be employed instead of using all the robots in hand. However, as with any approach, distributed variant leads to sub-optimal solutions.

The Delta ∆ Matrices
In case of scenario like the scheduling of trucks in a container terminal [24], there are many other variables involved apart from the utility cost. It could be the arrival time of the trucks/robots, the waiting times for the robots, etc. In such a case, a proper pre-processing of these values is required to create a realistic ∆ utility matrix that captures the essence of the problem. For example, in Equation (4), some of the unwanted task orders are set to zero in the assignment matrix S. This is typically enforcing constraints on our problem formulation. Another way of implementing these constraints would be to set high cost for such task orders in the ∆ utility matrix and so are ignored during the minimum cost search. This shows that we can also enforce other constraints on the problem depending upon the requirement.

Precedence Constraints
In our problem formulation, there is no specific precedence ordering constraints on the tasks, i.e., "task 'b' SHOULD be done after task 'a'". The solution is obtained by ordering any tasks after any other tasks. However, when we are given a predefined set of task precedence orders, this constraint can be implemented in the problem formulation, by setting the corresponding utility costs to large values. In case of our example, in a predefined task order case of 3 tasks, say a, b, c as in, "task 'a' should be done before task 'b' which should be done before task 'c'", the lower diagonal of the utility cost matrix ∆ can be set to a large value. This precedence constraints reduces the solution space by half. Hence, for this case, SA (0.9) outperformed DA up to n < 200 problem size. Again, for bigger problem sizes, DA provided good results in terms of time and final cost.

Identical Utility Cost
When all the task orders are given the same values (sA = a = b = sB = .... etc.), it violates the initial constraints that is set. For example, when sA = a is set, it requires to set a = sA which is a dummy task. Hence, this case cannot be fit into our problem formulation as it does not give a permutation matrix solution set.
On the other hand, when identical utility costs are given for all the robots (sA = sB = eA = eB), the DA method produces solutions in which all the robots are given equal weights for the start and end (special) tasks. This implies that when there are identical robots with uniform utility cost, any robot could be used for the start or end tasks due to the identical utility cost. If the robots are identical (i.e., same cost to start and to end) then there will be M! identical cost values (including the minimum value). For example, if M = 3 then there are 6 possible matrices with the minimum value. The maximum number of possibly unique matrices for this case, is,

Linear Assignment Problem
Linear assignment problem is widely used in MRTA systems [66]. When M = N, the problem is reduced to a linear assignment problem. However, the assignment in our problem formulation is based on a different objective which is to minimize the maximum of the slowest robot. Future work will involve on reducing the present objective function to a linear assignment objective that can expand the scope of this method to more problems and its performance with the classical Hungarian algorithm.

Capacity Constraints
In case of multi-task system, where two or more robots are required to do the same single task, or when a robot is capable of doing two tasks at the same time, the problem formulation will change considerably and this will be a part of the future work.

Conclusions
In this paper, different problem formulations for the MRTA problem termed as the Multi-Robot Routing problem with min-max objective is discussed in detail. The paper highlights the importance of problem formulation and the lack of terminology across different research communities. The paper also explains a new problem formulation in detail-the permutation matrix formulation for this problem which proves to be more efficient and fast. The paper also provides a comparative study between the three different approaches SA, DA and heuristics search method with extensive experiments on real-world and simulated datasets to solve this problem. The results show that the optimal choice of algorithm depends on the dataset size, and the computational resources available.