An Adaptive Epidemiology-Based Approach to Swarm Foraging with Dynamic Deadlines

Abstract: Swarm robotics is an emerging field that can offer efficient solutions to real-world problems with minimal cost. Despite recent developments in the field, however, it is still not sufficiently mature, and challenges clearly remain. The dynamic deadline problem is neglected in the literature


Introduction
Swarm robotics is an area of robotics that offers efficient solutions to complex real-world problems with minimal cost. One of the most popular swarm robotics tasks is swarm foraging, where swarms can be modeled for different foraging tasks, such as intrusion detection, search and rescue, the collection of hazardous materials, agricultural harvesting, and space exploration [1]. In swarm foraging, the robots collaboratively explore an environment and collect resources for transfer to a collection zone. This task was inspired by adaptive behavior observed in nature, as in insect colonies or bacteria. Foraging using a swarm of simple robots is important for improving performance while efficiently distributing the workload at a low cost [2]. It achieves effective task allocation by interacting with the environment to collaboratively perform complex tasks that are beyond the capability of one robot [3]. The robots should perform dynamic task allocation (DTA) to autonomously determine the search region. DTA partitions the robots into subgroups so that they can collaborate and perform the task together as quickly as possible.
addition to the distribution using the Friedman statistical significance test and visualized using box plots.
From practical experiments, it is possible to confirm that the result of using one collection zone is significantly worse than that of using multiple collection zones in terms of the expiration and foraging rates. Moreover, when comparing the DTA methods that use multiple collection zones, it is found that they have similar foraging rates but different expiry rates. From the flexibility experiments, it is concluded that hotspot DTA is best suited to the ED_Foraging approach. It produces consistent results for different map sizes, and it has an expiry rate that is either better than those of the other methods or similar to the best one in most setups. The same result is found when testing the scalability of the methods; that is, the results for the hotspot DTA method are consistent when the number of robots changes, while the other methods, especially ACO and centralized DTA, are highly inconsistent. Finally, in regard to robustness, it can be concluded that the hotspot DTA method is the least affected method. It aims to maintain an average behavior irrespective of the map size even though the resources are reduced.
The remainder of the paper is organized as follows: The related background is discussed in Section 2, and the related works are reviewed in Section 3. The proposed approach is described in detail in Section 4. The experimental results are discussed in Section 5, and the findings are discussed in Section 6. The conclusions and suggestions for future work are presented in Section 7.

Epidemiological Modeling
Epidemiological modeling plays an important role in shaping epidemic dynamics [13]. It can be used to model the transmission probability and, hence, predict the epidemic size, the chains of transmission throughout the population, and the likelihood of epidemic recurrence. Machine learning methods have been used to predict infection rates; however, these methods have not yet been successful in predicting infection hotspots and highlighting critical regions [15]. Statistical epidemic models, on the other hand, have been proposed and validated to model the spread of infectious diseases. Epidemic models, such as the SEIRD model [12], are well-established epidemiological modeling methods that can be used to understand the collected data and predict critical regions with different variables and dynamic environments.
The SEIRD model implicitly assumes mass interaction for an infinitely sized population such that the resources hosting the disease are mixed in the population and the members of the population have identical contact rates. In the SEIRD model, the population is split into five groups: susceptible, exposed, infected, recovered, and dead. In the model, S(t) corresponds to the number of susceptible individuals, who are initialized at time t but can become infected if they come in contact with an infected individual. E(t) corresponds to the number of exposed individuals, which are those who are infected but are not infectious yet. I(t) corresponds to the number of infected individuals, who have the disease at time t and may infect susceptible individuals. R(t) corresponds to the number of recovered individuals. Finally, D(t) corresponds to the number of dead individuals. The sum of all the groups should be equal to the population size, i.e., S + E + I + R + D = P, where P is the total number of individuals in the population.
In this model, natural death cases and births are neglected. Additionally, once an individual has recovered, it is assumed that they will be quarantined and, hence, will not be susceptible again. To predict the spread rate in a population, a parameter called R0 is used as an important indicator. R0 corresponds to the average reproduction number, which is the number of secondary infected individuals, who were infected by primary infected individuals [16]. R0 is a key quantity that predicts the evolution of an epidemic. Epidemiologically, a lower R0 is desirable because it means that the infection growth rate is smaller.
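The compartment structure described above can be illustrated with a minimal sketch. It assumes the common textbook form of the SEIRD equations (exposure rate β, incubation rate α, recovery rate γ, mortality rate η), integrates them with a simple forward-Euler step, and divides the contact term by the total population P as a simplification. The function names and the example parameter values are hypothetical, and the closed-form R0 = β / (γ + η) holds only for this basic formulation; none of this is taken from the cited works.

```python
def simulate_seird(population, exposed0, beta, alpha, gamma, eta, days, dt=0.1):
    """Forward-Euler integration of a textbook SEIRD model (illustrative only).

    Returns a list of (S, E, I, R, D) tuples, one per time step.
    """
    s, e, i, r, d = population - exposed0, float(exposed0), 0.0, 0.0, 0.0
    history = [(s, e, i, r, d)]
    steps = int(round(days / dt))
    for _ in range(steps):
        new_exposed = beta * s * i / population * dt  # S -> E
        new_infected = alpha * e * dt                 # E -> I
        new_recovered = gamma * i * dt                # I -> R
        new_dead = eta * i * dt                       # I -> D
        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_recovered - new_dead
        r += new_recovered
        d += new_dead
        history.append((s, e, i, r, d))
    return history


def basic_reproduction_number(beta, gamma, eta):
    """R0 for the basic SEIRD form above: infections caused per infectious period."""
    return beta / (gamma + eta)


if __name__ == "__main__":
    run = simulate_seird(1000, 10, beta=0.5, alpha=0.2, gamma=0.1, eta=0.025, days=50)
    print("final compartments:", run[-1])
    print("R0:", basic_reproduction_number(0.5, 0.1, 0.025))
```

Because births and natural deaths are neglected, every flow leaving one compartment enters another, so S + E + I + R + D stays equal to P at each step of the sketch.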

Swarm Foraging
Swarm robotics is a field of robotics in which a collection of robots collaborate with each other to complete a task [17]. It draws inspiration from animal societies and insect colonies, which have several advantages, such as flexibility, scalability, and fault tolerance [18]. The main characteristic of robots in this field is to be simple and resist communication failures; thus, they are more scalable than one complex robot. Hence, the cost of the robots is relatively low, and they can reallocate themselves.
One of the most popular swarm intelligence tasks is foraging. Foraging behavior can be observed in insect colonies such as ants and bees. In these colonies, the insects behave intelligently to select the richest source of food or the shortest path to a food source [18]. The foraging task can be divided into different subtasks, including exploring the environment, searching for food (resources), transporting food to a safe area called the nest, and navigating the environment while bringing back food. Hence, labor division (i.e., DTA) is necessary in foraging to allocate the robots and collect as many resources as possible.
DTA in swarm foraging involves dividing the task into interdependent subtasks and assigning the robots to perform them in parallel [3]. Since the main task in swarm foraging is to collect resources, the subtasks are regions of the foraging area, which are divided among the robots so that they can collect resources simultaneously. To improve the overall performance in the foraging task, the allocation should be dynamic and adaptive. DTA in foraging can be divided into three types: centralized, negotiation-based, and self-organized [18]. Centralized DTA requires a centralized coordinator that communicates with all the robots and allocates the tasks depending on the information it collects. In this type of allocation, availability and robustness are major concerns, and any communication issue with the central coordinator can cause a failure in the whole system, even when all the robots are working properly. Negotiation-based DTA allows all the robots to continuously communicate and negotiate which tasks to do. Even though this type of task allocation is very robust, it requires a high rate of communication among the robots, which can be difficult to achieve when the rescue area is sparse and the connections are limited. Finally, self-organized DTA allows each robot to make its decision autonomously based on the information it collects while having limited communication with its neighboring peers. Self-organized DTA is observed in nature, where peers autonomously make decisions with limited communication and no centralized control. This type of DTA is less prone to catastrophic failures than other types and is a better approach when the environment is changing rapidly.

Literature Review
DTA for swarm robotics has gained much attention in the last few years. In [10], a self-organized task allocation method was proposed to partition a task among multiple robots in a swarm with limited information about the environment. The proposed method does not require explicit communication among the robots. That is, each robot determines its foraging strategy and which subtask to perform based on its local estimate of the global execution time, which is updated after each task is completed. This method was tested in an ARGoS simulator, and it was found that the robots' decision-making improved, which led to overall improvements in performance. However, even though pheromone-based memory can be advantageous in foraging, it performs task allocation under ideal conditions, i.e., when all tasks are equally important and the robots have equal capabilities. Moreover, there can be a tradeoff between the improvement in performance and the scalability of the method. Chung et al. [19] also addressed the self-organized task allocation problem for swarms. They proposed a novel game-theoretical autonomous decision-making framework, which was tested in a dynamic environment. The main goal of their proposed framework was to achieve consensus among all agents regarding the best assignment for a current task. Even though the proposed framework was shown to perform well, it was not compared to existing methods. In [20], four different DTAs in swarm foraging were proposed. The first algorithm, Random-Choice, randomly selects the tasks and runs in a constant time. The second algorithm, Extreme-Comm, reduces the execution time but requires heavy communication among the robots. The third algorithm, Card-Dealer, sequentially assigns tasks to the robots to reduce communication but increases the execution time. The fourth algorithm, Tree-Recolor, combines Card-Dealer and Extreme-Comm to balance the execution time and communication.
The algorithms were tested on a group of 25 iRobot SwarmBots. In the case of Random-Choice, communication was not required, but it had a high failure rate with small- to medium-sized swarms. Extreme-Comm was fast and accurate but required intensive communication that was heavily interrupted. Card-Dealer minimized the need for heavy communication, but the execution time was long because of its sequential task assignment. Finally, in the Tree-Recolor algorithm, problems arose due to communication hardware limitations. A two-step scheme for DTA was proposed by Wei et al. [6] in swarm foraging. It first partitions the task and then autonomously allocates the subtasks to the robots. From the empirical test, it was found that the proposed task allocation approach outperformed the conventional evolutionary-based approach and was more effective.
A DTA method was proposed by Lee et al. [3] to perform self-organizing swarm foraging based on a response threshold model to allocate sequentially interdependent subtasks. The proposed model considered the demand of a given task in addition to the response threshold for the task; the demand decreases while the robot performs the task, and a robot with a higher response threshold is less likely to perform the task. This model regulates the proportion of robots with respect to the fraction of the task demand. Another threshold-based self-organized DTA method for swarm foraging was proposed in [21], to autonomously change the number of working robots. A sigmoid model was used to compute the response threshold, and then each robot could decide whether it should forage to reduce the traffic and congestion in a region. Lee and Kim [8] proposed a decentralized strategy to improve the response threshold model for DTA in swarm foraging. A task selection probability function was proposed for each robot to balance the task demands with respect to the working robots. Although the environment was unknown, each robot could use its local information in addition to the number of tasks assigned to its neighbors. The robots determined their tasks based on their collected information even if the environment changed. From the empirical study, it was found that the proposed system converged to a certain task distribution to reduce the number of changes in the tasks. However, to maintain this performance, the division of labor among the robots needed global information about the environment. This was overcome by introducing two history queues for each robot, which increased the memory demand.
Nature-inspired DTA methods have also been adopted in swarm foraging. In [22], a particle swarm optimization (PSO) DTA method was proposed for swarm foraging. In this method, decisions are made independently by each robot without the need for continuous communication to avoid connection limitations. The simulation results showed that the proposed algorithm could adapt to changes in the environment. However, it is not applicable when the swarm size is large because it compromises between time and optimality. An improved PSO-based approach called potential field-based PSO (IPPSO) was proposed in [23] to perform swarm foraging in an unknown environment. IPPSO was designed to save computation time; its applicability was empirically evaluated in various scenarios to show that the foraging system could complete a task in an unknown environment.
Jevtic et al. [24] proposed the distributed bees algorithm (DBA), which was inspired by the foraging behavior of bees, where there is decentralized decision-making to achieve a global objective. The algorithm was performed with three physical robots, and a simulation was carried out to test its scalability over a large number of swarms. The experimental results showed that the DBA was scalable to the number of targets and robots while being adaptable to the targets' distribution. However, the optimization parameters need to be continuously controlled to adjust the robot behavior, which is difficult to generalize and requires tuning.
Another nature-inspired algorithm is ACO, proposed by Lu [1], which aims to be scalable and efficient for tens to thousands of robots. Lu proposed two algorithms: the multiple-place foraging algorithm (MPFA) and the central-place foraging algorithm (CPFA) [25]. The MPFA was further improved by introducing dynamic depots as special robots that could carry multiple resources [26]. Another extended version based on bioinspired hierarchical branching was published in [27]. To ensure the delivery of resources to the collection zones, the natural phenomenon of delivering blood from the heart to isolated cells in the cardiovascular system was adopted. Through Lu's work, the performance of the MPFA was compared with that of the CPFA on physical and ARGoS simulated robots. It was confirmed that the MPFA produced better foraging rates and travel/search times as well as a lower number of collisions compared to the CPFA. However, neither method is sufficiently scalable, and both suffer from high collision rates. Moreover, ACO requires continuous parameter tuning, which trades off the search time with the traveling distance.
An ACO-based collective foraging method was proposed in [7], where robots coordinated in an unknown environment to collect resources. The robots simply used binary sensors with a minimal set of states to follow a reactive behavior mode. The proposed method managed to approximate the optimal allocation based on a quality-sensitive modulation of the pheromones. The proposed swarm robotics system was used as a tool for understanding biological systems. Another ACO-based method for swarm foraging was proposed in [28]. A dynamic wave expansion neural network (DWENN) was developed based on a neural network to model pheromone diffusion. Based on mathematical modeling, this method determines the key parameters of foraging, including the number of robots assigned to different tasks. Two empirical tests were conducted and showed that the results agreed with the proposed model. However, the tested cases were simplified to produce an exact model, while foraging behaviors are complex.
Song et al. [28] developed a neural network-based pheromone model for swarm foraging. The main purpose of their proposed method was to determine the key parameters for swarm foraging by updating the neural network output based on the robots' pheromones. The reported results showed that the proposed algorithm may face difficulties in large environments. If there are obstacles, the robot might stay in the obstacle-avoidance mode for a long period of time, which means that it is not scalable or adaptable to large or dynamic environments, and further improvements are needed. Nevertheless, bacteria-like behavior is another nature-inspired concept that has been adapted to DTA in swarm foraging. Kurdi et al. [29] proposed a bacteria-inspired heuristic for DTA in unmanned aerial vehicles (UAVs). The proposed approach was tested with three different heuristics and was found to maintain a steady execution time in different environmental contexts. However, the proposed method is problem-specific, and the tasks were assumed to be independent.
The literature discussed above assumes that tasks have equal priority, and deadlines are not considered. However, deadlines are sometimes incorporated into swarm foraging to reflect real-life problems. In this case, the performance of the system is related not only to how fast it can collect food but also to how much food is delivered on time. Thus, methods of swarm foraging have been improved to include temporal constraints. Khaluf et al. [30] developed a new swarm foraging task with soft deadlines and proposed a mathematical model that can be used to analyze the performance of swarm foraging. Their proposed model allows robots to select their tasks autonomously using a decision matrix, which defines the probability of robot behavior. This probabilistic design was tested in an ARGoS simulator and was found to be efficient when the cost of switching between tasks is neglected. Wei et al. [2] further studied temporal constraints with swarm foraging. In their study, the collected resources were categorized into different types. Although the robots have no prior knowledge about the locations of the targets in the environment, they need to deliver the targets in chronological order depending on their types. A prediction auction-based approach was also proposed, where the robot allocates itself while it executes its tasks. The robot predicts the tasks that need to be performed in the future when it is idle. To test the performance of the proposed approach, the blocks world for teams (BW4T) simulator was used to determine that there was a trade-off between the energy consumption and execution time.
An ACO-based algorithm was proposed by Vanhee [17] to distribute robots into subtasks and finish before a certain deadline. The proposed algorithm was tested on an ARGoS simulator to determine that ACO can be used for foraging tasks with deadline constraints. However, continuous parameter tuning is required to successfully complete the task. Khaluf and Rammig [31] proposed a self-organized DTA strategy to assign time-constrained tasks to swarm robots. A probability allocation matrix was proposed with respect to different specifications, including the robot performance, soft task deadline, and task size. The main goal was to allow each robot to assign itself to a task depending on its deadline. From the ARGoS simulator results, it was found that even though the proposed strategy could adapt to online arrival tasks, it required neglecting the costs of dropping one task to switch to another. Another ACO algorithm was proposed by Khaluf et al. [9] to efficiently assign time-constrained tasks to swarm robots. The proposed algorithm allows the robots to communicate with other robots and periodically evaluate the quality of the allocation using pheromone trails to reduce the execution time. It was tested in static and dynamic environments to show that it performed well with a small number of robots, and parameter tuning was required for a dynamic environment where the optimal values were unknown.
Chung et al. [19] surveyed the literature on aerial swarms and argued that there is a need to improve the methods currently used in swarm robotics. There is an important tradeoff between robustness, efficiency, and performance in the current literature. Moreover, this study revealed that swarm robotics is still an open problem and that methods need to be improved in different areas, including decision-making in an unknown environment. Ultimately, it can be concluded that studies on swarm foraging with time constraints are still limited. In particular, the problem of dynamic unknown deadlines has not been considered in the current literature, and time constraints are usually configured before starting the foraging task. However, in reality, deadlines are continuously changing and cannot be predefined. Therefore, further improvements are needed, and dynamic deadlines should be modeled.

System Design
Inspired by MPFA modeling [25], ED_Foraging can be defined as a search-and-rescue mission for individuals (resources) with dynamic deadlines. To clarify how the system is designed to allocate the tasks throughout the foraging journey, this section discusses the model proposed to represent dynamic deadlines and then presents the problem definition to show how the model is designed for the foraging task.

Dynamic Deadline Model
In the past few months, due to the global pandemic of the novel coronavirus disease 2019 [16], epidemiological modeling has been evolving quickly to enable researchers to study virus transmission and model it as accurately as possible. The way the virus behaves and dynamically affects different people is very interesting. Thus, different aspects of this epidemic model can be used to generalize the adaptation and change over time to construct a mathematical model for swarm foraging to predict unknown dynamic deadlines.
To understand how the proposed mathematical model is adapted from epidemiological modeling, it is important to first identify similarities between the two concepts. In swarm foraging, robots move resources into collection zones before they expire. When the deadlines are dynamic, the robot must know its transmission rate to predict the likelihood of deadline expiration in a certain region. Epidemiological modeling can be utilized to predict such transmission, where predicting possible exposure in a population to help victims before they die is an analogy for predicting deadline expiration in an area to move resources into collection zones before they expire. Thus, infected resources are an analogy for constrained resources.
Appl. Sci. 2021, 11, 4627
As shown in Table 1, ED_Foraging performs its task in a 2D arena ($A$). The size of this arena represents the epidemic size in epidemiological modeling. Multiple robots ($R$) are used to collaboratively accomplish the task, which involves moving resources from one place to another. The resources should be moved to a depot collection zone ($N_d$) temporarily until they are moved to the central collection zone ($N_c$). The resources ($N_g$) in ED_Foraging represent the individuals in an epidemic population. Moreover, the deadlines are represented by $T_\varsigma$, where $\varsigma$ represents the resource state. Hence, the infection state is mapped to a certain time constraint that represents the deadlines and can be updated online using the proposed epidemic model. Note that each infection state is taken as a deadline, and the death state symbolizes deadline expiration. The transmission of an infection throughout the population can be used to update the deadlines dynamically and predict the likelihood of epidemic recurrence to assign the robots to the most critical regions.
Table 1. ED_Foraging analogy (column headers: Swarm Foraging, Symbol).
Inspired by the SEIRD model [12], a dynamic deadline model called the DD_Model is proposed in this paper to model the dynamics of the time constraints. The DD_Model is used to predict the state of each resource and update its deadline online during the task. In particular, there are six states that a resource can be in (transmission, susceptible, exposed, infected, recovered, and dead). The following parameters are used in the DD_Model:
• α corresponds to the incubation rate;
• β corresponds to the exposure rate;
• γ corresponds to the recovery rate;
• η corresponds to the mortality rate; and
• ε corresponds to the duration of immunity for a resource after moving to a recovery state.
Each resource has its own deadline, which is computed with respect to the current time and state. A resource state is governed by one of the deadlines below, where $T$ is the current time, $T_0$ is the initial time before any state starts, and $R_{exp}$ is a uniform random exponential function with a value that ranges from zero to the rate parameter.
Incubation period $T_I$: the average time for a resource to enter the infected state after the exposed state.
Transmission time $T_t$: the average time for a resource to contact other resources, during which it can infect its neighbors, before the recovered state.
Recovery time $T_r$: the time needed to reach the recovered state after entering the infected state.
Susceptibility time $T_s$: the time needed for a resource to enter the susceptible state after the recovered state without entering the exposed state.
Expiration time $T_d$: the time before a resource expires after being in the infected state.
Table 2 shows how a resource stays in each deadline state as long as the time of that state has not expired. Note that the deadlines are continuously updated depending on the current time and previous state. A uniform random exponent is used in the equations to generate different deadlines for the resources. This is of particular importance for increasing the dynamics of deadlines where the time taken to transition from one state to another differs between resources.
Table 2. Deadlines in the DD_Model (column headers: State, Deadline).
Finally, when a resource is in the infected state, it is possible to predict how many other susceptible resources may enter the infected state in the future. This is carried out through the reproduction parameter $R_0$, which is computed based on the current population state, as shown in (6), where $E(t)$, $I(t)$, $R(t)$, and $D(t)$ are the numbers of exposed, infected, recovered, and dead resources, respectively.
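To make the idea of per-resource, randomly drawn deadlines concrete, the following is a minimal sketch rather than the DD_Model itself. It assumes that a deadline is the current time plus an exponentially distributed delay (using Python's random.expovariate as a stand-in for the random exponential term), and that an infected resource either expires or recovers when its deadline passes. The Resource class, the MEAN_DURATION values, and the mortality argument are all hypothetical names and numbers introduced for illustration.

```python
import random

# Hypothetical mean durations (in time steps) for each timed state; not from the paper.
MEAN_DURATION = {"exposed": 5.0, "infected": 10.0, "recovered": 20.0}


def draw_deadline(now, mean_duration, rng):
    """Assumed form: current time plus an exponentially distributed delay."""
    return now + rng.expovariate(1.0 / mean_duration)


class Resource:
    """A resource whose state changes whenever its current deadline passes."""

    def __init__(self, rng):
        self.rng = rng
        self.state = "susceptible"
        self.deadline = None  # susceptible and dead resources have no running deadline

    def _enter(self, state, now):
        self.state = state
        mean = MEAN_DURATION.get(state)
        self.deadline = draw_deadline(now, mean, self.rng) if mean else None

    def expose(self, now):
        """Move a susceptible resource into the exposed state."""
        if self.state == "susceptible":
            self._enter("exposed", now)

    def update(self, now, mortality=0.2):
        """Advance the state if the current deadline has been reached."""
        if self.deadline is None or now < self.deadline:
            return self.state
        if self.state == "exposed":
            self._enter("infected", now)
        elif self.state == "infected":
            # Assumption: on expiry the resource dies with probability `mortality`,
            # otherwise it recovers.
            died = self.rng.random() < mortality
            self._enter("dead" if died else "recovered", now)
        elif self.state == "recovered":
            self._enter("susceptible", now)
        return self.state


if __name__ == "__main__":
    resource = Resource(random.Random(0))
    resource.expose(now=0.0)
    for t in range(0, 60, 5):
        print(t, resource.update(float(t)))
```

Because each delay is drawn independently, two resources that enter the same state at the same time will generally leave it at different times, which is the property the random term is meant to provide.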

Foraging Task Model
For a swarm of $R$ robots searching for $N_g$ resources that have deadlines $T_\varsigma$, the task can be formulated as follows:
Definition 1. The robots are divided into two types: searching robots ($R_s$) and foraging robots ($R_f$). The searching robots are responsible for searching for resources and moving them to the depots. The foraging robots are responsible for delivering resources from the depots to the central collection zone.
Definition 2. The velocity of $R_f$ is constant across all experiments, and $R_s$ can always carry one resource at a time.
Definition 3. The collection zones are divided into two types: a central collection zone $N_c$ and distributed zones $N_d$ (depots). These zones are distributed uniformly and are known to the robots. The collected resources are moved to the closest zone; then, they are moved to the central collection zone. For each search region, there is a specified area $A_r$ with a collection zone in the center. The number of search regions is $N_r$, so the total arena area is $A = N_r A_r$.
Definition 4. A resource is associated with a list of events ($L_t$) representing its states, where $L_t \in \{S, E, I, R, D\}$. Each event is constrained by an online deadline ($T_\varsigma$) calculated using the DD_Model to predict future states and their deadlines.
Definition 5. $T_\varsigma$ is the time representing the deadline of each resource, where $\varsigma \in \{I, t, r, s, d\}$. Each deadline is updated continuously using the DD_Model, as explained in Equations (1)-(5).
Definition 6. The site density ($D_t$) is the density of an area $A_r$ based on the number of resources ($N_g$) in $A_r$ weighted by their current states, where $N_g$ is distributed uniformly in $A_r$. $D_t$ can be measured using (7), where $\propto_{\mathrm{state}(i)}$ is the state weight of a resource $i$ in proportion to the states in the DD_Model.
Definition 7. The spot intensity ($H_t$), which is also called the hotspot level, is the predicted maximum energy of a spot in an area $A_r$ based on the average time of resource expiration in that spot predicted by the DD_Model. $H_t$ is computed using (8), where $f(n)$ is the expected expiration time $T_d$ of the expected expired events in $A_r$ and $C(d)_{N_g}$ is the number of resources $N_g$ that are predicted to die in $A_r$.
Definition 8. The collected resources are replenished after collection or expiration to maintain a consistent density. When a resource is collected, another resource is placed in a location drawn from a uniform random distribution. Moreover, when a resource expires, another susceptible resource is placed in another location with a new constraint based on the DD_Model.
Definition 9. The foraging rate ($F$) is the number of resources collected in the central collection zone $N_c$ per unit time. The foraging rate in a region is $F_d$, which represents the number of resources transported to the regional collection zone $N_d$.
Definition 10. During the search, the robots use a search probability $\rho \in (0, 1)$ to decide whether they should travel based on the selected locations ($H_t$ or $D_t$) or switch to a uniform random search at each time step. This probability controls the distance and time that each robot travels in a straight line and allows flexibility in searching the area based on a uniform random distribution.
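The site density and spot intensity in Definitions 6 and 7 can be sketched as follows. Because Equations (7) and (8) are not reproduced in this section, the forms used here are assumptions made for illustration: the density is taken as the state-weighted resource count per unit area, and the intensity as the mean predicted expiration time over the resources predicted to expire. The STATE_WEIGHT values and both function names are hypothetical.

```python
# Hypothetical state weights; the weights used in the paper are not given in this section.
STATE_WEIGHT = {
    "susceptible": 0.0,
    "exposed": 0.5,
    "infected": 1.0,
    "recovered": 0.0,
    "dead": 0.0,
}


def site_density(states, region_area):
    """Assumed form of D_t: state-weighted resource count per unit area of A_r."""
    return sum(STATE_WEIGHT[state] for state in states) / region_area


def spot_intensity(predicted_expirations):
    """Assumed form of H_t: mean predicted expiration time T_d over the
    resources predicted to expire in the region.

    Returns None when no resource is predicted to expire.
    """
    if not predicted_expirations:
        return None
    return sum(predicted_expirations) / len(predicted_expirations)


if __name__ == "__main__":
    print(site_density(["infected", "exposed", "susceptible"], region_area=3.0))
    print(spot_intensity([10.0, 20.0]))
```

Under these assumptions, a region containing only susceptible or recovered resources has zero density and no hotspot, so a robot would have no reason to prefer it over a random search.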
At the beginning of the task, the resources are randomly distributed around the area, and the DD_Model is used to randomly change the resource states to infected and set their initial deadlines. The robots in this task have limited communication ranges and sensing capabilities, yet they need to work together to collect as many resources as possible before their deadlines. The resource locations are unknown to the robots; hence, they blindly search while collecting information about their surroundings and report the information when delivering resources to the closest zone they find.
All types of robots can move globally around the area, but searching robots $R_s$ return to the closest depot when they collect a resource. When a resource is collected by an $R_s$, the $D_t$ and $H_t$ of its surroundings are computed and reported to the closest collection zone. This information can be used by other robots to give priority to intense spots. Robots that detect a high density of resources $D_t$ in a certain region have a high probability of returning to that region. This probability is defined by a Poisson cumulative distribution function (CDF) $\mathrm{POIS}(D_t; p)$, as shown in (9), where $p$ is the reliability parameter set by the user. The robot returns to its previous location when the parameterized Poisson CDF exceeds a uniform random value, i.e., $\mathrm{POIS}(D_t; p) > \rho \in (0, 1)$; otherwise, the robot returns to one of the hotspots in its region, if any exist. If there are no hotspots, then the robot switches to a uniform random search.
The main objective of this task is to reduce the number of expired deadlines by observing the environment and updating $D_t$ and $H_t$ accordingly to collect resources as quickly as possible to give priority to hotspots.
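The return decision described above can be sketched in a few lines. The sketch assumes that POIS(D_t; p) is the standard Poisson CDF evaluated at floor(D_t) with rate p, since Equation (9) is not reproduced here; the function names and the returned action labels are hypothetical.

```python
import math
import random


def poisson_cdf(k, lam):
    """P(X <= floor(k)) for X ~ Poisson(lam); returns 0.0 for negative k."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # probability of X = 0
    total = term
    for i in range(1, int(math.floor(k)) + 1):
        term *= lam / i  # probability of X = i from the probability of X = i - 1
        total += term
    return min(total, 1.0)


def choose_target(site_density, reliability, hotspots, rng):
    """Pick the next search behavior for a robot that has just delivered a resource.

    site_density -- D_t observed at the last pickup location
    reliability  -- the user-set parameter p (assumed to act as the Poisson rate)
    hotspots     -- hotspot locations known in the robot's region (may be empty)
    """
    rho = rng.random()  # uniform random value in [0, 1)
    if poisson_cdf(site_density, reliability) > rho:
        return "return_to_site"
    if hotspots:
        return "go_to_hotspot"
    return "random_search"


if __name__ == "__main__":
    rng = random.Random(1)
    for density in (0.0, 2.0, 8.0):
        print(density, choose_target(density, 3.0, [(4, 5)], rng))
```

With this form, a larger observed density pushes the CDF toward 1, so the robot returns to the same site more often, while a larger reliability parameter makes the robot more demanding before it commits to a site.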

Methods and Algorithms
After formulating the problem at hand, we now show how the ED_Foraging approach works and explain the details of the proposed DTA method, called Hotspot DTA. As illustrated in Figure 1, the approach starts by initializing the state of all resources to susceptible and randomly distributing them around the arena. Then, it randomly changes some of the resources' states from susceptible to exposed via the DD_Model. Finally, robots are dispatched from the collection zones to follow randomly selected paths, and the constraints are updated online at each time step.
To allow a swarm of robots to work together and collect resources without prior knowledge about their distributions or deadlines, the hotspot DTA method is proposed using the Robot_Search_and_Rescue function, which is continuously executed, as shown in Figure 2a. In this function, two important actions are configured. The first action is that when the robot finds a resource, it picks it up and delivers it to the closest collection zone. While the robot is picking up the resource, it scans its surroundings to compute its spot intensity $H_t$ and local density $D_t$. As described in (7) and (8), $D_t$ is measured to allow the robot to remember the location of a previously found target and return to the same location if it is dense, while $H_t$ is computed based on the average expiration time of the expected expired deadlines surrounding the robot to prioritize more critical locations.
Assigning a robot to a hotspot using its weight, initialized as Ht.
The second action is that when the robot reaches a collection zone, it delivers the resource that it is carrying. In this case, the robot reports the collected information ($H_t$) to the collection zone, which is shared with the other robots. Then, the robot checks the density $D_t$ using the Poisson CDF, as discussed in the previous section. If $\mathrm{POIS}(D_t; p) > \rho \in (0, 1)$, then the last visited site is chosen, and the robot returns to a pre-visited region with a high density of resources. However, if the Poisson CDF condition is not satisfied, the robot checks whether there are any hotspots around it. Using the HotspotExist function, the robot scans the hotspot list stored in the current collection zone to choose intense regions based on $H_t$ (Figure 2). The chosen hotspot is then considered as the navigation target so that the robot goes to and searches for resources in that area. Note that the spot intensity (weight) is initialized by $H_t$ via (8) and updated continuously by all the robots using the Update_Hotspot_Intensity function; this function will be discussed later in this section.
Going back to Figure 2a, if there is no intense hotspot or dense region found, the robot randomly searches any region. During the random walk, if the searching time exceeds the search probability, the robot gives up searching that region and randomly searches in another region. While the algorithm is executing the Robot_Search_and_Rescue function, it also updates the DD_Model and the deadlines accordingly. From Figure 3a, the Update_DD_Model function is used to continuously check the constraints and update the resource state. In the beginning, the Update_DD_Model function checks whether there is a deadline at the current time. If no active deadline is found, the model randomly adds deadlines by changing the state of the susceptible resources using the infect function to introduce new deadlines to the task. On the other hand, if there are resources with deadlines, the earliest deadline state is checked. If the deadline is greater than the current time, the resource state is updated to the next state, and hence, the deadline becomes more critical. The state of the earliest resource is updated depending on its current state and the deadline via the update_constraint function. After updating the deadline, the new resource state is validated. If the new state is dead, it means that $T_d$ has been reached and the resource deadline has expired. Thus, it should be removed from the population and reassigned to a new location as a new susceptible resource.
In the infect function (Figure 3b), the expected deadlines of the selected resources are computed with respect to the current time using (1)-(5). Then, the current state of the infected resource is set as exposed. To avoid computation overload, the expected deadline and future state of an infected resource are stored in a queue list called InfectionList. This list represents the event list ($L_t$) and contains all expected events in the population, arranged in order with respect to their deadlines as shown in Table 2. Thus, the states following exposure are added for the current resources, including infection, recovery, transmission, susceptibility, and death. In each state, the name of the state is added along with its deadline and the expected resource index.
In the update_constraint function (Figure 3c), when the state is infection, recovery, susceptibility, or death, the subsequent state of the chosen resource is updated based on $L_t$ as computed by the infect function (Figure 3b). If the new state is transmission, this means that the resource might change its neighboring resource states to infection until its own state is changed to recovery. Thus, based on the reproduction parameter $R_0$ computed in (6), random neighbors of the current resource are selected, and their state deadlines and event lists are updated via the infect function (Figure 3b).
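The deadline-ordered event list can be illustrated with a small priority queue. The sketch below is a simplification under stated assumptions: the deadlines that the paper derives from (1)-(5) are replaced by caller-supplied delays, the transmission step driven by $R_0$ is omitted, and the names `infect` and `apply_earliest` and the tuple layout are hypothetical choices for illustration only.

```python
import heapq


def infect(event_list: list, resource: int, now: float, schedule: list) -> str:
    """Queue every expected future state of `resource` and return its new state.

    schedule -- list of (state_name, delay) pairs; the delays are placeholders
                for the deadlines computed with (1)-(5) in the paper
    """
    for state, delay in schedule:
        # Events are ordered by deadline, so the earliest one sits at index 0.
        heapq.heappush(event_list, (now + delay, resource, state))
    return "exposed"


def apply_earliest(event_list: list, states: dict):
    """Pop the event with the earliest deadline and apply it to `states`.

    Returns the (deadline, resource, state) tuple, or None if the list is empty.
    """
    if not event_list:
        return None
    deadline, resource, state = heapq.heappop(event_list)
    states[resource] = state
    return deadline, resource, state


if __name__ == "__main__":
    events, states = [], {}
    states[7] = infect(events, 7, now=10.0, schedule=[("infection", 2.0), ("death", 5.0)])
    print(apply_earliest(events, states), states)
```

Storing the future states once, at infection time, is what lets each later update touch only the head of the list instead of rescanning every resource.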
Throughout the ED_Foraging task, the robots continuously search for resources, giving priority to dense and intense spots. During the search, the deadlines are dynamic, and thus, the Update_Hotspot_Intensity function (Figure 4) is used to continuously update the hotspots based on the current infection state of each region. This function updates the intensity of the recorded spots added in the Robot_Search_and_Rescue function (Figure 2a). The intensity of the spots predicted to have expired resources is updated by subtracting the current time from the predicted expiration time (computed using (8)). However, when a spot is predicted not to expire, which applies to spots that have noncritical resources, the intensity is reduced exponentially with respect to the current time, where $y$ is the decay probability set by the user. After updating the intensities of all hotspots, if the weight of a spot is almost equal to zero, then it is no longer considered a hotspot and is predicted to have no critical resources at the current time. Thus, it should be removed from the hotspot list.
Throughout the task, the robots continuously search for resources, giving priority to hotspots with infected resources that could die if not rescued. During the search, the deadlines are dynamic and are changed online, and the DD_Model is used by the robots to predict important regions.
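A rough sketch of the hotspot update may help make the two cases concrete. Several details are assumptions, because Figure 4 and the exact decay formula are not reproduced here: the decay is modeled as multiplication by exp(-y * dt) over the time dt elapsed since the last update, "almost equal to zero" is modeled by a small threshold `eps`, and the dictionary layout and function name are hypothetical.

```python
import math


def update_hotspot_intensity(hotspots: list, now: float, dt: float, y: float,
                             eps: float = 1e-3) -> list:
    """Return the hotspots with refreshed weights, dropping faded spots.

    Each hotspot is a dict with:
      'weight'     -- current intensity (initialized from H_t)
      'expires_at' -- predicted expiration time, or None if no expiry is predicted
    """
    kept = []
    for spot in hotspots:
        if spot["expires_at"] is not None:
            # Spot with resources predicted to expire: time left until expiry.
            weight = spot["expires_at"] - now
        else:
            # Noncritical spot: assumed exponential decay with rate y.
            weight = spot["weight"] * math.exp(-y * dt)
        if weight > eps:  # assumption: near-zero (or elapsed) spots are removed
            kept.append({**spot, "weight": weight})
    return kept


if __name__ == "__main__":
    spots = [{"weight": 5.0, "expires_at": 12.0}, {"weight": 1.0, "expires_at": None}]
    print(update_hotspot_intensity(spots, now=10.0, dt=1.0, y=0.5))
```

The function returns a new list rather than mutating its input, so a collection zone could swap in the refreshed list in one step.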

Experiment
To measure the performance of the proposed ED_Foraging approach, it was implemented in an ARGoS simulator [14,32] with foot-bot robots. As illustrated in Figure 5

Experimental Components
The experiments were run on an i7 machine with 16 GB RAM. Each experiment was repeated 40 times to record the average and specific performances. The maximum simulation time of each trial was 30 min. The rate parameters in the DD_Model were initialized based on the Hladish et al. [13] study, where α = 0.5, β = 0.33, γ = 0.17, and ε = 30. It was assumed in the experiments that any resource could have a deadline and expire if not collected before its expiration time. Thus, the mortality rate was initialized as η = 1 − γ.
To study how ED_Foraging can manage online deadlines in an unknown environment, the proposed foraging approach was tested with different DTA methods, including hotspot DTA. The first three methods are self-organized DTA, and the fourth method is centralized DTA.

• Hotspot: the proposed DTA method discussed in Section 4.
• ACO: adopted from the MPFA, which is an ACO-based DTA with multiple collection zones, where the ACO parameters are initialized as in [25].
• Blind: implemented to allocate robots randomly and search blindly in multiple collection zones. No information is reported on the zones or exchanged between the robots.
• Centralized: a traditional DTA for swarm foraging, where there is only one collection zone (central) and the search is blind and random.
Throughout the experiments, different measures were collected. The first measure was the foraging rate, which represents the number of resources delivered to the central collection zone (a higher foraging rate corresponds to a better performance). The second measure was the expiry rate, which represents the percentage of resources with expired deadlines. A lower expiry rate corresponds to a better performance. The expiry rate was computed by dividing the number of resources that died (expired deadlines) by the total number of resources scheduled (expected) to die.
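As a small worked illustration of the expiry rate defined above, the following sketch computes it as a percentage. The function name and the handling of the case with no scheduled deadlines are assumptions made for this example.

```python
def expiry_rate(num_expired: int, num_scheduled: int) -> float:
    """Percentage of resources scheduled (expected) to die that actually expired."""
    if num_scheduled == 0:
        return 0.0  # assumption: with no scheduled deadlines, nothing can expire
    return 100.0 * num_expired / num_scheduled


if __name__ == "__main__":
    # Hypothetical trial: 8 of the 10 resources scheduled to die were not rescued.
    print(expiry_rate(8, 10))
```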

Experimental Design
To study the performance of ED_Foraging, three experiments were performed to assess its flexibility, scalability, and robustness. As shown in Table 3, the first experiment measured the flexibility of the approach with different setups, where the numbers of robots, resources, and collection zones changed with respect to the map size. Each experiment was tested with different configurations, from small to very large maps and numbers of robots, depots, and resources. The second experiment measured the scalability of the approach by changing the number of robots and fixing the rest of the configuration. The map size was fixed at 16 × 16 because DTA methods using multiple collection zones (i.e., ACO, hotspot, and blind) ended the experiments before moving the resources to the central zone. This means that, in this map, the robots were scalable enough to finish collecting all the resources before moving them to the central zone. Thus, it was important to study this behavior and further investigate the scalability of each method. Note that since resources were not moved to the central zone in the 16 × 16 map, the foraging rate represents the total number of resources collected in all zones, i.e., $\sum_{\forall d} F_d$.
Finally, the third experiment measured the robustness of the approach by changing the map size, and accordingly the collection zones, and fixing the number of robots. Note that the centralized method uses only one collection zone, regardless of the map size.

Experimental Results
As mentioned previously, two measures were quantified in each experiment and three experiments with different setups were conducted. In this section, the average results of the foraging and expiry rates for each experiment are discussed. Then, the expiry rate is further investigated by presenting the detailed results of the 40 trials. Finally, based on the Friedman statistical significance test, the distribution of the expiry rate is visualized in box plots to discuss how each method behaves in each setup.
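For readers unfamiliar with the Friedman test, the sketch below computes its chi-square statistic for a table whose rows are trials and whose columns are DTA methods. It is a plain-Python illustration, not the authors' analysis code: it omits the tie correction, and the function names are hypothetical. The statistic is then compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of methods.

```python
def average_ranks(values: list) -> list:
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman_statistic(trials: list) -> float:
    """Friedman chi-square statistic (no tie correction).

    trials -- list of rows, one per trial; each row holds one measurement
              (e.g., the expiry rate) per method, in a fixed method order
    """
    n, k = len(trials), len(trials[0])
    rank_sums = [0.0] * k
    for row in trials:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Hypothetical expiry rates for three methods over three trials.
    print(friedman_statistic([[20.0, 35.0, 60.0], [22.0, 30.0, 55.0], [18.0, 40.0, 70.0]]))
```

Ranking within each trial is what makes the test insensitive to trial-to-trial shifts that affect all methods equally.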

Flexibility
To analyze the flexibility of ED_Foraging with different setups, the average performance over 40 trials was collected for both the foraging and expiry rates (Figure 6). From Figure 6a, it is clear that the DTA methods that use multiple collection zones, i.e., hotspot, ACO, and blind, have almost the same foraging rates, while centralized DTA produces the lowest foraging rate in most maps. These results are in agreement with those of Khaluf [18], where self-organized DTA is better when the environment changes rapidly. Moreover, in regard to the expiry rate (Figure 6b), centralized DTA also produces the worst average expiry rate in all cases, indicating that it is better to have multiple collection zones to treat patients before they die. Nevertheless, ACO, hotspot allocation, and blind allocation behave differently, and the expiry rate results need to be further investigated to compare their performance in each map. However, it can be concluded that hotspot DTA produces the best average expiry rate in most maps when applied to ED_Foraging, followed by ACO and blind DTA.
To further investigate the expiry rate in each map and clarify the differences between the methods, the detailed results of each trial are shown in Figure 7a-e.
In all the maps, it is clear that the centralized method produces significantly worse results than the other methods. In the 4 × 4 and 8 × 8 maps (Figure 7a,b), ACO, hotspot, and blind DTA perform very similarly in all the trials and produce almost the same expiry rate, although the hotspot method produces slightly lower values. However, when the map size is increased, the differences become more apparent.
In the 16 × 16 map (Figure 7c), the hotspot DTA has the lowest expiry rate in most trials, followed by the ACO and blind methods. In the 20 × 20 map (Figure 7d), the results for the hotspot and ACO methods coincide for the first half of the trials, indicating that they perform similarly, while the blind method is worse than both. However, for the next half of the trials, the hotspot and blind methods are similar, while the ACO method is worse. Finally, in the 40 × 40 map (Figure 7e), the hotspot and ACO methods perform similarly for the first half of the trials, while the blind method is worse than both. Then, all three methods have similar expiry rates for the next half of the trials.
To further investigate the distribution of the results in each map and see how hotspot DTA performs compared to the other methods, the distributions of the results are shown in Figure 7f as box plots. In all the maps, it is clear that centralized allocation performs the worst, as previously found. The result of this method is highly variable, indicating its inconsistency between trials. Moreover, the minimum-maximum-mean lines in the plot show that centralized allocation produces a significantly higher expiry rate than the other methods. In regard to the ACO, hotspot, and blind methods, the distributions (Figure 7f) are different between the maps depending on their sizes. In the 4 × 4 map, all three methods have similar minimum-maximum-mean values, but the results for the hotspot method are more compressed than those for the ACO and blind methods. This indicates that the hotspot method is more consistent with small maps when comparing the expiry rate.
In the 8 × 8 map, the ACO, hotspot, and blind methods have similar mean values, but the hotspot method has the lowest minimum value; it also has the lowest maximum value, along with the blind method. The ACO method produces slightly worse minimum-maximum results while having a dispersion similar to that of the hotspot method. On the other hand, the expiry rate for the blind method is more compressed but has more outliers that could affect the results in the long run.
In the 16 × 16 map, it is clear that the results of the hotspot method are significantly better than those of the other methods. It has lower minimum-maximum-mean expiry rates and is also highly compressed compared to the other methods. This indicates that the hotspot method performs better than the other methods and can better handle the increase in the map setup. In regard to the 20 × 20 map, the hotspot method is similar to the ACO and blind methods with a slightly lower expiry rate. Unlike the centralized method, the ACO, hotspot, and blind methods are positively skewed, which means that most of the expiry rate results are on the lower end of the scale (shown as the mean line on the bottom of the box). Moreover, the results of blind DTA are more compressed than those of the other methods in this setup.
Finally, in regard to the 40 × 40 map, although the ACO, hotspot, and blind methods have similar dispersions, it is very clear that the results of the hotspot method are positively skewed, while those of the ACO and blind methods are negatively skewed. This indicates that most of the expiry rate results of the hotspot method are on the lower end of the scale, and the results of the ACO and blind methods are on the upper end of the scale.
From the results of all the trials discussed, it is clear that hotspot DTA gives results that are either better than or similar to those of other benchmarks tested with ED_Foraging. Moreover, it has better performance than the centralized method and is more consistent than the ACO and blind methods, regardless of the map setup. The methods using multiple collection zones behave similarly in small map setups, but the differences become more apparent as the map size is increased. Therefore, the next sections will further investigate the scalability of the methods and their robustness to change.

Scalability
This section analyzes to what extent the algorithm is scalable with swarms that range from tens to thousands. This is of particular importance because increasing the number of robots produces more inter-robot collisions, which affects the scalability of the algorithms. Thus, adding more robots does not necessarily improve performance, and the algorithm's behavior in such setups needs to be further investigated. Starting from the total foraging rate of all zones, i.e., $\sum_{\forall d} F_d$ (see Figure 8a), it can be seen that having a very small or very large number of robots in a medium-sized map is not a good choice and negatively affects the foraging rate. From the figure, the best result is obtained when the number of robots is intermediate (136) with a 16 × 16 map. This confirms the conclusion in the literature, where it was emphasized that increasing the number of robots can negatively affect the foraging rate and that it is better to set the number of robots in proportion to the map size.
One interesting observation from Figure 8a is that the foraging rate is affected more by the zone setup than by the allocation method used. In particular, all multiple-collection-zone methods (ACO, hotspot, and blind) have similar foraging rates that are better than that of the centralized method, which has only one collection zone. Only in the case when the number of robots is very small (24 robots) does the centralized method produce slightly better results than the other methods. This is because having a very small number of robots reduces collisions and hence improves the foraging rate. Therefore, when increasing the number of robots around one central collection zone (i.e., 112, 136, 144, and 592 robots) in Figure 8a, the foraging rate is significantly affected in the centralized method, to the point that the results are less than half of what the other methods are able to collect.
In regard to the average expiry rate (Figure 8b), each algorithm behaves differently from one test to another. Thus, the detailed results of each trial will be investigated next. However, in general, it can be noted that the centralized method is highly inconsistent, as it significantly reduces the average expiry rate when the number of robots is small while significantly increasing it when the number of robots is large. This indicates that having one collection zone is not a good choice, especially when we are targeting sensitive search and rescue missions.
Going deeper into the detailed results of the expiry rate (Figure 9a-e), it can be clearly seen that the centralized method is inconsistent. Its expiry rate starts from a very low level (a zero expiry rate) when the number of robots is 24 (Figure 9a) and rises as the number of robots increases (Figure 9b-d) until it becomes significantly higher than those of the other methods, with an 80% expiry rate when the number of robots is 592 (Figure 9e).
In regard to the ACO method, inconsistent behavior can also be observed, similar to that of the centralized method but not as obvious. When the number of robots is very small to medium (Figure 9a-c), the ACO method produces a similar expiry rate to those of the hotspot and blind methods in more than two-thirds of the trials while producing lower expiry rates in one-third of the trials. When the number of robots is large to very large (Figure 9d,e), however, the ACO method behaves in exactly the opposite fashion, producing expiry rates similar to those of the hotspot and blind methods in more than two-thirds of the trials while producing higher expiry rates in one-third of the trials. This means that its results are not consistent, and the expiry rate is greatly affected by the number of robots.
Meanwhile, the expiry rates of the hotspot and blind methods are notably very similar in most cases, and the effect of the number of robots is minimal in comparison to the effect on the other methods. It is observable in all setups (Figure 9d,e) that, regardless of the number of robots used, the expiry rate generally ranges between 20% and 70%. Nor does their performance drop significantly in some setups while improving significantly in others. Hence, blindly allocating robots or allocating them based on infection hotspots reduces the effect of inter-robot collisions and utilizes the available resources to achieve a more stable performance than that of the other methods.
From the result distribution (Figure 9f), it is very clear that the centralized method's expiry rate result is highly inconsistent. In the 24-robot experiment, it has a significantly lower minimum-maximum-mean expiry rate value, while in the 592-robot experiment, it notably has the highest minimum-maximum-mean expiry rate value. Similarly, in regard to the ACO method, its results are more dispersed than those of the hotspot and blind methods when the number of robots is 24 and 144; its results are similar to theirs when the number of robots is 112 and 592, and its results are more compressed when the number of robots is 136.
Furthermore, the ACO method results are sometimes positively skewed, while at other times, they are negatively skewed. This indicates that the ACO method is highly inconsistent and is not as scalable as the hotspot and blind methods when the ED_Foraging approach is adopted; the ACO method results in the best outcomes when the number of robots is intermediate (136) and the map size is medium (16 × 16).
Regarding the hotspot and blind methods, although the detailed results indicate that they have similar outcomes, when their distributions are compared (Figure 9f), the hotspot method results are more compressed than those of the blind method, particularly in the 24-, 136-, and 592-robot experiments. The hotspot method is slightly less compressed than the blind method in the 112-robot experiment, but it is positively skewed, indicating that most of its values are on the lower end of the scale.

Robustness
This section analyzes to what extent the robots are robust to different map sizes when the other components of the environment are held fixed. This is of particular importance because increasing the map size means that the robot travel time will increase and that meeting the deadlines might therefore be more difficult.
Starting with the average foraging rate (Figure 10a), as in the flexibility experiment, the ACO, hotspot, and blind methods have similar foraging rates in all map sizes. The average foraging rate of the centralized method, however, is worse than those of the other methods when the map size is small, medium, or large, although it is better when the map size is very small or very large. This is due to two factors. First, having one collection zone in a very small or very large map when the number of robots is medium can decrease the travel time and hence increase the foraging rate. Second, the foraging rates of the ACO, hotspot, and blind methods in this figure show the average number of resources found in the central collection zone only (F) and not all collection zones (∑F_d).
Thus, the small difference between the foraging rates of the centralized method and the other methods can be attributed to the fact that the other methods allocate robots to move resources to the closest collection zone before they die.
In regard to the average expiry rate (Figure 10b), the centralized method has the worst results in all map sizes except the 16 × 16 map, where it produces the lowest average expiry rate. This behavior will be further investigated when discussing the detailed expiry rates shown in Figure 11. Going back to Figure 10b, in a very small map (4 × 4) and a very large map (40 × 40) with a medium number of robots, hotspot and blind DTA are better than the ACO method. In a small map (8 × 8), however, all three methods perform similarly. In a medium map (16 × 16), the ACO and hotspot methods have similar robustness, while the blind method performs worse than the rest. In a large map (20 × 20), the ACO and blind methods perform similarly, while the hotspot method is slightly worse.
Therefore, it can be concluded that the hotspot method is either more robust than the other methods or is similar to the best method, except in the 20 × 20 map.
To further investigate the differences among the methods, especially in the case of the 16 × 16 and 20 × 20 maps, the detailed expiry rate of each trial is presented in Figure 11a–e. In the 16 × 16 map (Figure 11c), the centralized method produces very inconsistent results from one trial to another.
As seen from the yellow line in Figure 11c, the expiry rate is significantly lower in half of the trials and significantly higher in the other half. This indicates that even though the average expiry rate of the centralized method is low, this result is not guaranteed, and it is not robust in many cases.
The ACO, hotspot, and blind methods have similar expiry rates in the 8 × 8 map (Figure 11b), 16 × 16 map (Figure 11c), and 40 × 40 map (Figure 11e). However, in the 4 × 4 map (Figure 11a), the ACO method produces a worse expiry rate than the hotspot and blind methods in most trials. This indicates that ACO is not robust to very small maps with a medium number of robots. Moreover, in the 20 × 20 map (Figure 11d), the hotspot method produces worse expiry rates than the ACO and blind methods in half of the trials. However, it is interesting that the hotspot method results are more consistent than the results of the other methods. Its results are 40 to 55% in all trials, while the ACO and blind results are 20 to 60%. The ACO and blind methods produce low expiry rates in half of the trials while showing higher expiry rates in other trials. This will be further clarified in the box plot visualization.
Based on the expiry rate distribution shown in the box plot (Figure 11f), it can be confirmed that the centralized method is not robust to changes in the map size. Other than in the 16 × 16 map, the distribution of the centralized method is the highest and most dispersed, indicating that it has a significantly worse expiry rate than the other methods. In the 16 × 16 map, the centralized method results extend from 0% to 70% and are highly dispersed.
This shows that even though the average expiry rate (Figure 10b) of the centralized method is the lowest in the 16 × 16 map, from the distribution shown in Figure 11f, this result is not guaranteed, and the expiry rate is highly inconsistent from one trial to another.
In regard to the ACO, hotspot, and blind distributions (Figure 11f), the behavior differs from one map to another. However, the hotspot method behaves consistently in all cases, preserving a similar rank in all map sizes in comparison to the other methods. In the 4 × 4 map, the ACO method produces the worst expiry rate distribution in comparison to the hotspot and blind methods; it is highly dispersed and has the maximum values.
The hotspot and blind methods, on the other hand, produce similar expiry rates in this map size. When increasing the size to 8 × 8, the blind method has the worst expiry rate distribution, with the hotspot method still ranking in the middle, while the ACO method has the best distribution. In the 16 × 16 map, however, the blind method produces the best distribution, while the ACO and hotspot methods have similar distributions.
In the 20 × 20 map, the blind method has the lowest minimum value but the most dispersed result, indicating that its robustness is not consistent. The ACO method is second in regard to the minimum value, but it is negatively skewed, and its results are more dispersed than those of the hotspot method. Although the average expiry rate of the hotspot method (Figure 10b) is slightly higher than those of the ACO and blind methods in the 20 × 20 map, the distribution shows that its maximum value is lower than those of the other methods and that its results are more compressed than theirs while also being positively skewed. This indicates that the hotspot method performance in this map is more consistent, i.e., robust, and produces results on the lower end of the scale in comparison to the other methods.
In a very large map (40 × 40), the ACO, hotspot, and blind methods have the same maximum and mean expiry rates but different minimums and dispersions. The results for the minimum expiry rate and dispersion of the hotspot method fall between the results of the ACO and blind methods. On the other hand, the blind method has the lowest expiry rate value, while ACO has the most compressed results. Therefore, from all the results of the box plot distribution (Figure 11f), it can be concluded that the hotspot method performs consistently, with an intermediate robustness in all map sizes, unlike the other methods, which offer the best results in some maps but the worst results in others.

Discussion
ED_Foraging is a novel foraging approach that is proposed to achieve self-organized allocation with dynamic deadlines. A swarm of robots searches an unknown environment to collect and transport constrained resources to a known nest location. The dynamic deadlines of the resources are modeled using the DD_Model, which is developed to predict the state of each resource while the robots search blindly and have no prior knowledge about the resources' locations and states. The proposed hotspot DTA method was able to prioritize critical regions. The intensity of the hotspot regions was adaptively updated, and thus, the allocation was dynamic and depended on the current state of the environment and the DD_Model.
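The idea that allocation follows the current hotspot intensities can be illustrated with a small sketch. This is not the hotspot DTA procedure itself, which is decentralized and driven by the DD_Model; it only assumes, for illustration, that each hotspot has a numeric intensity and that robots are split in proportion to it. The function name `allocate_robots`, the hotspot labels, and the intensity values are all hypothetical.

```python
def allocate_robots(intensity, n_robots):
    """Split n_robots across hotspots in proportion to their intensity.

    Uses the largest-remainder rule so that the shares sum to n_robots.
    Illustrative assumption only; not the paper's allocation rule.
    """
    total = sum(intensity.values())
    if total <= 0:
        raise ValueError("at least one hotspot needs a positive intensity")
    # Exact (fractional) share of robots for every hotspot.
    exact = {h: n_robots * w / total for h, w in intensity.items()}
    shares = {h: int(x) for h, x in exact.items()}
    leftover = n_robots - sum(shares.values())
    # Hand the remaining robots to the hotspots with the largest fractions.
    by_remainder = sorted(exact, key=lambda h: exact[h] - shares[h], reverse=True)
    for h in by_remainder[:leftover]:
        shares[h] += 1
    return shares


# Hypothetical intensities for three hotspot regions.
intensity = {"A": 5.0, "B": 3.0, "C": 2.0}
first = allocate_robots(intensity, 10)

# After new observations, region C becomes more critical.
intensity["C"] = 12.0
second = allocate_robots(intensity, 10)

print(first)
print(second)
```

Re-running the allocation after an intensity update shifts robots toward the region that became more critical, which is the sense in which the allocation is dynamic.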
When it comes to the computational complexity of the ED_Foraging approach, two main factors affect the performance: the number of resources N_g and the number of hotspots H. The navigation continues as long as there is a resource not collected; i.e., O(N). During this navigation, the approach is to search the hotspots, update the model, and update the hotspots density; i. This result was concluded because the maximum number of hotspots can become equal to the number of resources but cannot exceed it. Thus, the number of resources that need to be collected affects the performance of the approach. This means that applying the algorithm to real or simulated data has no effect on the complexity; rather, the complexity is determined by the number of targeted resources. Hence, the method can be as suitable for real robots as it was for the simulated ones.
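One way to read this argument is as a worst-case bound. The per-iteration cost is not stated explicitly in the text, so the following assumes that each navigation iteration touches every hotspot once, i.e., costs O(H); this is an assumption made for illustration, with N denoting the number of resources:

```latex
% Assumption: N iterations (one per uncollected resource), each costing O(H).
T(N, H) = \underbrace{O(N)}_{\text{navigation}} \cdot \underbrace{O(H)}_{\text{hotspot search and update}}
        = O(N \cdot H),
\qquad
H \le N \;\Longrightarrow\; T(N, H) = O(N^{2}).
```

Under this reading, the bound depends only on the number of resources, which is consistent with the remark that the number of hotspots can equal but never exceed the number of resources.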
When it comes to the experimental results, starting with the flexibility of ED_Foraging, it was confirmed that hotspot DTA produced more consistent results. It was significantly better than the rest of the algorithms with a mid-sized map and similar to the best one in other map sizes. The blind and ACO DTA methods had inconsistent expiry rates, since their results dispersed and became worse as the map size increased. Moreover, the ACO method results became negatively skewed with a very large map setup. The hotspot DTA method, however, obtained results that were either better than those of the other methods (in very small-to-small setups) or similar to the best results (for larger setup sizes). This indicates that it adapts better to changes in the environment.
In regard to scalability, the foraging rate was affected more by the number of collection zones than by the DTA method. This is expected because increasing the number of robots while retaining one collection zone naturally increases the number of inter-robot collisions and thus reduces the foraging rate. Nevertheless, this was not the case when comparing the expiry rates. The centralized and ACO DTA methods were highly inconsistent, as they reduced the expiry rate with a small number of robots while increasing it significantly with a large number of robots. They were only suitable in the case where the number of robots was proportional to the map size (an intermediate number of robots and a medium map size). The hotspot and blind DTA methods, on the other hand, were more stable and performed more consistently than the other two methods. In fact, the hotspot DTA method was significantly better than the rest of the algorithms with a large number of robots and similar to the best in other sizes.
From the robustness experiments, it was found that the hotspot DTA result was usually in the middle compared to the rest of the methods. Meanwhile, centralized DTA was the worst in all cases, while the ACO and blind results were interwoven, such that one was the best in certain cases but the worst in other cases. Hence, the hotspot method tries to maintain a similar performance regardless of the map size, even when the resources are reduced. Its results were consistent and similar to the best one in each case.
Ultimately, although the proposed approach showed a foraging rate similar to those in the literature, its significance became clearer when comparing the detailed expiry rates. From the results discussed above, it can be concluded that the merits of the proposed approach reside in producing flexible, scalable, and robust results when comparing the expiry rate. Unlike the other methods, which might excel in one setup but not in the rest, hotspot DTA showed a consistent performance and produced better results in each case, depending on the setup. On the other hand, the ACO method did not perform well with small maps or when there was a large number of robots, the blind method did not perform well with large maps, and the centralized method did not perform well in most cases, even when comparing the foraging rate. This conclusion indicates that the proposed approach can be applied to different real-life applications in more than one scenario, as will be discussed next.

Real-Life Applications
The model proposed to predict the states of resources and their deadlines was adopted from epidemiological modeling and, thus, can be mapped to different types of real-life problems in which any epidemiological mathematical model can be adopted. Foraging with online deadlines can also be scaled to a large spectrum of real-life problems, as long as the problem can be mapped to biological foraging behavior. Khaluf [18] discussed different possible applications of this, such as cleaning pollution that needs to be collected at a specific time, operating factory-recycling systems, or constructing data mules that are used for collecting wireless sensor nodes' stored data. All these applications can also be undertaken using ED_Foraging.
One example of a real-life application in which the proposed approach can be used is a COVID-19 search-and-rescue mission. Such a mission involves searching for infected individuals and predicting possible exposure around them in order to collect the victims before they die. Multiple robots can be used to collaboratively accomplish the task, which consists of moving objects from one place to another. Collected objects (resources) represent individuals (or patients), and the collection zones represent clinics. In general, the task is to search for patients, and when there is a suspicion of COVID-19 infection (captured by temperature sensors, for example), the individuals are moved to checkup points to be tested for infection. When a patient's result is positive, the patient is moved to a general hospital, represented by the central collection zone, to receive treatment. Finally, when it comes to the deadlines, it is important to move infected people before they die. Hence, the infection status can be mapped to the time constraint model, which can be updated online using the proposed DD_Model. The ultimate objective is to move as many patients as possible while considering the time constraint to give priority to individuals that are more critical. Hence, the efficiency of the model concerns not only the number of resources it collects (foraging rate) but also the death probability (expiry rate). The foraging and expiry rates are therefore important metrics that give an intuition about how well the system performs and to what extent the task is accomplished.
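The two metrics can be sketched for a single trial as follows. The exact definitions are given elsewhere in the paper; this sketch assumes, for illustration only, that both rates are percentages of the total number of resources in the trial. The names `TrialOutcome`, `foraging_rate`, and `expiry_rate` and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrialOutcome:
    total: int      # resources (patients) present in the trial
    collected: int  # resources delivered to a collection zone
    expired: int    # resources whose deadline passed before delivery


def foraging_rate(trial):
    """Assumed definition: percentage of resources that were collected."""
    if trial.total <= 0:
        raise ValueError("trial must contain at least one resource")
    return 100.0 * trial.collected / trial.total


def expiry_rate(trial):
    """Assumed definition: percentage of resources that expired."""
    if trial.total <= 0:
        raise ValueError("trial must contain at least one resource")
    return 100.0 * trial.expired / trial.total


# Hypothetical counts, NOT taken from the paper's experiments.
trial = TrialOutcome(total=50, collected=30, expired=15)
print(foraging_rate(trial), expiry_rate(trial))
```

Reporting both values side by side reflects the point made above: a method can collect many resources and still let many critical ones expire.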
From all of the above, it can be concluded that any search-and-rescue mission with online deadlines can also be carried out using the proposed method. All that is required is to decide on the sensors that will be used to identify the resources and update the DD_Model parameters to model the task requirements. The ultimate objective is to prioritize critical individuals rather than to collect a large number of resources.

Conclusions
In the ED_Foraging approach, a swarm of robots searches an unknown environment to collect and transport resources with dynamic deadlines to a known nest location. The main contribution of this approach is to address the modeling and DTA problems highlighted in the literature on swarm foraging with dynamic deadlines. The dynamic deadlines of resources and their states were modeled using the DD_Model, which was developed to predict the state of each resource, although the robots searched blindly and had no prior knowledge about the resources' locations and states. The robots collected resource information and updated the DD_Model when reporting to the closest zone. Then, this shared information was used by nearby robots to continuously update their models and, hence, give priority to denser and more intense regions. Thus, it was possible to deal with dynamic and online deadlines while searching.
In regard to DTA, the proposed hotspot DTA method was able to prioritize critical regions. The intensities of the hotspot regions were continuously updated, and thus, the allocation was dynamic, depending on the current state of the environment and the DD_Model. The main objective was to reduce the number of expired deadlines by observing the environment and updating its density and intensity. The experiments made it possible to test the proposed approach with different DTA methods, and it was found that using multiple collection zones was more suitable for the proposed approach in terms of both the foraging and expiry rates. Although communication was limited, the use of the DD_Model and hotspot intensity gave higher priority to infected resources; the results showed that hotspot DTA performed better when comparing different map setups. In regard to scalability and robustness, it was found that the proposed method produced consistent results, as it tried to maintain a similar performance across changing environments. Thus, unlike the other methods in the literature, it produces flexible, scalable, and robust results.
One important observation that needs to be addressed in the future is that the expiry rate can be relatively high in the experiments. This is because the search is blind and the communication among robots is limited. Thus, a future research direction will focus on the method of selecting the model parameters adaptively, in which the parameters of the DD_Model are automatically updated based on a real environment setup. In addition, ED_Foraging has many potential applications due to its inherent flexibility, scalability, and robustness. Therefore, future work will also involve modeling real-life problems, such as rescuing COVID-19 patients or operating a recycling factory, and testing them with real swarms of robots.