Abstract
Multi-robot rescue path planning (MRRPP) is critical for completing post-disaster rescue tasks rapidly and effectively. Most studies focus on minimizing the length of rescue paths, the number of robots, and the rescue time, but neglect task utility, which reflects the benefit of delivering emergency supplies on time and is equally important in post-disaster rescue. In this study, we integrate multiple optimization indicators into a single rescue cost and model the problem as a variant of the vehicle routing problem (VRP) with timeliness and battery constraints. A population-based iterative greedy algorithm with Q-learning (QPIG) is proposed to solve it. First, two problem-specific heuristic schemes are designed to generate a high-quality and diverse initial population. Second, a competition-oriented destruction-reconstruction mechanism is applied to improve the global search ability of the algorithm. Third, a Q-learning-based local search strategy is developed to enhance the algorithm’s exploitation ability. Fourth, a historical information-based constructive strategy is introduced to accelerate convergence. Finally, the proposed QPIG is validated by comparing it with five efficient algorithms on 56 instances. Experimental results show that QPIG significantly outperforms the compared algorithms in terms of rescue cost and convergence speed.
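To make the interplay between destruction-reconstruction and Q-learning-guided local search more concrete, the following Python sketch shows one generic way such a loop can be organized. It is an illustrative toy under strong assumptions, not the QPIG algorithm itself: it handles a single robot and a single route, uses route length as a stand-in for rescue cost, and omits the population, the competition mechanism, the historical information-based construction, and the timeliness and battery constraints. All names (`route_length`, `destroy_and_rebuild`, `swap_move`, `reverse_move`, `q_guided_iterated_greedy`), the two-state encoding, the reward definition, and the parameter values are hypothetical choices made for this example.

```python
import math
import random


def route_length(route, pts):
    """Length of a closed tour that starts and ends at the depot (index 0)."""
    tour = [0] + route + [0]
    return sum(math.dist(pts[a], pts[b]) for a, b in zip(tour, tour[1:]))


def destroy_and_rebuild(route, pts, d, rng):
    """Remove d random customers, then reinsert each at its cheapest position."""
    removed = rng.sample(route, d)
    partial = [c for c in route if c not in removed]
    for c in removed:
        best_pos = min(
            range(len(partial) + 1),
            key=lambda i: route_length(partial[:i] + [c] + partial[i:], pts),
        )
        partial.insert(best_pos, c)
    return partial


def swap_move(route, rng):
    """Local search operator: exchange two randomly chosen customers."""
    i, j = rng.sample(range(len(route)), 2)
    new = route[:]
    new[i], new[j] = new[j], new[i]
    return new


def reverse_move(route, rng):
    """Local search operator: reverse a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(route)), 2))
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]


def q_guided_iterated_greedy(pts, iters=200, d=2, alpha=0.1, gamma=0.9,
                             eps=0.2, seed=0):
    """Iterated greedy loop in which Q-learning picks the local search operator."""
    rng = random.Random(seed)
    ops = [swap_move, reverse_move]
    # Assumed state encoding: 1 if the previous local search step improved, else 0.
    q = [[0.0] * len(ops) for _ in range(2)]
    best = list(range(1, len(pts)))
    best_cost = route_length(best, pts)
    state = 0
    for _ in range(iters):
        cand = destroy_and_rebuild(best, pts, d, rng)
        # Epsilon-greedy selection of an operator from the Q-table.
        if rng.random() < eps:
            a = rng.randrange(len(ops))
        else:
            a = max(range(len(ops)), key=lambda k: q[state][k])
        before = route_length(cand, pts)
        trial = ops[a](cand, rng)
        after = route_length(trial, pts)
        improved = after < before
        if improved:
            cand = trial
        # Assumed reward: the cost reduction achieved by the chosen operator.
        reward = before - after if improved else 0.0
        next_state = 1 if improved else 0
        # Standard tabular Q-learning update.
        q[state][a] += alpha * (reward + gamma * max(q[next_state]) - q[state][a])
        state = next_state
        cost = route_length(cand, pts)
        if cost < best_cost:  # keep a candidate only if it beats the incumbent
            best, best_cost = cand, cost
    return best, best_cost, q


if __name__ == "__main__":
    points = [(0, 0), (2, 1), (5, 0), (1, 4), (4, 3), (3, 5)]  # depot first
    route, cost, _ = q_guided_iterated_greedy(points)
    print(route, round(cost, 3))
```

In this sketch the Q-table only learns which of two neighborhood moves tends to pay off after an improving or non-improving step; a fuller treatment would need richer states, feasibility checks for the constraints, and a cost that also accounts for task utility.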