Abstract
The rapid proliferation of connected and autonomous vehicles in the 6G era demands ultra-reliable, low-latency computation with intelligent resource coordination. Unmanned Aerial Vehicle (UAV)-assisted Mobile Edge Computing (MEC) offers a flexible and scalable way to extend coverage and improve offloading efficiency in dynamic Internet of Vehicles (IoV) environments. However, jointly optimizing task latency, user fairness, and service priority under time-varying channel conditions remains a fundamental challenge. To address this issue, this paper proposes a novel Multi-Agent Priority-based Fairness Adaptive Delayed Deep Deterministic Policy Gradient (MA-PF-AD3PG) algorithm for UAV-assisted MEC systems. An occlusion-aware dynamic deadline model is first established to capture real-time link blockage and channel fading. Based on this model, a priority–fairness coupled optimization framework is formulated to jointly minimize overall latency and balance service fairness across heterogeneous vehicular tasks. To solve this NP-hard problem efficiently, MA-PF-AD3PG integrates fairness-aware service preprocessing and an adaptive delayed update mechanism within a multi-agent deep reinforcement learning structure, enabling decentralized yet coordinated UAV decision-making. Extensive simulations demonstrate that MA-PF-AD3PG achieves superior convergence stability, 13–57% higher total rewards, up to 46% lower delay, and near-perfect fairness compared with state-of-the-art Deep Reinforcement Learning (DRL) and heuristic baselines.