Abstract
Cloud-native microservice architectures offer scalability and resilience but introduce complex interdependencies and new attack surfaces, making them vulnerable to resource-exhaustion Distributed Denial-of-Service (DDoS) attacks. These attacks propagate along service call chains, closely mimic legitimate traffic, and evade traditional detection and mitigation techniques, resulting in cascading bottlenecks and degraded Quality of Service (QoS). Existing Moving Target Defense (MTD) approaches lack adaptive, cost-aware policy guidance and are often ineffective against spatiotemporally adaptive adversaries. To address these challenges, this paper proposes ScaleShield, an adaptive MTD framework powered by Deep Reinforcement Learning (DRL) that learns coordinated, attack-aware defense policies for microservices. ScaleShield formulates defense as a Markov Decision Process (MDP) over multi-dimensional discrete actions, leveraging a Multi-Dimensional Double Deep Q-Network (MD3QN) to optimize service availability and minimize operational overhead. Experimental results demonstrate that ScaleShield achieves near 100% defense success rates and reduces compromised nodes to zero within approximately 5 steps, significantly outperforming state-of-the-art baselines. It lowers service latency by up to 72% under dynamic attacks while maintaining over 94% resource efficiency, providing robust and cost-effective protection against resource-exhaustion DDoS attacks in cloud-native environments.