Article

AM-DIMPO: Action-Masked Diffusion-Implicit Policy Optimization for On-Ramp Merging Under Dense Traffic

1 Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
2 School of Robotics, Beijing Union University, Beijing 100101, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(8), 3687; https://doi.org/10.3390/app16083687
Submission received: 10 February 2026 / Revised: 2 April 2026 / Accepted: 7 April 2026 / Published: 9 April 2026

Abstract

Highway ramp merging requires autonomous vehicles to make safe and efficient decisions in dense mixed traffic, where strong vehicle interactions and rapidly changing acceptable gaps make the task particularly challenging. Existing reinforcement learning methods are often unimodal and overly conservative, while diffusion-based policies, despite their ability to generate multimodal actions, usually suffer from high inference latency and safety risks caused by unconstrained sampling. To address these issues, this paper proposes AM-DIMPO, an action-mask-guided safe diffusion-implicit policy optimization framework for ramp-merging tasks. The proposed method combines DDIM-based implicit sampling with a state-dependent continuous action mask to improve multimodal action generation efficiency while enhancing action feasibility. In addition, the mask correction signal is incorporated into policy learning to encourage the policy to generate actions closer to the safe feasible region. Experiments are conducted in a Gym-based ramp-merging simulator under both light-traffic and dense-traffic scenarios, where the proposed method is compared with classical reinforcement learning baselines, diffusion reinforcement learning baselines, and a safety-aware PPO baseline. The results show that, in dense traffic, AM-DIMPO achieves a merging success rate of 97.3%, an average speed of 16.27 m/s, and an inference latency of 68 ms; in light traffic, the success rate reaches 98.1%. Moreover, the proposed method maintains robust performance under the tested noisy-observation and reduced-visibility settings. Overall, AM-DIMPO achieves a favorable balance among empirical safety, traffic efficiency, robustness, and real-time inference performance in dense highway ramp-merging tasks.
Keywords: reinforcement learning; diffusion-implicit policy; autonomous driving; on-ramp merging; action mask
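To make the two ideas named in the abstract concrete, the sketch below is a minimal, illustrative example (not the authors' implementation) of (1) a DDIM-style deterministic denoising loop that maps Gaussian noise to a continuous merging action, and (2) a state-dependent continuous action mask that projects the sampled action into a feasible acceleration interval, with the residual projection used as a correction signal. All names and the simplified kinematics (feasible_accel_bounds, denoise_step, the heuristic target) are hypothetical placeholders.

```python
# Hedged sketch of action-masked diffusion-implicit sampling for ramp merging.
# This is NOT the AM-DIMPO code; it only illustrates the mechanism described
# in the abstract under simplifying assumptions.
import numpy as np

def feasible_accel_bounds(gap_m, rel_speed_mps, t_headway=1.5, a_lim=3.0):
    """State-dependent mask: bound acceleration so the ego keeps roughly a
    time-headway-sized gap to the lead vehicle (simplified kinematics)."""
    a_max = np.clip((gap_m - t_headway * max(rel_speed_mps, 0.0)) / t_headway**2,
                    -a_lim, a_lim)
    return -a_lim, a_max  # (lower, upper) feasible acceleration in m/s^2

def denoise_step(a_t, t, state, n_steps):
    """Toy stand-in for one DDIM update: deterministically pull the noisy
    action toward a state-conditioned target (a real policy would use a
    learned noise-prediction network instead of this heuristic)."""
    target = 0.1 * state["gap_m"] - 0.5 * state["rel_speed_mps"]
    alpha = (n_steps - t) / n_steps
    return (1 - alpha) * a_t + alpha * target

def sample_masked_action(state, n_steps=8, rng=np.random.default_rng(0)):
    a = rng.standard_normal()             # start from Gaussian noise
    for t in range(n_steps, 0, -1):       # few deterministic DDIM-like steps
        a = denoise_step(a, t, state, n_steps)
    lo, hi = feasible_accel_bounds(state["gap_m"], state["rel_speed_mps"])
    a_masked = float(np.clip(a, lo, hi))  # project into the feasible set
    correction = a_masked - a             # signal that could penalize the policy
    return a_masked, correction

action, corr = sample_masked_action({"gap_m": 25.0, "rel_speed_mps": 2.0})
print(f"masked action: {action:.2f} m/s^2, mask correction: {corr:.2f}")
```

In this sketch the mask acts as a post-sampling projection; the abstract's "mask correction signal" is approximated by the difference between the raw and projected actions, which a training loss could penalize to push the policy toward the feasible region.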

Share and Cite

MDPI and ACS Style

Gao, Q.; Li, J.; Huang, X.; Zhu, Y.; Du, Y. AM-DIMPO: Action-Masked Diffusion-Implicit Policy Optimization for On-Ramp Merging Under Dense Traffic. Appl. Sci. 2026, 16, 3687. https://doi.org/10.3390/app16083687



