Abstract
In the context of Industry 4.0 and intelligent manufacturing, conventional serial manipulators face limitations in dynamic task environments due to their fixed structural parameters and the traditional decoupling of mechanism design from motion planning. To address this issue, this study proposes SAC-SC (Soft Actor–Critic-based Structure–Control Co-Design), a reinforcement learning framework for the co-design of manipulator link lengths and motion planning policies. The approach is implemented on a custom four-degree-of-freedom PRRR manipulator with manually adjustable link lengths. A hybrid action space combines discrete configuration selection at the beginning of each episode with subsequent continuous joint-level control, guided by a multi-objective reward function that balances task accuracy, execution efficiency, and obstacle avoidance. Evaluated in both a simplified kinematic simulator and the high-fidelity MuJoCo physics engine, SAC-SC achieves a 100% task success rate in obstacle-free scenarios and 85% in cluttered environments, with a planning time of only 0.145 s per task, more than 15 times faster than the two-stage baseline. The learned policy also demonstrates zero-shot transfer between the two simulation environments. These results indicate that integrating structural parameter optimization and motion planning within a unified reinforcement learning framework enables more adaptive and efficient robotic operation in unstructured environments, offering a promising alternative to conventional decoupled design paradigms.