Abstract
In the context of Industry 4.0 and intelligent manufacturing, conventional serial manipulators face limitations in dynamic task environments due to their fixed structural parameters and the traditional decoupling of mechanism design from motion planning. To address this issue, this study proposes SAC-SC (Soft Actor–Critic-based Structure–Control Co-Design), a reinforcement learning framework for the co-design of manipulator link lengths and motion planning policies. The approach is implemented on a custom four-degree-of-freedom PRRR manipulator with manually adjustable link lengths. A hybrid action space combines discrete configuration selection at the beginning of each episode with subsequent continuous joint-level control, guided by a multi-objective reward function that balances task accuracy, execution efficiency, and obstacle avoidance. Evaluated in both a simplified kinematic simulator and the high-fidelity MuJoCo physics engine, SAC-SC achieves a 100% task success rate in obstacle-free scenarios and 85% in cluttered environments, with a planning time of only 0.145 s per task, more than 15 times faster than the two-stage baseline. The learned policy also demonstrates zero-shot transfer between the two simulation environments. These results indicate that integrating structural parameter optimization and motion planning within a unified reinforcement learning framework enables more adaptive and efficient robotic operation in unstructured environments, offering a promising alternative to conventional decoupled design paradigms.