Adaptive Action Chunking for Robotic Imitation Learning
Abstract
1. Introduction
- We propose a new paradigm of adaptive action chunking. To our knowledge, this work is the first to unify long-horizon action generation and dynamic planning-horizon selection within an end-to-end imitation learning framework, addressing the limitations of fixed-chunk strategies.
- We design an efficient dual-branch network architecture. A shared encoder feeds two parallel prediction heads, which enables joint optimization of action prediction and chunk length prediction. Ablation studies confirm the contribution of the adaptive module.
- We provide systematic empirical validation. Quantitative and qualitative results on two complementary bimanual manipulation tasks show that the method outperforms fixed-chunk baselines and that it derives distinct strategies depending on the uncertainty structure of each task.
2. Related Works
2.1. Imitation Learning Frameworks
2.2. Action Chunking and Sequence Modeling
2.3. Adaptive and Hierarchical Decision-Making Methods
3. Methods
3.1. Problem Formulation
3.2. Algorithm Overview
3.3. Network Architecture
3.3.1. Shared Encoder Design
3.3.2. Action Prediction Head Design
- A linear layer that reduces the feature dimension from 768 to 512.
- A Rectified Linear Unit (ReLU) activation function that is applied element-wise to introduce non-linearity.
- A second linear layer that maps the 512-dimensional feature to a 256-dimensional space.
- Another ReLU activation function.
- A final linear output layer that projects the 256-dimensional vector to the target output dimension.
- The pre-defined maximum allowed chunk length.
- The dimensionality A of the robot’s action space. For our bimanual system, A = 14: each arm contributes a 6D end-effector pose increment and a 1D gripper open/close command.
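The layer stack listed above can be sketched as a small NumPy forward pass. This is only an illustration under stated assumptions: the maximum chunk length `K_MAX = 30`, the random He-style initialisation, and the reshape of the output into a (chunk, action) grid are not taken from the paper; only the 768, 512, 256 layer widths and A = 14 come from the text.

```python
import numpy as np

FEATURE_DIM = 768   # shared-encoder feature size (from the text)
ACTION_DIM = 14     # A = 14 for the bimanual system (from the text)
K_MAX = 30          # ASSUMED maximum chunk length, for illustration only


def init_action_head(rng, k_max=K_MAX, action_dim=ACTION_DIM):
    """Random weights for a 768 -> 512 -> 256 -> k_max * action_dim MLP."""
    sizes = [FEATURE_DIM, 512, 256, k_max * action_dim]
    return [
        (rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def action_head_forward(params, feature, k_max=K_MAX, action_dim=ACTION_DIM):
    """Map a (batch, 768) feature to a (batch, k_max, action_dim) action sequence."""
    x = feature
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:       # ReLU after every layer except the output
            x = np.maximum(x, 0.0)
    # ASSUMPTION: the flat output is interpreted as k_max actions of size action_dim.
    return x.reshape(-1, k_max, action_dim)


rng = np.random.default_rng(0)
params = init_action_head(rng)
actions = action_head_forward(params, rng.standard_normal((2, FEATURE_DIM)))
print(actions.shape)  # (2, 30, 14)
```

Predicting the full-length sequence in one pass is what lets a separate head decide later how many of those actions to execute.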
3.3.3. Chunk Size Prediction Head Design
- A linear layer that maps the input feature dimension from 768 to 256.
- A Rectified Linear Unit (ReLU) activation function that is applied element-wise.
- A final linear output layer that projects the 256-dimensional feature to a logits vector with one entry per discrete chunk length value in the predefined operational range.
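A minimal sketch of this classification head follows, again in NumPy. The range bounds `K_MIN = 5` and `K_MAX = 30` are hypothetical, and the sketch assumes that every integer length in the range is its own class; the paper's actual discretisation is not recoverable from the text.

```python
import numpy as np

FEATURE_DIM = 768      # shared-encoder feature size (from the text)
K_MIN, K_MAX = 5, 30   # HYPOTHETICAL operational range, for illustration only
NUM_LENGTHS = K_MAX - K_MIN + 1  # ASSUMES one class per integer length


def init_chunk_head(rng, num_lengths=NUM_LENGTHS):
    """Random weights for a 768 -> 256 -> num_lengths MLP."""
    sizes = [FEATURE_DIM, 256, num_lengths]
    return [
        (rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def chunk_head_forward(params, feature):
    """Return raw logits over the candidate chunk lengths."""
    (w1, b1), (w2, b2) = params
    hidden = np.maximum(feature @ w1 + b1, 0.0)  # linear layer followed by ReLU
    return hidden @ w2 + b2                       # output layer, no activation


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
chunk_params = init_chunk_head(rng)
logits = chunk_head_forward(chunk_params, rng.standard_normal((2, FEATURE_DIM)))
probs = softmax(logits)
print(logits.shape)  # (2, 26)
```

Treating chunk length as a classification target keeps the head lightweight: it is shallower than the action head and adds one small matrix multiply per decision.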
3.3.4. Gating and Selection Mechanism
- The full action sequence from the Action Prediction Head.
- The dynamically predicted optimal chunk length from the Chunk Size Prediction Head, which lies within the predefined operational range.
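The two inputs above can be combined with a simple slice. The sketch below assumes greedy (argmax) decoding of the chunk length and assumes that logit index `i` corresponds to length `k_min + i`; both are illustrative choices, and the function name `select_chunk` is hypothetical.

```python
import numpy as np


def select_chunk(action_seq, chunk_logits, k_min):
    """Keep only the leading actions indicated by the predicted chunk length.

    action_seq:   (k_max, action_dim) full sequence from the action head
    chunk_logits: (num_lengths,) logits from the chunk size head
    k_min:        smallest allowed chunk length (ASSUMED index offset)
    """
    k = k_min + int(np.argmax(chunk_logits))   # ASSUMPTION: greedy decoding
    k = min(k, action_seq.shape[0])            # never exceed the predicted sequence
    return action_seq[:k], k


# Toy inputs: 30 candidate actions of dimension 14, and logits peaking at index 7.
action_seq = np.arange(30 * 14, dtype=float).reshape(30, 14)
chunk_logits = np.zeros(26)
chunk_logits[7] = 5.0

selected, k = select_chunk(action_seq, chunk_logits, k_min=5)
print(k, selected.shape)  # 12 (12, 14)
```

Because the selection is a plain prefix slice, the actions that are not executed are simply discarded and recomputed at the next decision point.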
3.4. Training Objective and Details
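| Algorithm 1: Adaptive Action Chunking Policy |
|---|
| Input: trained policy parameters; allowable chunk set; maximum number of steps. |
| Initialize: obtain the initial observation and reset the step counter. |
| While the task is not successful and the step counter is below the maximum number of steps do |
| Encode: encode the current multi-view observation into a shared feature. |
| Predict in Parallel: process the shared feature with two heads; the Action Prediction Head outputs the full action sequence and the Chunk Size Prediction Head outputs the chunk length. |
| Action Selection: slice the leading actions, as many as the predicted chunk length, from the full action sequence. |
| Open-loop Execution: execute the selected actions on the robot without replanning. |
| Update: advance the step counter by the number of executed actions and obtain a new observation after execution. |
| End While |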
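The control loop of Algorithm 1 can be sketched in plain Python with a toy one-dimensional environment. Everything concrete here is an assumption made for illustration: the function `run_adaptive_chunking`, the stub policy that picks long chunks far from the goal and short chunks near it, and the environment dynamics are hypothetical stand-ins for the trained network and the robot.

```python
def run_adaptive_chunking(policy, step, is_success, initial_obs, max_steps):
    """Run a policy that returns (action_sequence, chunk_length) per decision.

    The policy sees an observation only at decision points; the selected
    actions in between are executed open-loop, without replanning.
    """
    obs, t = initial_obs, 0
    chunk_history = []
    while not is_success(obs) and t < max_steps:
        action_seq, k = policy(obs)                 # encode + predict in parallel
        k = max(1, min(k, len(action_seq), max_steps - t))
        for action in action_seq[:k]:               # open-loop execution
            obs = step(obs, action)
        t += k                                      # advance by executed actions
        chunk_history.append(k)
    return obs, t, chunk_history


GOAL = 10.0


def toy_policy(obs):
    """HYPOTHETICAL stub: long chunks far from the goal, short chunks near it."""
    k = 4 if GOAL - obs > 4 else 1
    return [1.0] * 8, k                             # 8 unit moves, use the first k


def toy_step(obs, action):
    return obs + action


def toy_success(obs):
    return obs >= GOAL


final_obs, steps, history = run_adaptive_chunking(
    toy_policy, toy_step, toy_success, initial_obs=0.0, max_steps=50
)
print(final_obs, steps, history)  # 10.0 10 [4, 4, 1, 1]
```

The printed history shows the intended behaviour in miniature: coarse open-loop execution while the goal is far away, then frequent replanning once precision matters.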
4. Experiments
4.1. Experimental Setup
4.1.1. Hardware Platform
4.1.2. Baseline Methods
4.1.3. Evaluation Metrics
4.1.4. Training Details
4.1.5. Generalization Validation Scheme
4.2. Experiment 1: Bimanual Transport-And-Place Task
4.2.1. Task Description
4.2.2. Quantitative Results and Analysis
4.2.3. Quantitative Results and Comparison
4.3. Experiment 2: Bimanual Alternating Flip-And-Handover Task
4.3.1. Task Description
4.3.2. Quantitative Results and Analysis
4.3.3. Quantitative Results and Comparison
4.4. Analysis and Discussion
4.4.1. Summary of Core Findings
4.4.2. Multi-Modality of the Adaptive Strategy
4.4.3. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wake, N.; Kanehira, A.; Sasabuchi, K.; Takamatsu, J.; Ikeuchi, K. GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration. IEEE Robot. Autom. Lett. 2024, 9, 10567–10574.
- Chi, C.; Xu, Z.; Feng, S.; Cousineau, E.; Du, Y.; Burchfiel, B.; Tedrake, R.; Song, S. Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. Int. J. Rob. Res. 2025, 44, 1684–1704.
- Guo, Y.; Liu, Y.; Oerlemans, A.; Lao, S.; Wu, S.; Lew, M.S. Deep Learning for Visual Understanding: A Review. Neurocomputing 2016, 187, 27–48.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; IEEE: New York, NY, USA, 2017; pp. 618–626.
- Liu, X.; Wang, Q.; Hu, Y.; Tang, X.; Zhang, S.; Bai, S.; Bai, X. End-to-End Temporal Action Detection with Transformer. IEEE Trans. Image Process. 2022, 31, 5427–5441.
- Kang, J.H.; Joshi, S.; Huang, R.; Gupta, S.K. Robotic Compliant Object Prying Using Diffusion Policy Guided by Vision and Force Observations. IEEE Robot. Autom. Lett. 2025, 20, 5505–5512.
- Zhang, N.; Yan, J.; Hu, C.; Sun, Q.; Yang, L.; Gao, D.W.; Guerrero, J.M.; Li, Y. Price-Matching-Based Regional Energy Market with Hierarchical Reinforcement Learning Algorithm. IEEE Trans. Ind. Inform. 2024, 20, 11103–11114.
- Hu, J.; Weng, P.; Ban, Y. State-Novelty Guided Action Persistence in Deep Reinforcement Learning. Mach. Learn. 2025, 114, 45.
- Pang, T.; Suh, H.J.T.; Yang, L.; Tedrake, R. Global Planning for Contact-Rich Manipulation via Local Smoothing of Quasi-Dynamic Contact Models. IEEE Trans. Robot. 2023, 39, 4691–4711.
- Pérez-Dattari, R.; Kober, J. Stable Motion Primitives via Imitation and Contrastive Learning. IEEE Trans. Robot. 2023, 39, 3909–3928.
- Oh, J.-H.; Espinoza, I.; Jung, D.; Kim, T.-S. Bimanual Long-Horizon Manipulation Via Temporal-Context Transformer RL. IEEE Robot. Autom. Lett. 2024, 9, 10898–10905.
- Jiang, R.; Cheng, X.; Sang, H.; Wang, Z.; Zhou, Y.; He, B. GTHSL: A Goal-Task-Driven Hierarchical Sharing Learning Method to Learn Long-Horizon Tasks Autonomously. IEEE Trans. Ind. Electron. 2024, 72, 3994–4005.
- Dosovitskiy, A. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the 9th International Conference on Learning Representations, Vienna, Austria, 3–7 May 2021; 2020, pp. 1–22.
- Ingelhag, N.; Munkeby, J.; van Haastregt, J.; Varava, A.; Welle, M.C.; Kragic, D. A Robotic Skill Learning System Built upon Diffusion Policies and Foundation Models. In Proceedings of the 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (ROMAN), Pasadena, CA, USA, 26–30 August 2024; IEEE: New York, NY, USA, 2024; pp. 748–754.
- Nazeer, M.S.; Laschi, C.; Falotico, E. Soft Dagger: Sample-Efficient Imitation Learning for Control of Soft Robots. Sensors 2023, 23, 8278.
- Satish, V.; Mahler, J.; Goldberg, K. On-Policy Dataset Synthesis for Learning Robot Grasping Policies Using Fully Convolutional Deep Networks. IEEE Robot. Autom. Lett. 2019, 4, 1357–1364.
- Kim, H.; Ohmura, Y.; Kuniyoshi, Y. Goal-Conditioned Dual-Action Imitation Learning for Dexterous Dual-Arm Robot Manipulation. IEEE Trans. Robot. 2024, 40, 2287–2305.
- Zhao, M.; Shimosaka, M. Stable Inverse Reinforcement Learning via Leveraged Guided Motion Planner for Driving Behavior Prediction. IEEE Access 2025, 13, 87313–87326.
- Hu, J.; Wang, F.; Li, X.; Qin, Y.; Guo, F.; Jiang, M. Trajectory Tracking Control for Robotic Manipulator Based on Soft Actor–Critic and Generative Adversarial Imitation Learning. Biomimetics 2024, 9, 779.
- Eppe, M.; Gumbsch, C.; Kerzel, M.; Nguyen, P.D.H.; Butz, M.V.; Wermter, S. Intelligent Problem-Solving as Integrated Hierarchical Reinforcement Learning. Nat. Mach. Intell. 2022, 4, 11–20.
- Yang, Z.; Wang, L.; Cao, Z.; Zhang, Z.; Xu, Z. A Global Path-Planning Algorithm Based on Critical Point Diffusion Binary Tree for a Planar Mobile Robot. Bull. Pol. Acad. Sci. Tech. Sci. 2024, 72, 148834–148844.
- Chen, W.; Wan, H.; Luan, X.; Liu, F. Self-triggered Control for Linear Systems Based on Hierarchical Reinforcement Learning. Int. J. Robust. Nonlinear Control 2024, 37, 9112–9129.
- Chen, L.; Wang, Y.; Miao, Z.; Mo, Y.; Feng, M.; Zhou, Z.; Wang, H. Transformer-Based Imitative Reinforcement Learning for Multirobot Path Planning. IEEE Trans. Ind. Inform. 2023, 19, 10233–10243.
- Zhang, X.; Liu, Y.; Chang, H.; Schramm, L.; Boularias, A. Autoregressive Action Sequence Learning for Robotic Manipulation. IEEE Robot. Autom. Lett. 2025, 10, 4898–4905.
- Song, S.; Zeng, A.; Lee, J.; Funkhouser, T. Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations. IEEE Robot. Autom. Lett. 2020, 5, 4978–4985.
- Chen, H.; Kim, M.-C.; Ko, Y.; Kim, C.-S. Compensated Motion and Position Estimation of a Cable-Driven Parallel Robot Based on Deep Reinforcement Learning. Int. J. Control Autom. Syst. 2023, 21, 3507–3518.
- Zhang, H.; Zhang, X.; Feng, Z.; Xiao, X. Heterogeneous Multi-Robot Cooperation with Asynchronous Multi-Agent Reinforcement Learning. IEEE Robot. Autom. Lett. 2023, 9, 159–166.
- Wu, J.; Liu, X.; Chen, S. Hyperparameter Optimization through Context-Based Meta-Reinforcement Learning with Task-Aware Representation. Knowl.-Based Syst. 2023, 260, 110160–110170.
- Farzan, S.; Azimi, V.; Hu, A.-P.; Rogers, J. Adaptive Control of Wire-Borne Underactuated Brachiating Robots Using Control Lyapunov and Barrier Functions. IEEE Trans. Control Syst. Technol. 2022, 30, 2598–2614.
- Tang, S.Y.; Irissappane, A.A.; Oliehoek, F.A.; Zhang, J. Teacher-Apprentices RL (TARL): Leveraging Complex Policy Distribution through Generative Adversarial Hypernetwork in Reinforcement Learning. Auton. Agent. Multi. Agent. Syst. 2023, 37, 25–49.
- Liu, X.; Wu, J.; Chen, S. Efficient Hyperparameters Optimization through Model-Based Reinforcement Learning with Experience Exploiting and Meta-Learning. Soft Comput. 2023, 27, 8661–8678.
- Huang, Y.; Li, W.; Zhang, X.; Li, J.; Li, Y.; Sun, Y.; Chiu, P.W.Y.; Li, Z. 4-DOF Visual Servoing of a Robotic Flexible Endoscope with a Predefined-Time Convergent and Noise-Immune Adaptive Neural Network. IEEE/ASME Trans. Mechatron. 2023, 29, 576–587.
- Cui, Z.; Huang, Y.; Li, W.; Chiu, P.W.Y.; Li, Z. Noise-Resistant Adaptive Gain Recurrent Neural Network for Visual Tracking of Redundant Flexible Endoscope Robot with Time-Varying State Variable Constraints. IEEE Trans. Ind. Electron. 2023, 71, 2694–2704.
- Li, J.; Huang, Y.; Zhang, X.; Xie, K.; Xian, Y.; Luo, X.; Chiu, P.W.Y.; Li, Z. An Autonomous Surgical Instrument Tracking Framework with a Binocular Camera for a Robotic Flexible Laparoscope. IEEE Robot. Autom. Lett. 2023, 8, 4291–4298.






| Approach | Core Framework | Chunk Length Type | Bimanual Manipulation Support |
|---|---|---|---|
| ACT [5] | Action Chunking Transformer | Fixed (predefined) | Not supported |
| Diffusion-based ACT [21] | Action Chunking + Diffusion Model | Fixed (predefined) | Not supported |
| Hierarchical Skill Learning [27] | Hierarchical Planning + Skill Library | Fixed (skill-specific) | Supported |
| Meta-learning-based Adaptation [31] | Meta-learning + Parameter Adaptation | None (no action chunking) | Not supported |
| Autoregressive Action Learning [24] | Autoregressive Sequence Prediction | Fixed (predefined) | Not supported |
| Proposed Method (Ours) | Adaptive Action Chunking + Dual-branch Network | Dynamic (context-aware) | Supported |
| Method | Success Rate (%) | Average Completion Time (s) |
|---|---|---|
| | 35 | 38.5 ± 4.2 |
| | 70 | 34.8 ± 3.1 |
| | 50 | 37.2 ± 3.8 |
| Ours-Frozen () | 80 | 35.5 ± 2.9 |
| Ours (Adaptive) | 100 | 32.2 ± 2.3 |
| Method | Success Rate (%) | Average Completion Time (s) |
|---|---|---|
| | 15 | 48.5 ± 7.2 |
| | 25 | 48.1 ± 9.5 |
| | 20 | 51.5 ± 10.3 |
| Ours-Frozen () | 40 | 45.5 ± 7.9 |
| Ours (Adaptive) | 90 | 50.6 ± 6.2 |
| Test Scenario | Proposed Method (Success Rate) | Fixed Chunk (K = 10, Success Rate) | Fixed Chunk (K = 30, Success Rate) |
|---|---|---|---|
| Normal Scenario (No Interference) | 100% | 35% | 70% |
| Object Position/Orientation Perturbation | 100% | 35% | 60% |
| Visual Noise Interference | 95% | 30% | 50% |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Wen, Q.; Zhu, H.; Zhang, Y.; Xia, L.; Gao, B.; Li, Z. Adaptive Action Chunking for Robotic Imitation Learning. Biomimetics 2026, 11, 316. https://doi.org/10.3390/biomimetics11050316

