Reinforcement Learning-Enabled Control and Design of Rigid-Link Robotic Fish: A Comprehensive Review
Abstract
1. Introduction
2. Review Methodology
2.1. Research Methods
2.2. Inclusion and Exclusion Criteria
2.3. Study Selection and Screening
3. Design of Rigid-Link Fish Robots
3.1. Single-Joint Structure
3.2. Two-Joint Structure
3.3. Three-Joint Structure
3.4. Multiple-Joint Structure
4. Reinforcement Learning-Based Control in RLFRs
4.1. Q-Learning Algorithms
4.2. Deep Q-Network Algorithms
- (1) A dynamic integrated reward that encourages efficient goal-directed movement while adapting to currents and avoiding collisions.
- (2) A two-step action-selection strategy that starts with Boltzmann exploration and gradually transitions to ε-greedy action selection.
- (3) A double-level dynamic learning rate that combines meta-gradient adjustment with Adam optimization for faster and more stable training.
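Item (2) above can be illustrated with a minimal sketch. The linear blending schedule, the parameter names (`transition_steps`, `temperature`, `epsilon`), and their default values are assumptions made for illustration only; the reviewed work does not specify them here, and its actual schedule may differ.

```python
import math
import random


def boltzmann_action(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    max_q = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for action, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return action
    return len(q_values) - 1  # guard against floating-point round-off


def epsilon_greedy_action(q_values, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def select_action(q_values, step, transition_steps=1000,
                  temperature=1.0, epsilon=0.1, rng=random):
    """Hypothetical two-step schedule (an assumption, not the published one).

    The probability of using Boltzmann exploration decays linearly from 1 to 0
    over `transition_steps`; afterwards only epsilon-greedy selection is used.
    """
    p_boltzmann = max(0.0, 1.0 - step / transition_steps)
    if rng.random() < p_boltzmann:
        return boltzmann_action(q_values, temperature, rng)
    return epsilon_greedy_action(q_values, epsilon, rng)
```

Early in training, the Boltzmann branch weights exploration by the estimated action values, so clearly poor actions are sampled less often than under uniform random exploration. Late in training, the ε-greedy branch mostly exploits the learned values.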
4.3. Deep Deterministic Policy Gradient Algorithms
4.4. Reward Configuration Strategies and Design Challenges
5. Discussion
5.1. Challenges
5.1.1. Challenges in Design
5.1.2. Challenges of Reinforcement Learning
5.2. Future Directions
5.2.1. Future Directions for Physical Platforms and Designs
5.2.2. Future Directions for Control Strategies
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kopman, V.; Porfiri, M. Design, modeling, and characterization of a miniature robotic fish for research and education in biomimetics and bioinspiration. IEEE/ASME Trans. Mechatron. 2013, 18, 471–483. [Google Scholar] [CrossRef]
- Makrodimitris, M.; Aliprantis, I.; Papadopoulos, E. Design and implementation of a low cost, pump-based, depth control of a small robotic fish. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 1127–1132. [Google Scholar]
- Yu, J.; Chen, S.; Wu, Z.; Wang, W. On a miniature free-swimming robotic fish with multiple sensors. Int. J. Adv. Robot. Syst. 2016, 13, 62. [Google Scholar] [CrossRef]
- Yang, G.H.; Choi, W.; Lee, S.H.; Kim, K.S.; Lee, H.J.; Choi, H.S.; Ryuh, Y.S. Control and design of a 3 DOF fish robot ‘ICHTUS’. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Karon Beach, Thailand, 7–11 December 2011; pp. 2108–2113. [Google Scholar]
- Marchese, A.D.; Onal, C.D.; Rus, D. Autonomous soft robotic fish capable of escape maneuvers using fluidic elastomer actuators. Soft Robot. 2014, 1. [Google Scholar] [CrossRef]
- Chen, D.; Wang, B.; Xiong, Y.; Zhang, J.; Tong, R.; Meng, Y.; Yu, J. Design and analysis of a novel bionic tensegrity robotic fish with a continuum body. Biomimetics 2024, 9, 19. [Google Scholar] [CrossRef]
- Wang, J.; Tan, X. A dynamic model for tail-actuated robotic fish with drag coefficient adaptation. Mechatronics 2013, 23, 659–668. [Google Scholar] [CrossRef]
- Wang, J.; Tan, X. Averaging tail-actuated robotic fish dynamics through force and moment scaling. IEEE Trans. Robot. 2015, 31, 906–917. [Google Scholar] [CrossRef]
- Liu, J.; Hu, H. Biological inspiration: From carangiform fish to multi-joint robotic fish. J. Bionic Eng. 2010, 7, 35–48. [Google Scholar] [CrossRef]
- Chen, D.; Wu, Z.; Dong, H.; Tan, M.; Yu, J. Exploration of swimming performance for a biomimetic multi-joint robotic fish with a compliant passive joint. Bioinspiration Biomim. 2020, 16, 026007. [Google Scholar]
- Wang, J.; Wu, Z.; Dong, H.; Tan, M.; Yu, J. Development and control of underwater gliding robots: A review. IEEE/CAA J. Autom. Sin. 2022, 9, 1543–1560. [Google Scholar] [CrossRef]
- Matthews, D.G.; Zhu, R.; Wang, J.; Dong, H.; Bart-Smith, H.; Lauder, G. Role of the Caudal Peduncle in a Fish-Inspired Robotic Model: How Changing Stiffness and Angle of Attack Affects Swimming Performance. Bioinspiration Biomim. 2022, 17, 066017. [Google Scholar] [CrossRef]
- Iguchi, K.; Shimooka, T.; Tanaka, H.; Ikemoto, Y.; Shintake, J. Agile robotic fish based on direct drive of continuum body. npj Robot. 2024, 2, 7. [Google Scholar] [CrossRef]
- Youssef, S.M.; Soliman, M.; Saleh, M.A.; Elsayed, A.H.; Radwan, A.G. Design and control of soft biomimetic pangasius fish robot using fin ray effect and reinforcement learning. Sci. Rep. 2022, 12, 21861. [Google Scholar] [CrossRef] [PubMed]
- Ma, S.; Zhao, Q.; Ding, M.; Zhang, M.; Zhao, L.; Huang, C.; Zhang, J.; Liang, X.; Yuan, J.; Wang, X.; et al. A Review of Robotic Fish Based on Smart Materials. Biomimetics 2023, 8, 227. [Google Scholar] [CrossRef] [PubMed]
- Li, Y.; Xu, Y.; Wu, Z.; Ma, L.; Guo, M.; Li, Z.; Li, Y. A comprehensive review on fish-inspired robots. Int. J. Adv. Robot. Syst. 2022, 19, 172988062211037. [Google Scholar] [CrossRef]
- Yan, S.; Wu, Z.; Wang, J.; Feng, Y.; Yu, L.; Yu, J.; Tan, M. Recent advances in design, sensing, and autonomy of biomimetic robotic fish: A review. IEEE/ASME Trans. Mechatron. 2025, 30, 3517–3536. [Google Scholar] [CrossRef]
- Raj, A.; Thakur, A. Fish-Inspired Robots: Design, Sensing, Actuation, and Autonomy-A Review of Research. Bioinspiration Biomim. 2016, 11, 031001. [Google Scholar] [CrossRef]
- Yan, S.; Wu, Z.; Wang, J.; Tan, M.; Yu, J. Efficient cooperative structured control for a multijoint biomimetic robotic fish. IEEE/ASME Trans. Mechatron. 2021, 26, 2506–2516. [Google Scholar] [CrossRef]
- Lim, L.W.K. Implementation of artificial intelligence in aquaculture and fisheries: Deep learning, machine vision, big data, internet of things, robots and beyond. J. Comput. Cogn. Eng. 2023, 3, 112–118. [Google Scholar]
- Yang, X.; Zhang, S.; Liu, J.; Gao, Q.; Dong, S.; Zhou, C. Deep learning for smart fish farming: Applications, opportunities and challenges. Rev. Aquac. 2021, 13, 66–90. [Google Scholar] [CrossRef]
- Sun, B.; Li, W.; Wang, Z.; Zhu, Y.; He, Q.; Guan, X.; Dai, G.; Yuan, D.; Li, A.; Cui, W.; et al. Recent progress in modeling and control of bio-inspired fish robots. J. Mar. Sci. Eng. 2022, 10, 773. [Google Scholar] [CrossRef]
- Jung, H.; Park, S.; Joe, S.; Woo, S.; Choi, W.; Bae, W. AI-driven control strategies for biomimetic robotics: Trends, challenges, and future directions. Biomimetics 2025, 10, 460. [Google Scholar] [CrossRef] [PubMed]
- Banerjee, C.; Nguyen, K.; Fookes, C.; Raissi, M. A survey on physics informed reinforcement learning: Review and open problems. Expert Syst. Appl. 2025, 287, 128166. [Google Scholar] [CrossRef]
- Marras, S.; Porfiri, M. Fish and robots swimming together: Attraction towards the robot demands biomimetic locomotion. J. R. Soc. Interface 2012, 9, 1856–1868. [Google Scholar] [CrossRef] [PubMed]
- Fujiwara, S.; Yamaguchi, S. Development of fishlike robot that imitates carangiform and subcarangiform swimming motions. J. Aero Aqua Bio-Mech. 2017, 6, 1–8. [Google Scholar] [CrossRef]
- Clapham, R.J.; Hu, H. iSplash-II: Realizing fast carangiform swimming to outperform a real fish. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014. [Google Scholar]
- Zhang, R.; Zhou, W.; Li, M.; Li, M. Design of a double-joint robotic fish using a composite linkage. In Proceedings of the 2024 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE International Conference on Robotics, Automation and Mechatronics (RAM), Hangzhou, China, 8–11 August 2024; pp. 279–283. [Google Scholar]
- Liang, J.; Wang, T.; Wen, L. Development of a two-joint robotic fish for real-world exploration. J. Field Robot. 2011, 28, 70–79. [Google Scholar] [CrossRef]
- Kiebert, L.; Joordens, M. Autonomous robotic fish for a swarm environment. In Proceedings of the 2016 11th System of Systems Engineering Conference (SoSE), Kongsberg, Norway, 12–16 June 2016. [Google Scholar]
- Chen, Z.; Hou, P.; Ye, Z. Modeling of robotic fish propelled by a servo/IPMC hybrid tail. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018. [Google Scholar]
- Chen, Z.; Hou, P.; Ye, Z. Robotic fish propelled by a servo motor and ionic polymer-metal composite hybrid tail. J. Dyn. Syst. Meas. Control 2019, 141, 071001. [Google Scholar] [CrossRef]
- Chen, D.; Wu, Z.; Meng, Y.; Tan, M.; Yu, J. Development of a high-speed swimming robot with the capability of fish-like leaping. IEEE/ASME Trans. Mechatron. 2022, 27, 3579–3589. [Google Scholar] [CrossRef]
- Clapham, R.J.; Hu, H. iSplash-MICRO: A 50mm robotic fish generating the maximum velocity of real fish. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014. [Google Scholar]
- Ay, M.; Korkmaz, D.; Ozmen Koca, G.; Bal, C.; Akpolat, Z.H.; Bingol, M.C. Mechatronic design and manufacturing of the intelligent robotic fish for bio-inspired swimming modes. Electronics 2018, 7, 118. [Google Scholar] [CrossRef]
- Hu, Y.; Wang, L.; Zhao, W.; Wang, Q.; Zhang, L. Modular design and motion control of reconfigurable robotic fish. In Proceedings of the 2007 46th IEEE Conference on Decision and Control, New Orleans, LA, USA, 12–14 December 2007. [Google Scholar]
- Szymak, P.; Morawski, M.; Malec, M. Conception of research on bionic underwater vehicle with undulating propulsion. Solid State Phenom. 2011, 180, 160–167. [Google Scholar] [CrossRef]
- Zhou, C.; Chong, C.W.; Zhong, Y.; Low, K.H. Robust gait control for steady swimming of a carangiform fish robot. In Proceedings of the 2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Singapore, 14–17 July 2009. [Google Scholar]
- Shin, K.J. Robot fish tracking control using an optical flow object-detecting algorithm. IEEE Trans. Smart Process. Comput. 2016, 5, 375–382. [Google Scholar]
- Roy Chowdhury, A.; Prasad, B.; Vishwanathan, V.; Kumar, R.; Panda, S.K. Kinematics study and implementation of a biomimetic robotic-fish underwater vehicle based on Lighthill slender body model. In Proceedings of the 2012 IEEE/OES Autonomous Underwater Vehicles (AUV), Southampton, UK, 24–27 September 2012. [Google Scholar]
- Chowdhury, A.R.; Panda, S.K. Brain-map based carangiform swimming behaviour modeling and control in a robotic fish underwater vehicle. Int. J. Adv. Robot. Syst. 2015, 12, 52. [Google Scholar] [CrossRef]
- Ren, Q.; Xu, J.; Li, X. A data-driven motion control approach for a robotic fish. J. Bionic Eng. 2015, 12, 382–394. [Google Scholar] [CrossRef]
- Liu, J.; Dukes, I.; Hu, H. Novel mechatronics design for a robotic fish. In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, AB, Canada, 2–6 August 2005. [Google Scholar]
- Shin, K.; Musunuri, Y.R. Design of aquarium robot world using detecting fish robot position method. In Proceedings of the 2016 International Conference on Electronics, Information, and Communications (ICEIC), Da Nang, Vietnam, 27–30 January 2016. [Google Scholar]
- Zhou, Z.; Liu, J.; Pan, J.; Yu, J. Proactivity of fish and leadership of self-propelled robotic fish during interaction. Bioinspiration Biomim. 2023, 18, 036011. [Google Scholar] [CrossRef]
- Phamduy, P.; LeGrand, R.; Porfiri, M. Robotic fish: Design and characterization of an interactive iDevice-controlled robotic fish for informal science education. IEEE Robot. Autom. Mag. 2015, 22, 86–96. [Google Scholar] [CrossRef]
- Soltan, K.; O’Brien, J.; Dusek, J.; Berlinger, F.; Nagpal, R. Biomimetic actuation method for a miniature, low-cost multi-jointed robotic fish. In Proceedings of the OCEANS 2018 MTS/IEEE Charleston, Charleston, SC, USA, 22–25 October 2018. [Google Scholar]
- Zuo, W.; Fish, F.; Chen, Z. Bio-inspired design, modeling, and control of robotic fish propelled by a double-slider-crank mechanism driven tail. J. Dyn. Syst. Meas. Control 2021, 143, 121005. [Google Scholar] [CrossRef]
- Lu, B.; Zhou, C.; Wang, J.; Zhang, Z.; Tan, M. Toward swimming speed optimization of a multi-flexible robotic fish with low cost of transport. IEEE Trans. Autom. Sci. Eng. 2024, 21, 2804–2815. [Google Scholar] [CrossRef]
- Farideddin Masoomi, S.; Gutschmidt, S.; Chen, X.; Sellier, M. The kinematics and dynamics of undulatory motion of a tuna-mimetic robot. Int. J. Adv. Robot. Syst. 2015, 12, 83. [Google Scholar] [CrossRef]
- Na, K.I.; Jeong, I.B.; Han, S.; Kim, J.H. Target following with a vision sway compensation for robotic fish Fibo. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Phuket, Thailand, 7–11 December 2011. [Google Scholar]
- Yu, B.; Pavlenko, P. Research and design of three-joint bionic fish based on BCF model. Preprints 2025. [Google Scholar] [CrossRef]
- Solanki, P.B.; Dutta, S.; Behera, L. Design and 3D simulation of a Robotic Fish. In Proceedings of the Advances in Control and Optimization of Dynamic Systems (ACODS), Bangalore, India, 16–18 February 2012; pp. 1–6. [Google Scholar]
- Li, Z.; Ge, L.; Xu, W.; Du, Y. Turning characteristics of biomimetic robotic fish driven by two degrees of freedom of pectoral fins and flexible body/caudal fin. Int. J. Adv. Robot. Syst. 2018, 15, 172988141774995. [Google Scholar] [CrossRef]
- Minh-Thuan, L.; Truong-Thinh, N.; Ngoc-Phuong, N. Study of artificial fish bladder system for robot fish. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Phuket, Thailand, 7–11 December 2011. [Google Scholar]
- Omari, M.; Ghommem, M.; Romdhane, L.; Hajj, M.R. Performance analysis of bio-inspired transformable robotic fish tail. Ocean Eng. 2022, 244, 110406. [Google Scholar] [CrossRef]
- Ryuh, Y.S.; Yang, G.H.; Liu, J.; Hu, H. A school of robotic fish for mariculture monitoring in the sea coast. J. Bionic Eng. 2015, 12, 37–46. [Google Scholar] [CrossRef]
- Jia, Y.; Wang, L. Leader–follower flocking of multiple robotic fish. IEEE/ASME Trans. Mechatron. 2014, 20, 1372–1383. [Google Scholar] [CrossRef]
- Wang, Z.; Wang, L.; Wang, T.; Zhang, B. Research and experiments on electromagnetic-driven multi-joint bionic fish. Robotica 2022, 40, 720–746. [Google Scholar] [CrossRef]
- Hu, H.; Liu, J.; Dukes, I.; Francis, G. Design of 3D swim patterns for autonomous robotic fish. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006; pp. 2406–2411. [Google Scholar]
- Wang, C.; Cao, M.; Xie, G. Antiphase formation swimming for autonomous robotic fish. IFAC Proc. Vol. 2011, 44, 7830–7835. [Google Scholar] [CrossRef]
- Yu, J.; Wang, K.; Tan, M.; Zhang, J. Design and control of an embedded vision guided robotic fish with multiple control surfaces. Sci. World J. 2014, 2014, 631296. [Google Scholar] [CrossRef]
- Zhao, W.; Yu, J.; Fang, Y.; Wang, L. Development of multi-mode biomimetic robotic fish based on central pattern generator. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006; pp. 3891–3896. [Google Scholar]
- Yu, J.; Tan, M.; Wang, S.; Chen, E. Development of a biomimetic robotic fish and its control algorithm. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2004, 34, 1798–1810. [Google Scholar] [CrossRef]
- Yu, J.; Wang, C.; Xie, G. Coordination of multiple robotic fish with applications to underwater robot competition. IEEE Trans. Ind. Electron. 2015, 63, 1280–1288. [Google Scholar] [CrossRef]
- Su, Z.; Yu, J.; Tan, M.; Zhang, J. Implementing flexible and fast turning maneuvers of a multijoint robotic fish. IEEE/ASME Trans. Mechatron. 2013, 19, 329–338. [Google Scholar] [CrossRef]
- Han, J.; Fu, Z.; Zhang, Y.; Shi, L.; Kang, R.; Dai, J.S.; Song, Z. Undulatory motion of sailfish-like robot via a new single-degree-of-freedom modularized spatial mechanism. Mech. Mach. Theory 2024, 191, 105502. [Google Scholar] [CrossRef]
- Koca, G.O.; Korkmaz, D.; Bal, C.; Akpolat, Z.H.; Ay, M. Implementations of the route planning scenarios for the autonomous robotic fish with the optimized propulsion mechanism. Measurement 2016, 93, 232–242. [Google Scholar] [CrossRef]
- Korkmaz, D.; Akpolat, Z.H.; Soygüder, S.; Alli, H. Dynamic simulation model of a biomimetic robotic fish with multi-joint propulsion mechanism. Trans. Inst. Meas. Control 2015, 37, 684–695. [Google Scholar] [CrossRef]
- Yu, J.; Tan, M.; Zhang, J. Fish-inspired swimming simulation and robotic implementation. In Proceedings of the ISR 2010 (41st International Symposium on Robotics) and ROBOTIK 2010 (6th German Conference on Robotics), Munich, Germany, 7–9 June 2010; pp. 1–6. [Google Scholar]
- Shuai, P.; Li, H.; Luo, Y.; Deng, L. Reinforcement Learning Methods in Robotic Fish: Survey. In Proceedings of the 2024 43rd Chinese Control Conference (CCC), Kunming, China, 28–31 July 2024; pp. 4270–4277. [Google Scholar]
- Tong, R.; Feng, Y.; Wang, J.; Wu, Z.; Tan, M.; Yu, J. A survey on reinforcement learning methods in bionic underwater robots. Biomimetics 2023, 8, 168. [Google Scholar] [CrossRef] [PubMed]
- Cui, X.; Sun, B.; Zhu, Y.; Yang, N.; Zhang, H.; Cui, W.; Fan, D.; Wang, J. Enhancing efficiency and propulsion in bio-mimetic robotic fish through end-to-end deep reinforcement learning. Phys. Fluids 2024, 36, 031910. [Google Scholar] [CrossRef]
- Liu, J.; Hu, H.; Gu, D. A hybrid control architecture for autonomous robotic fish. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006; pp. 312–317. [Google Scholar]
- Chen, X.; Tian, X.; Chen, J.; Hu, Q.; Wen, B.; Liu, X.; Chen, Y.; Tang, S. A reinforcement learning-based control approach with lightweight feature for robotic fish heading control in complex environments: Real-world training. Ocean Eng. 2025, 335, 121667. [Google Scholar] [CrossRef]
- Chen, X.; Wen, B.; Tian, X.; Sun, S.; Wang, P.; Li, X. Reinforcement learning based CPG-controlled method with high adaptability and robustness: An experimental study on a robotic fishtail. Ocean Eng. 2023, 289, 116259. [Google Scholar] [CrossRef]
- Lin, L.; Xie, H.; Zhang, D.; Shen, L. Supervised neural Q-learning based motion control for bionic underwater robots. J. Bionic Eng. 2010, 7, S177–S184. [Google Scholar] [CrossRef]
- Lin, L.; Xie, H.; Shen, L. Application of reinforcement learning to autonomous heading control for bionic underwater robots. In Proceedings of the 2009 IEEE International Conference on Robotics and Biomimetics (ROBIO), Kuala Lumpur, Malaysia, 9–12 December 2009; pp. 2486–2490. [Google Scholar]
- Liu, J.; Zhao, J.; Liu, Z.; Liu, Y.; Ren, Y.; An, D.; Wei, Y. Development and application of the coverage path planning based on a biomimetic robotic fish. J. Field Robot. 2025, 42, 2512–2531. [Google Scholar] [CrossRef]
- Ma, L.; Yue, Z.; Zhang, R. Path tracking control of hybrid-driven robotic fish based on deep reinforcement learning. In Proceedings of the 2020 IEEE International Conference on Mechatronics and Automation (ICMA), Beijing, China, 13–16 October 2020; pp. 815–820. [Google Scholar]
- Chen, P.; Wang, F.; Liu, S.; Yu, Y.; Yue, S.; Song, Y.; Lin, Y. Modeling collective behavior for fish school with deep Q-networks. IEEE Access 2023, 11, 36630–36641. [Google Scholar] [CrossRef]
- Zhu, Y.; Tian, F.B.; Young, J.; Liao, J.C.; Lai, J.C. A numerical study of fish adaptation behaviors in complex environments with a deep reinforcement learning and immersed boundary–lattice Boltzmann method. Sci. Rep. 2021, 11, 1691. [Google Scholar] [CrossRef]
- Dong, H.; Wu, Z.; Meng, Y.; Tan, M.; Yu, J. Gliding motion optimization for a biomimetic gliding robotic fish. IEEE/ASME Trans. Mechatron. 2021, 27, 1629–1639. [Google Scholar] [CrossRef]
- Sun, Y.; Yan, C.; Xiang, X.; Zhou, H.; Tang, D.; Zhu, Y. Towards end-to-end formation control for robotic fish via deep reinforcement learning with non-expert imitation. Ocean Eng. 2023, 271, 113811. [Google Scholar] [CrossRef]
- Duraisamy, P.; Santhanakrishnan, M.N.; Rengarajan, A. Design of deep reinforcement learning controller through data-assisted model for robotic fish speed tracking. J. Bionic Eng. 2023, 20, 953–966. [Google Scholar] [CrossRef]
- Xu, D.; Feng, H.; Qiao, F.; Lu, K.; Hu, X.; Liu, Y. Propulsion Control of Bionic Robotic Fish Based on Deep Deterministic Policy Gradient Algorithm. IEEE Open J. Ind. Electron. Soc. 2025, 6, 1496–1507. [Google Scholar] [CrossRef]
- Rodwell, C.; Tallapragada, P. Physics-informed reinforcement learning for motion control of a fish-like swimming robot. Sci. Rep. 2023, 13, 10754. [Google Scholar] [CrossRef] [PubMed]
- Yu, J.; Wu, Z.; Yang, X.; Yang, Y.; Zhang, P. Underwater target tracking control of an untethered robotic fish with a camera stabilizer. IEEE Trans. Syst. Man Cybern. Syst. 2020, 51, 6523–6534. [Google Scholar] [CrossRef]
- Vu, Q.T.; Duong, V.T.; Nguyen, H.H.; Nguyen, T.T. Optimization of swimming mode for elongated undulating fin using multi-agent deep deterministic policy gradient. Eng. Sci. Technol. Int. J. 2024, 56, 101783. [Google Scholar] [CrossRef]
- Yu, F.; Wu, Z.; Wang, J.; Yu, L.; Feng, Y.; Tan, M.; Yu, J. Learning From Fish: A Two-Stage Transfer Learning Method for a Bionic Robotic Fish. IEEE Trans. Autom. Sci. Eng. 2025, 22, 18796–18808. [Google Scholar] [CrossRef]
- Tian, Q.; Li, J.; Ran, G.; Li, H.; Ma, W. Path planning based on improved deep Q-network algorithm for bionic robotic fish with ocean currents. Neurocomputing 2025, 653, 131173. [Google Scholar] [CrossRef]
- Shan, Y.; Bayiz, Y.E.; Cheng, B. Efficient thrust generation in robotic fish caudal fins using policy search. IET Cyber-Syst. Robot. 2019, 1, 38–44. [Google Scholar] [CrossRef]










| Type | Length/Speed | Material | Joint Position/Actuation | Energy Efficiency (with Duration) | Reliability | Cost (Approx., Representative Systems) | Key Limitations |
|---|---|---|---|---|---|---|---|
| Single-joint [2,13,25,26,27,28,29,30,31,32,33,34] | 5–50 cm; typically ≤1.5 BL/s | Polymer body (ABS/PLA); simple plastic or Mylar caudal fin | Single actuated joint at caudal peduncle; small servo or DC motor | Low (~20–90 W); tail-only oscillation with high slip losses; >24 h continuous operation | High; minimal mechanical parts; simple sealing; robust for long-term operation | $200–$600 (educational/interactive robotic fish; embedded-vision guided single-tail systems) | Overly simplified hydrodynamics; inefficient steady swimming; weak maneuverability; unsuitable for studying body–wave dynamics |
| Two-joint [3,34,35,36,37,38,39,40,41,42,43,44,45] | 30–50 cm; ~0.8–2.5 BL/s | 3D printed polymer body; aluminum or polymer links; silicone tail | Two serial joints at mid-body and tail base; geared servos | Medium (up to 100 W); partial body deformation improves thrust; ~1–3 h operation | Medium–High; moderate mechanical and sealing complexity; stable in lab tests | $800–$1800 (two-joint laboratory platforms for CPG, antiphase, and formation studies) | Body wave remains discontinuous; limited transport efficiency; scaling to complex swimming patterns constrained |
| Three-joint [4,46,47,48,49,50,51,52,53,54,55,56,57,58,59] | 40–70 cm; ~0.9–3.0 BL/s | Waterproof polymer shell; reinforced metal or composite joints; elastomeric tail | Three serial joints forming posterior body wave; servo or electromagnetic actuators | Medium–High (~100 W); closer to carangiform BCF motion; ~1–2 h operation | Medium; sensitive to phase mismatch, backlash, and joint sealing degradation | $2000–$4500 (ICHTUS 3-DOF fish; Fibo-series prototypes; environmental monitoring fish) | Control tuning becomes critical; performance strongly phase-dependent; learning-based control stability decreases |
| Multi-joint [6,10,60,61,62,63,64,65,66,67,68,69,70] | 50–125 cm; ~1.0–4.0 BL/s | Modular sealed body segments; aluminum or carbon-fiber frame; flexible caudal fin | ≥4 distributed hinge joints along flexible tail; one actuator per joint | High (>150 W); near-continuous traveling body wave yields best transport efficiency; ~0.5–2 h operation | Medium–Low; increased actuator count, sealing points, and wiring raise failure probability | $5000–$12,000+ (four-joint carangiform robotic fish; electromagnetic multi-joint fish; leader–follower platforms) | High-dimensional control space; strong hydrodynamic coupling; high computation and calibration cost; sensitive to noise and sim-to-real gap |

| Algorithm | Policy/Model | Learning Task | Sensors Used | Performance Tasks |
|---|---|---|---|---|
| Q-Learning [65,74,75,76,77,78] | Off-policy & model-free | Optimal value function | IMU; IR proximity sensor; pressure sensor; visual tracking camera | Heading stabilization; thrust optimization; cooperative swimming; obstacle avoidance |
| DQN [79,80,81,82,83,84] | Off-policy & model-free | Optimal value function | IMU; pressure sensor; water-current sensor; visual tracking camera | Path planning; formation control; energy-efficient navigation; trajectory tracking |
| DDPG [85,86,87,88,89,90] | Off-policy & model-free | Continuous control via actor–critic policy learning | IMU; pressure sensor; hydrodynamic force sensor; visual tracking camera | Propulsion-efficiency optimization; adaptive stiffness control; trajectory tracking; energy-efficient navigation; target tracking |
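
The "off-policy & model-free" entry for Q-learning in the table above can be made concrete with the standard tabular update rule. The sketch below is a generic textbook form, not the implementation of any reviewed study; the state labels, the three-action set, and the values of `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state,
                      n_actions, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    Off-policy: the target uses the greedy value of the next state, whatever
    action the behavior policy takes next.
    Model-free: only the sampled transition is used, with no dynamics model.
    """
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical heading-control transition: action 1 moved the fish from a
# "heading_left" state to a "heading_ok" state and earned a reward of 1.0.
q_table = defaultdict(float)  # unseen (state, action) pairs default to 0.0
updated = q_learning_update(q_table, "heading_left", 1, reward=1.0,
                            next_state="heading_ok", n_actions=3)
```

DQN keeps the same target but replaces the table with a neural network, while DDPG extends the idea to continuous actions through an actor–critic pair, which is why the table lists all three as off-policy and model-free.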

| Algorithm | Reference | Joint Architecture | Target Task | Sensors | Continuous Control Variables | Evaluation Metrics | Experimental Conditions | Limitations |
|---|---|---|---|---|---|---|---|---|
| Q-learning | Yu et al. [65] | Multi-joint robotic fish (3 joints) | Cooperative behavior learning | Vision-based global tracking; posture estimation | Linear speed v; angular speed ω (via fuzzy inference) | Cumulative reward; time steps to win; task success rate | Real-world pool experiments (2 vs. 2 robotic fish competition) | Discrete state/action spaces; learning relies on vision; centralized control |
| Q-learning | Lin et al. [77,78] | Single-body rigid bionic robot (no joints) | Heading control | Virtual (simulation-based) | Fin waveform parameters (amplitude, frequency) | Tracking stability; learning efficiency; motion accuracy | Numerical simulation | Early feasibility study; no physical sensing reported |
| Q-learning | Chen et al. [75,76] | 1-DOF robotic fish | Motion control | No sensors | CPG neutral position; flapping parameters | Heading accuracy; stability; robustness | Simulation + prototype | Used mainly for feasibility validation; no sensing-based learning |
| DQN | Chen et al. [81] | Abstract fish-schooling agents | Collective behavior | Relative-position sensing | Discrete heading-angle change mapped to continuous velocity update | Inter-agent distance error; polarization order; collision avoidance rate | Multi-agent numerical simulation | Agents rely on relative distance/angle; states derived in simulation |
| DQN | Sun et al. [84] | Tail-driven robotic fish (implicit single-joint) | Formation control | Virtual position and velocity sensing | Discrete heading and speed commands mapped to tail-beat motion | Formation error; inter-agent distance deviation; convergence time | CFD-based simulation with imitation learning | State variables derived from CFD flow field and relative positions |
| DQN | Tian et al. [91] | Low-DOF bionic robotic fish | Path planning | Virtual flow-field and position sensing | Heading and speed under ocean currents | Path length; travel time; path smoothness | Simulation with ocean current disturbances | Ocean-current effects are explicitly modeled; sensing is environment-defined |
| DDPG | Ma et al. [80] | Hybrid-driven robotic fish (~2–3 DOF) | Path tracking | IMU (orientation); velocity estimation (simulation) | Heading angle; pitch angle; continuous DDPG outputs | Absolute tracking error (XYZ); MSE; MAPE; path-following accuracy | Dynamic simulation | Continuous states include pose and velocity; no physical sensors used |
| DDPG | Duraisamy et al. [85] | Multi-joint robotic fish | Speed tracking | IMU; joint encoders; motor sensors | Tail oscillation amplitude & frequency | ISE; IAE; ITAE; speed tracking error | Data-assisted model; real experiments | Experimental platform reports onboard sensing |
| DDPG | Xu et al. [86] | 4-joint robotic fish | Propulsion control | Virtual hydrodynamic force sensing | Joint stiffness (continuous modulation) | Propulsion efficiency; power consumption | High-fidelity CFD simulation | Physics-based sensing variables extracted from CFD solver |
| DDPG | Cui et al. [73] | Two-joint bionic robotic fish | Propulsion efficiency optimization | IMU; joint angle sensors; thrust estimation | Joint angle increments; body lateral displacement | Thrust coefficient; propulsion efficiency; power consumption; reward | Simulation + real-world experiments | Sensors used for closed-loop stiffness optimization experimentally |
| Enhanced DDPG | Vu et al. [89] | Elongated undulating fin robot | Swimming motion optimization | Virtual fin-ray state sensing; force estimation | Oscillatory amplitude of H-CPG for each fin ray | Thrust; propulsive efficiency; accumulated reward | Multi-agent CFD simulation + experiments | Each fin ray treated as an agent with local state feedback |
| Enhanced DDPG | Yu et al. [90] | Four-joint bionic robotic fish | Sim-to-real control transfer | IMU; joint encoders; motion capture (offline) | Joint oscillation amplitudes & swing speed | Acceleration capability; swimming efficiency; maneuverability | Two-stage sim-to-real learning | Offline sensing data used to bridge sim–real gap |
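
The ISE, IAE, and ITAE metrics listed for speed tracking in the table above are standard integral error criteria. The sketch below shows one way they could be approximated from sampled data. The rectangle-rule discretization, the function name, and the sample values are illustrative assumptions and are not taken from the reviewed study.

```python
def tracking_error_integrals(reference, measured, dt):
    """Approximate ISE, IAE, and ITAE for uniformly sampled signals.

    ISE  = integral of e(t)^2      (penalizes large errors)
    IAE  = integral of |e(t)|      (penalizes all errors equally)
    ITAE = integral of t * |e(t)|  (penalizes errors that persist over time)
    Each integral is approximated with a left rectangle rule of step dt.
    """
    if len(reference) != len(measured):
        raise ValueError("reference and measured must have the same length")
    ise = iae = itae = 0.0
    for k, (ref, meas) in enumerate(zip(reference, measured)):
        error = ref - meas
        t = k * dt
        ise += error * error * dt
        iae += abs(error) * dt
        itae += t * abs(error) * dt
    return {"ISE": ise, "IAE": iae, "ITAE": itae}


# Hypothetical speed samples (m/s) taken every 0.5 s.
metrics = tracking_error_integrals([1.0, 1.0, 1.0], [0.0, 0.5, 1.0], dt=0.5)
```

Because ITAE weights each error by elapsed time, a controller that settles quickly scores well on it even when its initial transient is large, which makes the three metrics complementary when comparing learned and conventional controllers.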
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dinh, N.; Vosbein, D.; Wang, Y.; Cui, Q. Reinforcement Learning-Enabled Control and Design of Rigid-Link Robotic Fish: A Comprehensive Review. Sensors 2026, 26, 996. https://doi.org/10.3390/s26030996
