1. Introduction
The high economic value of sea cucumber products has led to the rapid development of sea cucumber aquaculture [
1,
2]. During the sea cucumber farming process, real-time recognition and localization of sea cucumbers play a vital role in monitoring their growth status and facilitating their capture. Currently, manual underwater operations are the primary means of sea cucumber monitoring and harvesting. However, prolonged underwater operations pose significant risks to personnel due to factors such as high pressure and low temperature [
3]. Therefore, highly intelligent autonomous underwater robots offer convenience for underwater mobile monitoring and harvesting [
4,
5]. Traditional autonomous underwater robots are commonly driven by propellers. They are prone to entanglement with aquatic vegetation and suffer from low propulsion efficiency and high noise, which significantly disturbs aquatic organisms [
6]. In contrast, fish have developed physiological structures and functional characteristics adapted to their environment through long-term natural evolution. Bio-inspired underwater robots, which mimic biological morphological structures and locomotion mechanisms, therefore exhibit advantages such as high maneuverability, low noise, high stability, and high efficiency [
7]. To improve the efficiency of sea cucumber farming, monitoring, and harvesting, and to reduce the environmental interference caused by underwater operations, it is necessary to design a highly maneuverable bio-inspired underwater robot with visual perception capabilities. By integrating high-precision object detection algorithms and edge computing, real-time recognition and localization of sea cucumbers, together with monitoring of their health, can be achieved, thus further reducing labor costs and operational risks in the sea cucumber aquaculture industry.
Although fish species vary in terms of their types and body shapes, their swimming patterns can be primarily categorized into two modes based on the source of propulsion: Body and/or Caudal Fin (BCF) mode and Median and/or Paired Fin (MPF) mode [
8]. The BCF mode primarily generates thrust through undulations of the body and oscillations of the caudal fin, while the MPF mode utilizes the undulations of the pectoral fins, pelvic fins, and other fin surfaces to provide propulsion. The BCF mode excels in speed, whereas the MPF mode combines high propulsion efficiency, maneuverability, and stability [
9], enabling agile maneuvers such as low-speed turning and rapid acceleration. Robotic fish using the BCF mode inevitably exhibit lateral body movements during swimming, which significantly affect the quality of image capture. On the other hand, the MPF mode demonstrates superior disturbance resistance, making it more suitable for underwater mobile monitoring platforms equipped with cameras and other electro-optical sensors. In nature, the manta ray’s swimming motion is often compared to bird flight, representing a typical example of the MPF mode [
10]. These characteristics motivate researchers and engineers to create bio-inspired designs that can outperform current state-of-the-art underwater robots in maneuverability and stability. The manta ray, characterized by its agile swimming, gliding capabilities, and exceptional stability, is an ideal biomimetic model for underwater robots equipped with various electro-optical sensors, providing a stable mobile platform for diverse underwater tasks.
Extensive research on bio-inspired underwater robots based on the manta ray has been conducted both domestically and internationally. As early as 2002, Davis [
11] from Columbia University designed a biomimetic pectoral fin prototype using Shape Memory Alloy (SMA) as a linear actuator. With further research, numerous manta ray prototypes have been developed based on different design principles, considering their motion performance and external morphology [
7]. These prototypes include pectoral fins with simple structures, fewer degrees of freedom (DOFs), and high skeletal rigidity. For instance, the RoMan-I prototype from Nanyang Technological University utilized motor-driven rigid fin rays as pectoral fins [
12]. Beihang University developed the Robo-ray I-III, which incorporated carbon fiber as the material for the pectoral fins [
13,
14,
15,
16]. Some researchers have also focused on flexible pectoral fins that mimic morphological characteristics. However, these designs lack propulsive force. For example, the Institut Supérieur de Mécanique de Paris developed a miniature flexible robotic manta ray using Dielectric Elastomer Minimum Energy Structures (DEMES) material [
17]. Considering the impact of manta ray size on application scenarios, larger robotic manta rays exhibit superior performance and reliability, making them suitable for carrying large-scale sensing equipment. However, these large-scale designs are characterized by complex structures, high costs, large dimensions, and reduced stealth, making them less suitable as underwater monitoring platforms for aquaculture. In contrast, smaller robotic manta rays offer compact dimensions, simple structures, lower costs, and better stealth, which enhance their engineering practicality. For example, Northwestern Polytechnical University developed a maneuverable robotic manta ray using a zigzag spring support structure, enabling it to perform maneuvers with arbitrary radii [
18]. Visual perception for robotic manta rays has also been studied. The Institute of Automation of the Chinese Academy of Sciences developed a robotic manta ray equipped with a visual system and proposed an algorithmic framework for real-time digital video stabilization [
19]. Northwestern Polytechnical University achieved manta ray relative positioning by combining improved target detection algorithms and binocular distance measurement using a robotic manta ray equipped with dual cameras [
20]. The fusion of visual perception and deep learning techniques in robotic fish will be a future development trend for underwater bio-inspired robots.
With the advancement of deep learning and edge computing technologies, combined with lightweight image processing algorithms, robotic fish with visual perception capabilities can achieve real-time online processing of image data. Object detection is an important means of visual perception for underwater robotic fish. Convolutional neural network-based object detection algorithms can be divided into two-stage and one-stage algorithms. Two-stage algorithms, mainly represented by the RCNN series [
21,
22,
23], achieve higher detection accuracy but have slower processing speeds. One-stage algorithms, mainly represented by the YOLO series [
24,
25,
26,
27] and SSD series [
28,
29], have faster inference speeds. In recent years, owing to its advantages in global feature extraction, the Transformer has been successfully applied to dense prediction tasks [
30,
31]. For example, the Swin Transformer [
32] constructed a pyramid structure with gradually decreasing resolutions to realize feature learning based on the Transformer at multiple scales and extract short-range and long-range visual information. Experimental results demonstrated the superiority of this algorithm. Exploring lightweight and high-precision object detection algorithms to be embedded in bio-inspired robotic manta rays is particularly important for enhancing their visual perception capabilities. Additionally, localization algorithms based on binocular vision and semi-global block matching (SGBM) [
33,
34] will provide stereo visual perception capabilities for underwater robots.
Therefore, the objective of this paper is to design and implement a small bio-inspired manta ray with visual perception capabilities and a rigid-flexible coupled pectoral fin. It aims to enable sea cucumber recognition, localization, and approach, thus establishing the foundation for monitoring the activity status of sea cucumbers and subsequent automated harvesting. The main contributions can be summarized as follows:
(1) Designing a novel robotic manta ray with visual perception capabilities and a rigid-flexible coupled pectoral fin.
(2) Improving the YOLOv5s object detection algorithm and incorporating a binocular stereo-matching algorithm to achieve accurate sea cucumber identification and localization.
(3) Designing a fuzzy PID controller to realize depth control, direction control, and target approach motion control for the robotic manta ray.
The remainder of this paper is organized as follows. Section 2 elaborates on the overall electromechanical design of the rigid-flexible coupled pectoral fin bio-inspired manta ray. In Section 3, the sea cucumber recognition and localization algorithm based on the improved YOLOv5s object detection and SGBM binocular stereo matching is introduced. Section 4 focuses on the depth control, direction control, and approach motion control of the manta ray based on localization information. Experimental results of the sea cucumber recognition and localization algorithm, as well as the depth control, direction control, and approach motion control of the manta ray, are presented in Section 5. Section 6 provides a discussion of the research presented in this paper. Finally, Section 7 concludes the paper with a comprehensive summary.
2. Overview of Robotic Manta Ray
The manta ray, as a typical fish utilizing the MPF mode of propulsion, exhibits outstanding stability and maneuverability during motion [
35]. It also demonstrates remarkable agility and disturbance resistance at low speeds, making it highly suitable for carrying various optoelectronic sensors and performing flexible maneuvers underwater. The undulatory fins of the manta ray inspire the propulsor design of the robotic manta ray.
To ensure the integrity and consistency of the bio-inspired robotic manta ray, a top-down design approach is employed for the mechanical structure design. First, the overall shape of the bio-inspired robotic manta ray is designed from a holistic perspective. Second, considering the practical requirements, functionalities, performance, and constraints of the entire system, the bio-inspired robotic manta ray is decomposed into three separate sub-components: pectoral fins, caudal fin, and body shell. Finally, employing a local design approach, each component module with different functionalities is gradually refined and designed.
The bio-inspired robotic manta ray operates underwater in a marine environment; therefore, the materials used must be lightweight, strong, corrosion-resistant, readily formable, and easy to process [
36]. Considering the compressive strength and corrosion resistance of the resin [
37], black resin is chosen for constructing the body shell of the robotic manta ray. This paper analyzes the shape characteristics of manta rays based on their natural propulsion mode and on principles from biomimetics, and the mechanical structure of the bio-inspired robotic manta ray is rationally simplified. Based on this analysis, the design parameters for the caudal fin, rigid body shell, and rigid-flexible coupled pectoral fins are determined. Figure 1a illustrates the overall rendering of the bio-inspired robotic manta ray, and Figure 1b shows the prototype. Table 1 provides the technical parameters of the bio-inspired robotic manta ray.
2.1. Internal Layout of Robotic Manta Ray
The rigid shell of the robotic manta ray provides ample space for accommodating various electronic devices, control components, and batteries. The internal layout, as shown in
Figure 2, includes four sets of 7.4 V lithium batteries positioned at the central bottom of the shell to lower the center of gravity and ensure balance. Above the battery compartment, the controller, inertial measurement unit (IMU), and battery level monitoring module are placed at a relatively higher position to protect the electronic components from direct damage in case of accidental water ingress. The attitude sensor is centrally located within the internal space of the shell, accurately capturing the manta ray’s posture. The power module is connected to a separate battery compartment through support pillars at the bottom of the shell, providing both convenience of connection and waterproofing functionality. The machine vision computing module, equipped with a Jetson Xavier NX board, is located at the back of the robotic manta ray, powered by a dedicated 14.8 V battery. The two buoyancy balance units, positioned on both sides of the robotic manta ray, serve to adjust the center of gravity, thereby increasing stability and balancing buoyancy forces.
The bottom layout of the robotic manta ray is depicted in
Figure 3. The waterproof electric switch, charging port, and depth sensor are positioned within the central groove of the robotic manta ray. This design can avoid affecting the overall hydrodynamic performance. The binocular camera, as shown in
Figure 3b, is externally mounted on the bottom of the robotic manta ray, facilitating easy disassembly and expansion.
2.2. Pectoral Fin Undulation Design
The pectoral fin is the most crucial locomotion organ of the manta ray [
38] and serves as the core design element in the robotic manta ray. According to relevant biological research, the complex and flexible deformation of the pectoral fin during stable cruising can be decomposed into the superposition of two orthogonal traveling waves [
39]. As shown in
Figure 4, traveling wave I propagates from the base to the tip of the pectoral fin along the span direction, while traveling wave II propagates approximately from the head to the tail along a chord parallel to the water flow. By coordinating these two sets of traveling waves, the manta ray achieves efficient and agile motion.
Inspired by this, this paper proposes a bio-inspired manta ray pectoral fin design scheme, where the propulsion mechanism of the pectoral fin employs a simple configuration of two pairs of fin strips and a flexible membrane wing. The overall structure of the pectoral fin is illustrated in
Figure 5a. Each pectoral fin is equipped with two digital servos that rotate bidirectionally over a range of 0 to 180 degrees and can be controlled independently or synchronously. This design scheme allows for switching between undulating and flapping propulsion modes. The servo motion of the pectoral fin follows a sinusoidal pattern described by Equation (1), whose terms are the angular motion of the front servo, the angular motion of the rear servo, the phase difference between the front and rear servos, and the servo bias angle.
The undulation propulsion mode, depicted in
Figure 5b, involves a 0.2 ms delay between the activation of the front and rear fin strips. The two fin strips have equal amplitudes and maintain a certain phase difference, resulting in periodic oscillations that drive the rubber membrane wing to create the undulating motion. The flapping propulsion mode, illustrated in
Figure 5c, involves simultaneous activation of both fin strips. The front fin strip has a larger amplitude than the rear fin strip, resulting in a wave motion that gradually decreases from front to back and drives the rubber membrane wing. As the main propulsion actuator in the MPF propulsion mode, the bio-inspired pectoral fin actuator generates stable and smooth thrust, providing the robotic manta ray with precise control forces for subtle adjustments during motion control. The designed motion of the bio-inspired robotic manta ray is achieved by four driving servos that actuate the two pairs of fin strips to perform cyclic oscillations. The rigid-flexible coupling design ensures the correct temporal sequence of pectoral fin motions while incorporating a certain level of passive flexibility to reduce resistance and increase radial force. The soft membrane wing undergoes passive deformation under the combined action of the active fin strips and water damping, generating a propulsion wave that propagates in the direction opposite to the direction of travel and propels the robotic manta ray forward. The front and rear pairs of fin strips enable precise control of the wave motion of the pectoral fin. Compared to other bio-inspired fish pectoral fins, the proposed rigid-flexible coupled structure can generate multiple motion modes, providing enhanced maneuverability. It also offers faster flapping motion and greater flexibility in undulating movement, making it well-suited for a wide range of underwater tasks.
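The two propulsion modes can be illustrated with a minimal sketch. It assumes that each servo angle is a sinusoid with an amplitude, a shared frequency, a phase lag on the rear servo, and a common bias angle; the function name, parameter names, and preset values are hypothetical and are not taken from the prototype.

```python
import math


def servo_angles(t, freq_hz, amp_front_deg, amp_rear_deg, phase_lag_rad, bias_deg):
    """Return (front, rear) servo angles in degrees at time t (seconds).

    Assumed form: angle = amplitude * sin(2*pi*f*t - phase) + bias,
    with zero phase on the front servo and phase_lag_rad on the rear servo.
    """
    omega = 2.0 * math.pi * freq_hz
    front = amp_front_deg * math.sin(omega * t) + bias_deg
    rear = amp_rear_deg * math.sin(omega * t - phase_lag_rad) + bias_deg

    def clamp(angle):
        # Keep the command inside the 0-180 degree servo range.
        return max(0.0, min(180.0, angle))

    return clamp(front), clamp(rear)


# Hypothetical presets: equal amplitudes with a phase lag (undulation),
# and a larger front amplitude with no lag (flapping).
UNDULATION = dict(amp_front_deg=30.0, amp_rear_deg=30.0, phase_lag_rad=math.pi / 2)
FLAPPING = dict(amp_front_deg=40.0, amp_rear_deg=20.0, phase_lag_rad=0.0)

if __name__ == "__main__":
    for name, mode in (("undulation", UNDULATION), ("flapping", FLAPPING)):
        front, rear = servo_angles(0.25, freq_hz=1.0, bias_deg=90.0, **mode)
        print(f"{name}: front={front:.1f} deg, rear={rear:.1f} deg")
```

Changing the bias angle shifts the midpoint of the stroke, which is one plausible way for a controller to steer or pitch the robot without altering the stroke amplitude.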
4. Depth, Direction and Approach Control
Depth and direction control based on localization information is essential for the monitoring and operational tasks of the biomimetic robotic manta ray in aquaculture environments. Depth control ensures that the robotic manta ray maintains a specific depth in the water, allowing it to perform monitoring tasks within a designated depth range. This enables stable underwater footage, keeps important scenes in view, and facilitates detailed inspection. Direction control allows the robotic manta ray to move and monitor in specific directions, enabling comprehensive monitoring of aquaculture areas. Equipped with a variety of sensors, the biomimetic robotic manta ray can perceive its underwater state. By effectively integrating depth and direction control algorithms, the autonomy and flexibility of the robotic manta ray in water can be enhanced, enabling it to perform various tasks in complex underwater environments.
A fuzzy controller consists of four main components: fuzzification, fuzzy rule base, fuzzy inference, and defuzzification [
42]. It is the core of a fuzzy control system. Its primary function is to map the input and output variables to membership functions and use a set of fuzzy rules based on empirical knowledge to determine the output. This improves the responsiveness and stability of the system [
43].
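The four components can be sketched in a few lines. The example below is illustrative only: it assumes inputs normalized to [-1, 1], three triangular fuzzy sets (N, Z, P), a hypothetical 3x3 rule table for a single output, min as the AND operator, and a weighted average of singleton consequents as a simplified stand-in for centroid defuzzification. None of these choices is taken from the controller described in this paper.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


LABELS = ("N", "Z", "P")
PEAKS = {"N": -1.0, "Z": 0.0, "P": 1.0}  # set centers on the normalized axis

# Hypothetical rule base: (label of e, label of ec) -> label of the output.
RULES = {
    ("N", "N"): "P", ("N", "Z"): "P", ("N", "P"): "Z",
    ("Z", "N"): "Z", ("Z", "Z"): "N", ("Z", "P"): "Z",
    ("P", "N"): "Z", ("P", "Z"): "P", ("P", "P"): "P",
}


def fuzzify(x):
    """Fuzzification: map a crisp value to membership degrees in N, Z, P."""
    x = max(-1.0, min(1.0, x))
    return {lab: tri(x, PEAKS[lab] - 1.0, PEAKS[lab], PEAKS[lab] + 1.0)
            for lab in LABELS}


def infer(e, ec):
    """Fuzzy inference followed by weighted-average defuzzification."""
    mu_e, mu_ec = fuzzify(e), fuzzify(ec)
    num = den = 0.0
    for (lab_e, lab_ec), lab_out in RULES.items():
        weight = min(mu_e[lab_e], mu_ec[lab_ec])  # rule firing strength
        num += weight * PEAKS[lab_out]
        den += weight
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    print(infer(0.5, 0.0))
```

In this toy rule table, a large error magnitude pushes the output up and a near-zero error pushes it down, which is a common heuristic for a proportional-gain correction.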
In the depth control system, the depth sensor reading and the set depth value are the inputs to the controller. The error e and the rate of change in error ec are calculated from these inputs and used in the fuzzy PID controller to compute the corrections to the traditional PID parameters, namely $\Delta K_p$, $\Delta K_i$, and $\Delta K_d$. Similarly, in direction control, the inputs to the fuzzy PID controller are the yaw angle error e and the rate of change in the yaw angle error ec. The fuzzy PID controller for approach control consists of a depth controller and a direction controller, as illustrated in Figure 9. In the approach control system, the binocular camera sends the three-dimensional spatial coordinates of the sea cucumber to the lower-level controller, which uses the fuzzy PID controllers for depth and direction control to adjust the fin strips' bias angle and amplitude for the next motion cycle.
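As a rough illustration of how target coordinates could be turned into controller inputs, the sketch below converts a 3D point into a depth error and a yaw error. It assumes a camera frame with x to the right, y downward, and z along the robot's forward axis, which may not match the actual mounting of the binocular camera; the function name and the minimum-range threshold are hypothetical.

```python
import math


def approach_errors(target_xyz, min_range_m=0.05):
    """Convert a target position in the (assumed) camera frame into errors.

    target_xyz: (x, y, z) in meters, x right, y down, z forward (assumption).
    Returns (depth_error_m, yaw_error_deg, distance_m).
    """
    x, y, z = target_xyz
    if z < min_range_m:
        raise ValueError("target is too close to or behind the camera")
    # Positive yaw error means the target lies to the right of the heading.
    yaw_error_deg = math.degrees(math.atan2(x, z))
    # With y pointing down, a positive y means the target is deeper.
    depth_error_m = y
    distance_m = math.sqrt(x * x + y * y + z * z)
    return depth_error_m, yaw_error_deg, distance_m


if __name__ == "__main__":
    depth_err, yaw_err, dist = approach_errors((0.5, 0.2, 0.5))
    print(f"depth error {depth_err:.2f} m, yaw error {yaw_err:.1f} deg, "
          f"distance {dist:.2f} m")
```

The two errors would feed the depth and direction controllers respectively, while the distance could serve as a stopping criterion for the approach.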
In PID control, the initial values of the three parameters, $K_p$, $K_i$, and $K_d$, need to be determined. These initial values can be determined using engineering measurement methods. Referring to Equation (2), the PID parameters are adjusted based on the correction information from the fuzzy PID controller. $K_{p0}$, $K_{i0}$, and $K_{d0}$ represent the initial values of the PID controller, while $K_p$, $K_i$, and $K_d$ represent the adjusted output values.
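A minimal sketch of this gain adjustment follows. It assumes the additive form common in fuzzy PID designs, in which each adjusted gain equals its initial value plus the fuzzy correction; the class name, the `tuner` callback, and the discrete-time details are hypothetical and not taken from the paper.

```python
class FuzzyTunedPID:
    """PID controller whose gains are corrected at every step.

    tuner is any callable (e, ec) -> (d_kp, d_ki, d_kd); here it stands in
    for the fuzzy inference stage (assumption).
    """

    def __init__(self, kp0, ki0, kd0, tuner):
        self.kp0, self.ki0, self.kd0 = kp0, ki0, kd0
        self.tuner = tuner
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement, dt):
        e = setpoint - measurement
        # Rate of change of the error; zero on the first sample.
        ec = 0.0 if self.prev_error is None else (e - self.prev_error) / dt
        d_kp, d_ki, d_kd = self.tuner(e, ec)
        # Assumed additive update: adjusted gain = initial gain + correction.
        kp = self.kp0 + d_kp
        ki = self.ki0 + d_ki
        kd = self.kd0 + d_kd
        self.integral += e * dt
        self.prev_error = e
        return kp * e + ki * self.integral + kd * ec


if __name__ == "__main__":
    # A tuner that returns zero corrections reduces this to a plain PID.
    pid = FuzzyTunedPID(2.0, 0.5, 0.1, lambda e, ec: (0.0, 0.0, 0.0))
    print(pid.step(setpoint=1.0, measurement=0.0, dt=0.1))
```

Keeping the tuner as a separate callable lets the same loop serve both depth and direction control, each with its own rule base and initial gains.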
The fuzzy inference designed in this study is based on the fuzzification of the error e and the error rate ec, as well as the fuzzy rule base, to derive the fuzzy subsets corresponding to $\Delta K_p$, $\Delta K_i$, and $\Delta K_d$. The three-dimensional surface plots of the fuzzy-inferred output variables are depicted in Figure 10. It can be observed that the output variables change smoothly as the input variables vary, which satisfies the basic requirements of fuzzy control rules. In the figure, different colors represent different values of $\Delta K_p$, $\Delta K_i$, and $\Delta K_d$, with yellow representing larger values and blue representing smaller values.