1. Introduction
UUV localisation is one of the most challenging technical problems hindering the large-scale adoption of autonomous UUVs in aquaculture or any constrained underwater operational workspace. Although autonomous UUV deployments are reported in oceanographic research, such as bathymetry scanning, environmental monitoring, and defence applications [1,2], they are not widely adopted in constrained operational workspaces (e.g., aquaculture). An open workspace (e.g., the open ocean in oceanographic research) allows for a large error tolerance in UUV localisation. This is not the case for a constrained workspace (e.g., an aquaculture farm with mooring lines and other infrastructure): any deficiency in the precision, accuracy, or reliability of UUV localisation creates safety hazards for the aquaculture infrastructure, the human workforce, and the UUV itself. With such concerns, the potential benefits of adopting autonomous UUVs in aquaculture cannot be fully demonstrated to farm operators, policymakers, and governing authorities. Conversely, without the financial backing and strong interest of industry and government, which in turn depend on further evidence of these benefits, researchers are limited in expanding R&D work on autonomous deployment to the large-scale validation needed to produce convincing evidence of its technical, environmental, and commercial value [3]. Therefore, a cost-effective UUV localisation system is one way to address the bottleneck in technology transfer from research to commercial-scale deployment.
Autonomous UUV deployment can benefit aquaculture or any underwater operation in a constrained workspace, such as offshore wind farms and the oil and gas industry, in terms of productivity, reliability, and economically viable frequent operations. For instance, one of the most relevant tasks for autonomous UUVs in aquaculture is the routine visual inspection of fish net-pens [4,5]. The New Zealand King Salmon (NZKS) company, the world's largest producer of King Salmon, deploys a remotely operated underwater vehicle (ROV, a type of UUV) in its fish farms for net-pen visual inspection and cleaning. Such tasks for a square net-pen of 30 m × 30 m × 15 m take hours to complete. Recently, NZKS started its Blue Endeavour project, the first-of-its-kind offshore salmon farm in New Zealand, located 7 km off Cape Lambert in the Cook Strait [6]. Aiming to harvest 10,000 metric tonnes annually, a total of twenty circular floating flexible net-pens, each 15 m deep, are configured into two blocks of a two-by-five layout. Therefore, in performing fish net-pen inspection and cleaning on such a large scale, autonomous UUV deployment is more sustainable and economical in the long term, given the shortage of trained workers available for offshore farms and the need for safety, productivity, and reliability in a harsh working environment.
There is extensive ongoing research on the autonomy and localisation of UUVs for various applications [4,7,8,9,10,11]. However, to the best of our knowledge, there are no fully autonomous UUV deployments at large scale in the aquaculture industry beyond preliminary trials. The standard sensor suite for UUV localisation (e.g., Ultra-Short Baseline (USBL), Doppler Velocity Logger (DVL), Inertial Measurement Unit (IMU), Attitude Heading Reference System (AHRS), and Inertial Navigation System (INS)) is very costly, depending on the accuracy, reliability, and performance grade (hobby to industry). As shown in Table 1, some industry-grade sensors cost substantially more than a hobby-grade UUV (e.g., the BlueROV2 Heavy Configuration, costing USD 6700), and several of them, except for the SPRINT-NAV U, need to be fused to acquire UUV localisation. Therefore, the large initial capital investment in sensors for R&D work on UUV autonomy and localisation is one of the main concerns, in addition to the actual sensor performance in real-world operations. In other words, the cost of an industrial-grade UUV with an industrial-grade sensor suite exceeds the project budgets of many research institutes, except for those focused on marine research or for system developers/integrators. In contrast, a hobby-grade UUV is affordable, and with the help of a cost-effective localisation system, more research outputs can be produced to support the potential benefits of autonomous UUV deployment in the aquaculture industry. Subsequently, these research capabilities and demonstrations will attract aquaculture farm operators, policymakers, and government authorities to further invest in UUV research for aquaculture.
In this work, a cost-effective UUV localisation system is proposed using AprilTags, fiducial markers [25], which require minimal time and effort to set up, with only minor modifications (e.g., the fixtures for AprilTags) to the existing infrastructure, unlike other AprilTag-based localisation approaches [26,27]. The proposed system is suitable for deployment in aquaculture with various fish pen designs, such as floating or semi-submersible rigid cages, floating flexible fish net-pens with additional rigid body structures, and closed containment tank systems, and is certainly suitable for a controlled laboratory environment. AprilTags with their fixtures must be attached within the UUV-operating zone of the existing infrastructure. The whole localisation deployment comprises three steps, namely, (1) AprilTags installation, (2) AprilTags extraction and data-logging of the relative poses of AprilTags for frame transformation, and (3) AprilTags pose publishing and localisation. Among these three steps, only the first requires slightly more time for the manual installation of AprilTags, while the remaining steps can be completed in just a few minutes. The novelties and contributions of the proposed method are listed as follows.
No measurements are required for the initial AprilTag installation (Step 1), as long as each pair of AprilTags (the first pair and every subsequent pair) is placed so that both tags are visible to the UUV's camera at some instant. This saves a substantial amount of installation time.
The proposed extraction algorithm (Step 2) captures and updates, in real-time, the relative poses of all AprilTag pairs at each user confirmation and subsequently performs data-logging at the end.
The proposed localisation algorithm (Step 3) publishes the relative poses of AprilTags using the previously logged data and performs the UUV’s localisation using the real-time camera feedback.
The overall deployment of the localisation system is simple, cost-effective, and time-efficient.
This research work provides the initial phase of the full deployment procedure and algorithm development towards the actual underwater deployment in aquaculture infrastructure in the near future.
The remainder of this article is structured as follows. Section 2 presents the types of aquaculture infrastructure that are suitable for the proposed cost-effective UUV localisation system. Section 3 details the deployment of the proposed UUV localisation system, covering three main areas: AprilTags installation, AprilTags extraction and data-logging, and AprilTags pose publishing and localisation. Section 4 explains the experimental setup. Section 5 discusses the experimental results. Section 6 summarises the research findings and outlines future research directions.
2. Aquaculture Infrastructure
In this work, the operational definition of aquaculture infrastructure is the structure inside which fish (e.g., salmon) are cultivated. Generally, it consists of a floating collar, a fish net-pen, a sinker tube, a side rope, a mooring system, and a buoy. There are different types of aquaculture infrastructure, and not all of the aforementioned components are present in every type. The following four types of aquaculture infrastructure are considered suitable for deploying the proposed cost-effective localisation system.
2.1. Floating/Semi-Submersible Rigid Cage
Usually, floating/semi-submersible rigid cage designs, such as Ocean Farm 1 and Havfarm 1, are built to operate in a harsh open-ocean environment [28,29]. A simplified illustration of the fish net-pen setup inside a rigid cage design is shown in Figure 1. Such designs provide rigid body structures to which AprilTags can be mounted using additional fixtures.
2.2. Floating Flexible Fish Net-Pen with a Rigid Body Structure
Many aquaculture farms are designed with floating flexible fish net-pens, as shown in Figure 2. Using the floating flexible fish net-pen design, the New Zealand King Salmon company started the trial phase of its first-of-its-kind offshore salmon farm, located 7 km off Cape Lambert in the Cook Strait, in June 2025 [30]. As the floating collars and other parts of the infrastructure are not attached to a rigid body but only to the sea/ocean floor via anchor lines, the existing infrastructure is not suitable for AprilTags installation. Additional rigid body structures, acting as inertial frames, need to be installed so that the AprilTags remain stationary during operation.
2.3. Closed Containment Tank System
Another cage design suitable for the proposed cost-effective localisation system is the closed containment tank system, such as ECO-ARK® AQUACULTURE 4.0 by Aquaculture Centre of Excellence Pte Ltd. [31]. Due to the rigid body structure of the tank in a controlled environment, it is one of the most suitable infrastructures for the proposed localisation system. Its simplified version is illustrated in Figure 3.
2.4. Controlled Laboratory Environment
Just like any engineering research and development work, the proposed cost-effective localisation system must undergo concept generation and testing in a controlled laboratory environment, followed by field trial validation on an industrial scale. For this purpose, another suitable infrastructure is a controlled laboratory environment, as shown in Figure 4.
For all the aforementioned infrastructure types, relatively minor modifications to the existing infrastructure are required for AprilTags installation, with the level of modification depending on the type of aquaculture infrastructure. The focus of this article is not on the AprilTag installation and mechanism design but on the algorithm implementation. As the AprilTags are supposed to act as inertial frames relative to which the pose of the UUV is measured in real-time via its camera, little to no movement of the AprilTags is an essential requirement for the deployment of the proposed localisation system.
3. UUV Localisation System Deployment
As mentioned previously, there are three main steps for the deployment of the proposed cost-effective localisation system, namely (1) AprilTags installation, (2) AprilTags extraction and data-logging, and (3) AprilTags pose publishing and localisation. In the following subsections, all three steps, together with the overall deployment procedure, will be detailed, along with their advantages.
3.1. AprilTags Installation
Through simple matrix manipulations on the homogeneous transformations, and without the need for visual odometry-based landmark SLAM [32], the proposed strategy for AprilTags installation involves no manual measurement of each AprilTag's pose. In other words, during the installation, the only requirement is that each pair of AprilTags (e.g., Tags [0, 1], Tags [1, 2]) must be visible to the UUV's camera at some instant of time. It is essential to note that this requirement applies only to Step (2); real-time UUV localisation requires only that at least one AprilTag is visible at any given instant of time.
There are a few possible scenarios for AprilTags installation: a setup with multiple cameras and multiple AprilTags, as shown in Figure 5; a setup with a single camera, a single AprilTag attached to the UUV, and multiple stationary AprilTags, as shown in Figure 6; or a setup with a camera attached to the UUV and multiple stationary AprilTags, as shown in Figure 7. Scenario 1 is suitable for tracking an object of interest (OOI) with multiple AprilTags configured around it within the field of view of the cameras [33]. However, for underwater applications, this scenario requires a tethered solution to transfer the estimated OOI pose, or to transfer the transformation data from the cameras to the OOI for onboard processing. Scenario 2 also requires a tethered solution to transfer the transformations between the single AprilTag attached to the UUV and the other stationary AprilTags. Scenario 3 is suitable when the UUV operation requires a tetherless solution, as the transformation data from the camera is directly accessible by the UUV; the UUV must therefore be equipped with sufficient onboard computing capacity. In this work, the last scenario is demonstrated, as the UUV's tether cable can become entangled within the constrained operational workspace of aquaculture infrastructure.
Regardless of the aforementioned scenarios, the proposed algorithm for AprilTag extraction and data-logging allows AprilTags to be installed without manual measurements of each AprilTag's relative pose, which substantially reduces the installation time. The only requirement during installation is that each pair of AprilTags (and likewise every subsequent pair) is placed within the field of view of the UUV's camera at some instant of time, as shown in Figure 8. This requirement is compulsory only for AprilTags extraction and data-logging; it is not necessary for real-time localisation, which can be carried out once any one of the AprilTags is captured in the field of view of the UUV's camera.
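As a concrete illustration of this visibility requirement, the minimal check below (a hypothetical helper, not part of the authors' code) gates the registration of the pair (i, i+1) on both tags being present in the current detection set.

```python
# Minimal sketch (hypothetical helper, not the authors' code) of the pair-visibility
# requirement from Figure 8: the register for the pair (i, i+1) may only be captured
# while both tags are detected in the same camera frame.

def pair_visible(detected_ids, i):
    """Return True when AprilTags i and i+1 are both in the current detection set."""
    ids = set(detected_ids)
    return i in ids and (i + 1) in ids

# Example: with Tags 0, 1, and 3 in view, the pair (0, 1) can be registered,
# but the pair (1, 2) cannot.
assert pair_visible([0, 1, 3], 0) is True
assert pair_visible([0, 1, 3], 1) is False
```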
3.2. AprilTags Extraction and Data-Logging
AprilTags extraction and data-logging of the relative poses of AprilTags can be carried out using the UUV's camera. Alternatively, if a camera with better specifications (e.g., higher resolution, a larger field of view, or a higher frame rate) is available, it can be used in place of the UUV's camera. In either approach, the relative poses of the AprilTags are stored in a YAML (YAML Ain't Markup Language) file at the end of the AprilTags extraction process. The stored YAML file can be utilised later for AprilTags pose publishing and localisation. The overall process of AprilTags extraction and data-logging is illustrated in Figure 8.
For $n$ AprilTags, there are $n-1$ registers that the user needs to confirm by pressing the 'Enter' key after orienting the camera to fully capture each pair. Therefore, in a series of $n-1$ registers, for the $i$-th register, there are two homogeneous transformations, generated by the existing AprilTag detection system [26]:

$$T^{C}_{i} = \begin{bmatrix} R^{C}_{i} & p^{C}_{i} \\ 0_{1\times3} & 1 \end{bmatrix}, \quad (1)$$

$$T^{C}_{i+1} = \begin{bmatrix} R^{C}_{i+1} & p^{C}_{i+1} \\ 0_{1\times3} & 1 \end{bmatrix}, \quad (2)$$

where $R^{C}_{i}$ and $R^{C}_{i+1}$ are the rotation matrices from Frame $\{i\}$ and Frame $\{i+1\}$ to Frame $\{C\}$, respectively. $p^{C}_{i}$ and $p^{C}_{i+1}$ are the position vectors of the origins ($O_{i}$, $O_{i+1}$) of Frame $\{i\}$ and Frame $\{i+1\}$ with respect to Frame $\{C\}$, respectively. Note: For ease of readability, the AprilTag Frame $\{\mathrm{Tag}\ i\}$ is denoted as $\{i\}$, although the full description is used in the figures.

For the $i$-th register in constructing the whole static transformations between each AprilTag, $T^{C}_{i}$ and $T^{C}_{i+1}$ as shown in Equations (1) and (2) are required to be further manipulated as follows:

$$T^{i}_{i+1} = \left(T^{C}_{i}\right)^{-1} T^{C}_{i+1}, \quad (3)$$

where

$$\left(T^{C}_{i}\right)^{-1} = \begin{bmatrix} \left(R^{C}_{i}\right)^{\mathrm{T}} & -\left(R^{C}_{i}\right)^{\mathrm{T}} p^{C}_{i} \\ 0_{1\times3} & 1 \end{bmatrix}.$$

Using Equation (3) for the $i$-th register for $n$ AprilTags, it results in acquiring $n-1$ registers, $T^{0}_{1}, T^{1}_{2}, \ldots, T^{n-2}_{n-1}$, for the whole static transformations between each AprilTag pair.
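To make the register computation concrete, the following NumPy sketch (helper names are ours, not the authors' implementation) assembles the camera-frame detections of Equations (1) and (2) and applies Equation (3).

```python
# Minimal NumPy sketch (helper names are ours, not the authors' implementation) of
# Equations (1)-(3): assembling the camera-frame detections T^C_i and T^C_{i+1} and
# computing the static register T^i_{i+1}.
import numpy as np

def to_homogeneous(R, p):
    """Build a 4x4 homogeneous transformation from a rotation matrix and a position vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(p).reshape(3)
    return T

def invert_homogeneous(T):
    """Closed-form inverse of a homogeneous transformation, as used in Equation (3)."""
    R, p = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ p
    return T_inv

def register_pair(T_C_i, T_C_ip1):
    """Equation (3): T^i_{i+1} = (T^C_i)^{-1} T^C_{i+1}."""
    return invert_homogeneous(T_C_i) @ T_C_ip1
```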
3.3. AprilTags Pose Publishing and Localisation
In the previous subsection, acquiring $n-1$ registers, $T^{0}_{1}, T^{1}_{2}, \ldots, T^{n-2}_{n-1}$, for $n$ AprilTags is presented in detail. Although these registers provide the static transformation between each AprilTag pair, the transformation required for localisation is $T^{I}_{i}$, where $I$ represents the world inertial frame, and $i$ is any AprilTag frame currently being captured in the field of view of the UUV's camera. Therefore, all possible $T^{I}_{i}$ can be pre-computed, as constant matrices, at the start of executing the localisation algorithm, as all $n-1$ registers are static transformations. In doing so, unnecessary real-time computing of the constant matrices is avoided. To compute all the possible $T^{I}_{i}$, the world inertial frame needs to be defined first. Suppose Frame $\{0\}$ is the world inertial frame and $T^{I}_{0} = I_{4}$, where $I_{4}$ is the $4 \times 4$ identity matrix; then

$$T^{I}_{i} = T^{I}_{0}\, T^{0}_{1}\, T^{1}_{2} \cdots T^{i-1}_{i}, \quad i = 1, 2, \ldots, n-1. \quad (4)$$

During the real-time localisation algorithm execution, $T^{C}_{i}$ as shown in Equation (1) is available once any $i$-th AprilTag is in the field of view of the UUV's camera. Therefore, the UUV's pose with respect to Frame $\{I\}$ can finally be computed in real-time as follows, using Equation (4):

$$T^{I}_{C} = T^{I}_{i} \left(T^{C}_{i}\right)^{-1} = \begin{bmatrix} R^{I}_{C} & p^{I}_{C} \\ 0_{1\times3} & 1 \end{bmatrix}, \quad (5)$$

where $\left(T^{C}_{i}\right)^{-1}$ takes the closed-form inverse given after Equation (3). From Equation (5), $R^{I}_{C}$ and $p^{I}_{C}$ can be obtained. Subsequently, quaternions and Euler angles of the UUV's orientation with respect to Frame $\{I\}$ can be computed from $R^{I}_{C}$ [34]. The position vector describing the origin of Frame $\{C\}$, $O_{C}$, with respect to Frame $\{I\}$ can be directly obtained from $p^{I}_{C}$.
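As an illustration of Equations (4) and (5), the minimal NumPy sketch below (function names are ours) chains the logged registers once into constant tag-to-world transformations and then localises the camera from a single live detection.

```python
# Minimal NumPy sketch (function names are ours) of Equations (4) and (5): the logged
# registers are chained once into constant tag-to-world transformations, and a single
# live detection T^C_i then yields the camera (UUV) pose in the world frame.
import numpy as np

def precompute_world_poses(registers):
    """registers[i] holds T^i_{i+1}; returns T_world[i] = T^I_i, with Tag 0 as Frame {I}."""
    T_world = [np.eye(4)]                      # T^I_0 = I_4, Equation (4) base case
    for T_i_ip1 in registers:
        T_world.append(T_world[-1] @ T_i_ip1)  # T^I_{i+1} = T^I_i T^i_{i+1}
    return T_world

def localise(T_world, tag_id, T_C_i):
    """Equation (5): T^I_C = T^I_i (T^C_i)^{-1}."""
    R, p = T_C_i[:3, :3], T_C_i[:3, 3]
    T_i_C = np.eye(4)                          # closed-form inverse of T^C_i
    T_i_C[:3, :3] = R.T
    T_i_C[:3, 3] = -R.T @ p
    return T_world[tag_id] @ T_i_C
```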
The detailed step-by-step process from AprilTags installation to localisation is presented above. Another plausible question from an implementation perspective is how to determine which AprilTag should be used when multiple ones are within the UUV camera's field of view. As AprilTag detection is heavily affected by the camera's distance from the AprilTag and its viewing angle [35,36], the norm distance $\lVert p^{C}_{i} \rVert$ and the viewing angle $\theta_{i}$ are used to select the most reliable AprilTag transformation $T^{C}_{i^{*}}$ among them, as shown below:

$$i^{*} = \arg\min_{i \in \mathcal{D}} \left( w_{1}\, \lVert p^{C}_{i} \rVert + w_{2}\, \lvert \theta_{i} \rvert \right), \quad (6)$$

where $\mathcal{D}$ is the set of AprilTag IDs currently detected in the camera's field of view, $\theta_{i}$ is the viewing angle between the camera's line of sight and the normal of the $i$-th AprilTag, and $w_{1}$ and $w_{2}$ are weighting factors. Finally, using Equations (5) and (6), the UUV localisation, i.e., the estimation of the UUV's pose with respect to Frame $\{I\}$, can be carried out solely using the UUV's onboard camera.
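The sketch below shows one way such a selection rule could be realised; the equal default weights and the way the viewing angle is computed are assumptions for illustration, not values taken from the paper.

```python
# Hedged sketch of the selection rule in Equation (6); the equal default weights and the
# way the viewing angle is computed are assumptions for illustration, not values taken
# from the paper.
import numpy as np

def select_tag(detections, w_dist=1.0, w_angle=1.0):
    """detections: list of (tag_id, T_C_i) pairs. Returns the pair with the lowest score."""
    def score(T):
        p = T[:3, 3]
        z_tag = T[:3, :3] @ np.array([0.0, 0.0, 1.0])     # tag normal in the camera frame
        cos_view = abs(z_tag @ p) / (np.linalg.norm(p) + 1e-9)
        theta = np.arccos(np.clip(cos_view, -1.0, 1.0))   # viewing angle to the line of sight
        return w_dist * np.linalg.norm(p) + w_angle * theta
    return min(detections, key=lambda d: score(d[1]))
```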
3.4. The Overall Deployment Procedure
The overall deployment procedure for the UUV localisation is illustrated in Figure 9. After installing the AprilTags in the desired poses as shown in Figure 10, $T^{i}_{i+1}$ as given in Equation (3) is extracted by moving the camera as demonstrated in Figure 8 for the $n-1$ registers, and the poses (position and quaternion) are subsequently stored in a YAML file. These extractions are performed in a ROS node and are live-updated in RViz (a ROS visualisation tool), as shown in Figure 11. Once the extraction and data-logging are completed, the relevant ROS node is shut down.
Using the previously generated YAML file, the extracted AprilTags poses with respect to Frame $\{I\}$, as given in Equation (4), are published by another ROS node. In the same ROS node, the UUV localisation is performed using Equation (5), and the UUV's pose is published for real-time visualisation in RViz.
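A minimal rclpy sketch of such a pose-publishing node is given below; the node name, YAML keys, and frame names are assumptions for illustration, not the authors' implementation.

```python
# Minimal rclpy sketch of the pose-publishing node; the node name, YAML keys, and frame
# names are assumptions for illustration, not the authors' implementation.
import rclpy
import yaml
from rclpy.node import Node
from geometry_msgs.msg import TransformStamped
from tf2_ros import StaticTransformBroadcaster

class TagPosePublisher(Node):
    def __init__(self, yaml_path='apriltag_registers.yaml'):
        super().__init__('apriltag_pose_publisher')
        self.broadcaster = StaticTransformBroadcaster(self)
        with open(yaml_path) as f:
            tags = yaml.safe_load(f)            # e.g., {'tag_1': {'position': [...], 'quaternion': [...]}}
        msgs = []
        for name, pose in tags.items():
            t = TransformStamped()
            t.header.stamp = self.get_clock().now().to_msg()
            t.header.frame_id = 'world'         # Frame {I}, coincident with Tag 0
            t.child_frame_id = name
            px, py, pz = (float(v) for v in pose['position'])
            qx, qy, qz, qw = (float(v) for v in pose['quaternion'])
            t.transform.translation.x, t.transform.translation.y, t.transform.translation.z = px, py, pz
            t.transform.rotation.x, t.transform.rotation.y, t.transform.rotation.z, t.transform.rotation.w = qx, qy, qz, qw
            msgs.append(t)
        self.broadcaster.sendTransform(msgs)    # static transforms are latched once

def main():
    rclpy.init()
    rclpy.spin(TagPosePublisher())

if __name__ == '__main__':
    main()
```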
As summarised above, the overall deployment procedure is specifically designed to be simple, cost-effective, and time-efficient. Among all deployment steps, installing AprilTags is the only one that is relatively time-consuming. However, compared with the conventional AprilTags localisation approach, the proposed method requires no physical measurements, resulting in significant time savings. Moreover, AprilTags extraction and data-logging take only about 2 min. After this step, localisation can be executed immediately. The following section describes the experimental setup in detail.
4. Experimental Setup
For the experiments, a single ZED 2i camera and 10 AprilTags, installed inside the laboratory workspace (dimensions reported in Table 2), are used. The AprilTags are installed on both horizontal and vertical surfaces, and under various lighting conditions, as shown in Figure 10. The printed AprilTags are matt-laminated, effectively minimising glare even under direct ceiling lighting.
The specifications of the camera, AprilTags, workspace, ROS 2, and laptop are reported in Table 2. The camera intrinsic parameters are provided by the software development kit (SDK) of the ZED 2i camera from Stereolabs (San Francisco, New York, USA and Paris, France). For AprilTag detection, the 'lib-dt-apriltags' library, a Python binding for the AprilTag 3 library developed by AprilRobotics, is called in the ROS nodes using Python 3.10.12. The dimensions of the workspace are obtained using the homogeneous transformation matrices. To gauge the relative position-estimation error, ground-truth measurements between the relative AprilTags were taken, as shown in Table 3.
It is important to note that the current experiment is not conducted underwater due to the limited access to a large underwater environment. However, the overall deployment procedure remains the same for the underwater deployment. Therefore, this research work provides the initial phase of the full deployment procedure and algorithm development towards the actual underwater deployment in aquaculture infrastructure in the near future.
5. Results and Discussion
The overall deployment procedure for the localisation is carried out as shown in Figure 9. As presented in Section 3, the results of the three main deployment steps, namely (1) AprilTags installation, (2) AprilTags extraction and data-logging, and (3) AprilTags pose publishing and localisation, are discussed in this section. The recorded videos of the experiments for AprilTags extraction and data-logging and for AprilTags pose publishing and localisation are available at this hyperlink: https://youtube.com/playlist?list=PLG3nO3TEqwOmjvLVynK54pUugt_U1ChFA&si=tKaJysXCAkjKN2rn (accessed on 18 November 2025).
5.1. AprilTags Installation
Due to the application-oriented nature of this work, the camera and AprilTag specifications reported in Table 2 play a crucial role in the AprilTags installation step. To accommodate future deployment in a large aquaculture environment, the tag family tag36h11 is chosen, as it features 587 tags with unique IDs (0–586). In this experiment, 10 tags (IDs: 0–9) are utilised, and each tag, whose size is reported in Table 2, is printed on standard A4 paper. Based on the preliminary test, it was found that AprilTag detection using a ZED 2i camera provides the estimated pose of an AprilTag of the aforementioned size up to a working distance of 3 m.
Although the ZED 2i camera provides stereo vision, it is treated as a monocular camera, and the visual feedback from the left lens is used for AprilTag detection. During the preliminary test of AprilTag detection, it was found that a higher FPS with lower resolution is more reliable than a lower FPS with higher resolution in the case of fast motion. Therefore, the ZED 2i camera is configured to provide 60 FPS at 720p resolution. As recalibration of the ZED 2i camera is not recommended by the manufacturer unless necessary, the default calibration parameters are acquired via its SDK. Note that, for the underwater deployment in the future, recalibration of the ZED 2i camera is essential.
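To make this detection configuration concrete, the sketch below feeds the left image to the 'lib-dt-apriltags' detector with the tag36h11 family; the intrinsics and tag size are placeholders standing in for the values read from the ZED 2i SDK and Table 2.

```python
# Sketch of the detection configuration described above using the 'lib-dt-apriltags'
# binding; the intrinsics and tag size below are placeholders standing in for the values
# read from the ZED 2i SDK and Table 2.
import cv2
import numpy as np
from dt_apriltags import Detector

detector = Detector(families='tag36h11')          # same tag family as used in this work

fx, fy, cx, cy = 700.0, 700.0, 640.0, 360.0       # placeholder left-lens intrinsics (720p)
TAG_SIZE = 0.16                                   # placeholder tag side length in metres

def detect_tags(left_bgr_frame):
    """Run AprilTag detection on the left image and return (tag_id, T^C_i) pairs."""
    gray = cv2.cvtColor(left_bgr_frame, cv2.COLOR_BGR2GRAY)
    results = []
    for det in detector.detect(gray, estimate_tag_pose=True,
                               camera_params=(fx, fy, cx, cy), tag_size=TAG_SIZE):
        T = np.eye(4)
        T[:3, :3] = det.pose_R                    # R^C_i
        T[:3, 3] = det.pose_t.reshape(3)          # p^C_i
        results.append((det.tag_id, T))
    return results
```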
After all the aforementioned preliminary tests, the 10 AprilTags are installed on horizontal and vertical surfaces in the laboratory workspace (dimensions in Table 2) under different lighting conditions, as shown in Figure 10. Compared to the conventional approach, the proposed method does not require any physical measurements of the relative poses of all 10 AprilTags, so the AprilTags installation time is substantially reduced. The only requirement is that each pair of AprilTags must be visible in the ZED 2i camera view so that Step (2), AprilTags extraction and data-logging, can be carried out properly, as shown in Figure 8.
5.2. AprilTags Extraction and Data-Logging
Once the AprilTags installation is completed, the AprilTags extraction and data-logging can be carried out by a single person carrying a laptop and a ZED 2i camera, following the procedure illustrated in Figure 8. The results of the extracted and data-logged poses of the AprilTags installed inside the laboratory workspace are reported in Figure 11. The recorded video of the real-time update of the extracted and data-logged poses (frames) of the AprilTags is available at this hyperlink: https://youtube.com/playlist?list=PLG3nO3TEqwOmjvLVynK54pUugt_U1ChFA&si=tKaJysXCAkjKN2rn (accessed on 18 November 2025).
During the AprilTags extraction test, it was found that AprilTag detection on the first available frame is adversely affected by motion blur and unstable image-processing conditions. Therefore, the initially developed automated extraction procedure in ROS is not used in this work, and the registration of each AprilTag pair is performed only after user confirmation is provided by pressing the 'Enter' key. After all 10 AprilTags have been registered, the data-logging process is software-automated to save all poses into a YAML file. The entire process of extracting and data-logging the poses of all AprilTags installed inside the laboratory workspace takes approximately 2 min to complete.
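For illustration, a minimal sketch of this data-logging step is given below; the file name, key layout, and the SciPy-based quaternion conversion are our assumptions, not the authors' node.

```python
# Minimal sketch of the data-logging step (file name, key layout, and the SciPy-based
# quaternion conversion are our assumptions): each register's rotation is converted to a
# quaternion, and all poses are written to a YAML file once every pair has been confirmed.
import yaml
from scipy.spatial.transform import Rotation

def log_registers(registers, path='apriltag_registers.yaml'):
    """registers[i] holds the 4x4 matrix T^i_{i+1}; poses are saved as position + quaternion."""
    data = {}
    for i, T in enumerate(registers):
        q = Rotation.from_matrix(T[:3, :3]).as_quat()   # (x, y, z, w)
        data[f'tag_{i}_to_{i + 1}'] = {
            'position': T[:3, 3].tolist(),
            'quaternion': q.tolist(),
        }
    with open(path, 'w') as f:
        yaml.safe_dump(data, f)

# During extraction, each registration is gated on user confirmation, e.g.:
#   input(f"Orient the camera to see Tags {i} and {i+1}, then press Enter...")
```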
The ground-truth poses of the 10 AprilTags installed inside the laboratory workspace are not readily available. However, to gauge the relative position-estimation error % as shown in Equation (7), ground-truth manual measurements of the position between Tag ID $i$ and Tag ID $i+1$ were taken and compared with the values obtained from the AprilTags extraction. The results are reported in Table 3. The results show that relative position estimation with centimetre-level accuracy (min. error: 0.03744 cm or min. error %: 0.01%, and max. error: 8.42708 cm or max. error %: 4.28%) can be achieved via the proposed AprilTags extraction. It is essential to note that inaccuracies in manual measurements are unavoidable, which is the primary reason why the chained transformation between Tag ID 9 and Tag ID 0 exhibits the largest error: its error is propagated through the subsequent transformations $T^{0}_{1}, T^{1}_{2}, \ldots, T^{8}_{9}$ obtained via AprilTags extraction when compared with the direct manual measurement between Tag ID 9 and Tag ID 0. However, the relative orientation (roll, pitch, yaw) cannot be measured manually without the help of an advanced and expensive 3D localisation system. Therefore, no quantitative orientation-estimation errors are reported; instead, Figure 10 and Figure 11 allow the AprilTag frames to be observed qualitatively.
$$\text{error \%} = \frac{\left| d_{\mathrm{ext}} - d_{\mathrm{meas}} \right|}{d_{\mathrm{meas}}} \times 100, \quad (7)$$

where $d_{\mathrm{ext}}$ is the position norm obtained from the translation part of $T^{i}_{i+1}$ resulting from the AprilTags extraction, and $d_{\mathrm{meas}}$ is the position norm resulting from the manual measurement.
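As a quick arithmetic check of Equation (7), the sketch below reproduces an error percentage from a pair of illustrative norms; the numbers are chosen to be consistent with the reported maximum error (about 8.43 cm at roughly 4.28% implies a measured baseline near 1.97 m) and are not values copied from Table 3.

```python
# Quick arithmetic check of Equation (7) with illustrative norms (not values copied from
# Table 3): an absolute difference of about 8.43 cm at roughly 4.28% implies a measured
# baseline near 1.97 m.
def relative_error_percent(d_ext, d_meas):
    """Equation (7): relative position-estimation error in percent."""
    return abs(d_ext - d_meas) / d_meas * 100.0

print(relative_error_percent(2.0533, 1.9690))   # ~4.28
```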
Note: For underwater deployment, the same procedure for AprilTags extraction and data-logging can be carried out via remotely controlled operation on a UUV equipped with a ROS-installed Ubuntu machine and a ZED 2i camera (or any underwater camera with calibrated intrinsic parameters).
5.3. AprilTags Pose Publishing and Localisation
As shown in Figure 9, provided that the poses of the AprilTags remain unchanged after their initial registration, the same YAML file can be used to publish the static homogeneous transformations among the AprilTags. The results of publishing the poses of the AprilTags installed inside the laboratory workspace, and of localisation using those AprilTags, are reported in Figure 12. The recorded video of the real-time localisation update using AprilTags is available at this hyperlink: https://youtube.com/playlist?list=PLG3nO3TEqwOmjvLVynK54pUugt_U1ChFA&si=tKaJysXCAkjKN2rn (accessed on 18 November 2025).
The full localisation path, with its start and end points, is illustrated in Figure 13. Noise is also observed in the AprilTag detection output because no filtering or noise-reduction procedures have been incorporated into the current implementation. Due to the lack of a ground-truth path (e.g., from a 3D motion-capture system), the localisation error cannot be quantified directly. However, it can be inferred from Table 3 that the proposed localisation system can provide centimetre-level accuracy. In addition, based on the recorded video with real-time localisation updates, it can be qualitatively concluded that the proposed localisation method performs well, with minimal noise and small detection errors.
In summary, three main factors are validated in this work. Firstly, since no physical measurements between the relative poses of the AprilTags are required, the installation can be completed with minimal setup time. Secondly, the proposed AprilTag-extraction algorithm requires the user to confirm each AprilTag pair by pressing the 'Enter' key only $n-1$ times for $n$ AprilTags; this second factor therefore complements the first. Thirdly, the matt-laminated AprilTags can be used for cost-effective localisation under different lighting conditions in the laboratory workspace. For underwater applications, the same deployment procedure shown in Figure 9 can be carried out using matt-laminated or transparent anti-fouling-coated AprilTags, along with a recalibrated underwater camera mounted on a UUV.