YORO (You Only Read Once): Automated Gauge Reading in Submarine Cabins via Head-Mounted Displays

Dong, Xiaoyun; Yin, Xuyue; Zhen, Lubo; Jiang, Canyue; Zhan, Shun; Gu, Qiwen

doi:10.3390/app16104854

by Xiaoyun Dong ¹,², Xuyue Yin ²,*, Lubo Zhen ³, Canyue Jiang ³, Shun Zhan ¹ and Qiwen Gu ²

¹ School of Mechatronics and Automation, Shanghai University, Shanghai 200444, China
² Shanghai Shipbuilding Technology Research Institute, Shanghai 200032, China
³ The Sino-British College, University of Shanghai for Science and Technology, Shanghai 200093, China
* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(10), 4854; https://doi.org/10.3390/app16104854

Submission received: 25 March 2026 / Revised: 24 April 2026 / Accepted: 2 May 2026 / Published: 13 May 2026

Abstract

Accurate interpretation of pointer-type gauges in submarine cabins is critical for operational safety but remains a laborious task due to confined spaces, disorganized visual backgrounds, and poor lighting conditions that contribute to crew eye fatigue. To address these challenges, this study presents an automated gauge reading approach that integrates a YOLOv11-based detection model with a dedicated value reading algorithm, deployed on an optical-see-through head-mounted display (HMD). The system first detects gauge regions of interest (ROIs) using a fine-tuned YOLOv11 model, followed by dial and pointer recognition via image processing techniques to compute measurement values, which are then overlaid on the HMD for operator confirmation and recording. Experimental evaluations conducted in a real submarine cabin environment demonstrate that the proposed YORO method significantly outperforms manual recording. Specifically, it reduces average task completion time by 92.5% (from 48.13 s to 3.58 s), decreases reading angular error by 77% (from 1.01° to 0.23°), and substantially lowers user workload, with a NASA-TLX score of 11.27 compared to 72.44 for the manual method (p < 0.001). These results validate the system’s effectiveness in enhancing efficiency, accuracy, and user experience. The proposed approach offers a practical framework for developing autonomous inspection systems in constrained industrial environments.

Keywords: augmented reality; extended reality; YOLOv11; gauge reading; visual computing

1. Introduction

Continuous and accurate monitoring of system parameters through instruments such as pressure gauges and flow meters is essential for assessing the quality, safety, and operational longevity of expensive industrial equipment such as armored vehicles, space stations, and ship cabins [1]. Unlike in open, well-illuminated environments such as factories or fields, gauge reading in confined equipment spaces poses severe ergonomic challenges, including motion sickness, narrow working space, and weak illumination. Gauge reading and recording in submarines typically involves all of these issues: a crew member must frequently crawl through narrow cabins to check and record every gauge in order to evaluate the equipment’s state. Furthermore, the inherent instability of a moving vessel, combined with the large number of distributed gauges, exacerbates crew fatigue, which may lead to confused readings or recording errors [2]. To address these issues, we review current human–machine interaction (HMI) approaches for inspection and visual computing techniques, which together motivate the YORO solution.

With respect to HMI, traditional industrial equipment inspection requires operators to complete detailed paper checklists, which is labor-intensive and error-prone [3]. With the rise of the human-centric Industry 5.0 paradigm, Augmented Reality (AR) technology has been extensively studied and applied to empower the workforce: by projecting computed visual information into the operator’s field of view, it improves both information delivery efficiency and ergonomic conditions [4,5]. Wearable, hands-free AR inspection approaches therefore offer a particularly promising solution for submarine cabin inspection, with its ergonomic and accessibility challenges [6,7].

With respect to accurate and fast visual computing, deep convolutional object detection models, such as Faster R-CNN, Mask R-CNN, and the YOLO family, have been successfully applied to a variety of industrial inspection tasks [8,9], from detecting manufacturing defects [10] to verifying complex assemblies [11]. The task of real-time monitoring of numerous small, similar gauges in a submarine cabin resembles other complex visual detection problems. Examples include locating tiny pins in aviation connectors [12] and identifying multiple texture-less cable brackets against plain backgrounds [13]. These precedents illustrate the difficulty of the problem and motivate a deep learning solution.

This study introduces YORO (You Only Read Once), a novel real-time framework designed specifically for gauge monitoring in constrained environments such as submarine cabins. The system enables operators to read and record multiple instrument values at a single glance of the scene via a Head-Mounted Display (HMD). In contrast to existing automated gauge-reading pipelines that rely solely on deep learning regressors or heavy instance segmentation, YORO proposes a lightweight, hybrid architecture that integrates state-of-the-art (SOTA) object detection with robust geometric computer vision techniques to ensure both speed and interpretability in resource-limited edge devices. In addition, YORO establishes a framework that bridges the gap between SOTA object detection and its immediate, spatially anchored visualization within the operator’s field of view. This facilitates a “perceive-and-record” workflow in a single glance, thereby mitigating the risk of operator distraction and situational awareness loss inherent in switching between the physical gauge and a handheld recording device.

The remainder of the paper is structured as follows: Section 2 reviews related work. Section 3 and Section 4 detail the system methodology and implementation, including gauge detection by the YOLOv11 model and gauge reading through range transformation. Section 5 presents experimental results and metrics, and Section 6 concludes the study. The workflow of the proposed method is illustrated in Figure 1.

Figure 1. The architecture of the proposed YORO system.

2. Related Work

2.1. Object Detection in Weak Lighting Conditions

A common and intuitive approach to low-light object detection is a two-stage pipeline: first, image enhancement is applied to the input image to improve its feature quality; then, the preprocessed image is fed into a pre-trained, off-the-shelf object detector.

This pipeline can leverage a wide array of Low-Light Image Enhancement (LLIE) methods. Traditional techniques include histogram-based methods like histogram equalization [14] and physics-based models such as the Retinex theory [15]. More recently, deep learning has become the dominant trend in LLIE, with learning-based methods demonstrating superior performance over these traditional approaches [16]. These learning-based methods often utilize architectures such as Generative Adversarial Networks, like EnlightenGAN, which can be trained on unpaired data [17], or encoder–decoder networks like GladNet [18] and the Two-Branch Exposure-Fusion Network [19]. Many frameworks also incorporate dedicated denoising modules, such as SUNet [20], as a critical preprocessing step to mitigate noise before or during the enhancement process.

Despite its conceptual simplicity, the sequential paradigm suffers from several critical limitations. First, prepending a separate enhancement network adds a significant computational burden, increasing latency and making the pipeline less suitable for real-time applications, especially on resource-constrained devices like HMDs. Second, a fundamental disconnect exists between what is optimal for human versus machine perception. LLIE methods are trained to produce results that are visually pleasing to humans, but this does not guarantee better performance for downstream tasks like object detection. Studies have shown that enhancement can sometimes harm detection by introducing misleading textures or amplifying noise [21,22]; what appears “better” to the human eye is not necessarily “better” for a detection algorithm. Finally, because the enhancement and detection modules are optimized independently, errors and artifacts introduced by the enhancement stage propagate to the detector, leading to a cascade of failures.

2.2. Gauge Value Reading by Computer Vision Techniques

Traditional automated gauge reading relies on a multi-step pipeline built on classical image processing and computer vision techniques. This pipeline is deterministic and geometry-driven, typically involving localizing the gauge, identifying its key components, and calculating a reading based on their geometric relationships [23].

The initial and most critical step is localizing the gauge and finding its center. A variety of methods have been employed for this task, including using the circular bezel of the gauge for circle-fitting via the Hough Transform, segmenting the gauge based on a specific background color using a region-growing scheme, or identifying the intersection point of extended lines from the scale marks [24]. Once the gauge is localized, the pointer is typically detected as a straight line, again often using the Hough Transform. The numerical values on the gauge face are extracted using Optical Character Recognition engines [25]. Finally, with the gauge center, pointer angle, and scale values known, the final reading is calculated using linear interpolation between the known scale mark values and their corresponding angles [26].

While these classical methods can be effective in highly controlled and predictable conditions, they are often brittle and lack robustness in real-world scenarios. Their performance is highly sensitive to variations in lighting, camera perspective, glare, and shadows [27]. Furthermore, they often require manual parameter tuning for different types of gauges, making them difficult to scale and generalize across the diverse range of instruments found in industrial settings. Their performance degrades significantly in the unstructured and variable environments characteristic of a submarine cabin [27].

2.3. Performance of Computer Vision Pipelines on AR HMD Devices

Deploying a visual computing algorithm on a portable edge computing device introduces critical performance constraints due to the limited computational and power resources. Successfully implementing a real-time computer vision model in this context requires balancing the fundamental trade-off between model accuracy, inference latency, and power consumption. High latency can disrupt the augmented reality experience and cause user discomfort, while excessive power consumption limits operational duration and may lead to thermal throttling. These factors are essential to the system’s usability and safety.

Several architectural paradigms address these computational challenges. A cloud- or edge-based approach leverages powerful models but introduces prohibitive network latency, rendering it unsuitable for disconnected environments like submarines [28]. Conversely, fully on-device inference on an HMD ensures low latency and network independence but is often computationally infeasible for state-of-the-art models. A hybrid architecture has thus emerged as a pragmatic compromise. This method pairs periodic, high-cost object detection with a lightweight, on-device tracker that provides high-frequency positional updates between detections, effectively mitigating latency and sustaining a fluid user experience.

To make on-device processing viable, models require rigorous optimization. This begins with selecting inherently efficient, mobile-optimized architectures, followed by model compression techniques like quantization (reducing numerical precision) and pruning (removing redundant parameters) to curtail computational and memory demands [29]. The standard deployment pipeline involves training a model, converting it to an efficient format like the Open Neural Network Exchange (ONNX) [30], and executing it via a hardware-accelerated runtime.

Empirical evidence underscores these constraints. On a Microsoft HoloLens 2 HMD, even highly optimized models like YOLOv5n achieve only 2–3 frames per second (300–475 ms inference) without hardware acceleration [31]. This performance is insufficient for interactive augmented reality, reinforcing the necessity of a holistic approach that integrates optimized models with intelligent processing pipelines.

3. Gauge Detection by YOLOv11

This section describes how a YOLOv11 object detection model is trained and used to detect target gauge ROIs in images.

3.1. Submarine Gauge Reading Dataset

Because the performance of any detection network is inherently constrained by the quality and representativeness of its training data, we captured and built a gauge inspection dataset on a real submarine. A total of 1050 photos were captured at a resolution of 1920 × 1080, covering gauges seen from various perspectives, including lateral and frontal views, as depicted in Figure 2. The dataset’s variety was increased by modifying the exposure parameters to replicate the low-light conditions of an actual submarine cabin. These images reflect the situations operators actually encounter in the submarine cabin, providing a representative training dataset for the neural network.

Figure 2. Sample images of the self-constructed dataset: (a) frontal view, (b,c) lateral view.

Furthermore, 406 images from publicly accessible online platforms [32], which contain a wide variety of gauges, were added to the dataset, as shown in Figure 3. This diversity broadens the scope of the dataset and enhances the robustness of the model as well as its generalization performance. The dataset includes diverse real-world variations in lighting conditions, camera perspectives, and pointer positions, which helps improve model generalizability. The target object resolution of the dataset is 640 × 640, which matches our self-captured dataset, and the images are cropped, pre-processed, and standardized following Roboflow’s typical pipeline.

A comprehensive image dataset was manually annotated to serve as the foundation for neural network training. This dataset was partitioned into training, validation, and testing subsets, following a standard 80%, 10%, and 10% distribution, respectively. This partitioning strategy ensures robust model development, mitigates overfitting through tuning on the validation set, and provides an unbiased evaluation of generalization performance on the unseen test set. The annotation process was carried out using a general image labeling tool, X-Anylabeling [33], that supports YOLO-compatible export formats. For each image, bounding boxes were drawn around the target objects, and the corresponding annotation files were generated accordingly.

Figure 3. Partial gauge images from the Roboflow dataset platform [32].

To enhance the model’s robustness and its ability to generalize to novel conditions, the initial training dataset was substantially expanded through data augmentation, as shown in Figure 4. The augmentation strategy targeted the specific challenges of the application environment by employing both photometric and geometric transformations. Photometric augmentations, including adjustments to brightness, exposure, and the addition of Gaussian noise, were applied to improve the model’s resilience to variations in lighting and image quality. Simultaneously, geometric transformations, such as rotation, shearing, and cropping, were utilized to strengthen the model’s invariance to changes in the scale, orientation, and position of the target gauges.

Figure 4. Sample images after data augmentation: (a) raw image; (b) brightness; (c) exposure; (d) Gaussian noise; (e) rotation; (f) shearing; (g) cropping.
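The photometric part of this augmentation strategy can be illustrated with a minimal, dependency-free sketch. It is not the pipeline used to produce Figure 4: the function names, the 8-bit grayscale nested-list image representation, and all parameter values are assumptions chosen for illustration. Geometric augmentations (rotation, shearing, cropping) are omitted because they would also require transforming the bounding-box annotations.

```python
import random

def clamp(value, low=0, high=255):
    """Clamp a pixel value to the valid 8-bit range."""
    return max(low, min(high, value))

def adjust_brightness(image, delta):
    """Add a constant offset to every pixel."""
    return [[clamp(p + delta) for p in row] for row in image]

def adjust_exposure(image, gain):
    """Scale every pixel by a gain factor; gain < 1 simulates under-exposure."""
    return [[clamp(int(round(p * gain))) for p in row] for row in image]

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    return [[clamp(int(round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]

# Tiny hypothetical 2 x 2 grayscale image.
image = [[100, 150], [200, 250]]
darker = adjust_exposure(image, 0.5)       # simulated low-light capture
brighter = adjust_brightness(image, 40)    # values above 255 are clamped
noisy = add_gaussian_noise(image, sigma=5.0)
```

In practice, an image library would apply the same operations to full-resolution color frames; the sketch only makes the per-pixel arithmetic explicit.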

3.2. Gauge Detection Model Training

The training process was designed to fine-tune the YOLOv11 model specifically for the task of gauge detection, leveraging established best practices to maximize performance and efficiency.

The selection of an appropriate object detection architecture is a careful trade-off among accuracy, inference speed, and deployment complexity. Compared with the preceding version, YOLOv11 implements several key modifications: it replaces the C2f (Cross-Stage Partial bottleneck with 2 convolutions) module in the neck with a C3k2 (Cross-Stage Partial with kernel size 2) block, yielding improvements in both speed and overall performance. Additionally, it integrates the C2PSA (Cross-Stage Partial with Spatial Attention) module into the attention mechanism [34], facilitating the accurate detection of smaller or partially obscured objects [35]. Together, these notable improvements enable YOLOv11 to focus on gauge-related features when facing challenges of unstable visual background and environmental impediments [36]. Figure 5 illustrates the architecture of YOLOv11.

Figure 5. Architecture of YOLOv11.

We fine-tuned a lightweight YOLOv11s backbone initialized from COCO-pretrained weights to accelerate convergence and improve generalization. The model was trained using a 640 × 640 input resolution, a batch size of 16, and the framework’s auto optimizer setting. The training process demonstrated consistent learning, as evidenced by the general downward trend of the validation box loss over the epochs, which indicates the model grew progressively better at localizing objects. Figure 6 illustrates its performance.

Figure 6. Validation box loss plotted against training epochs.

As illustrated by the key performance metrics in Figure 7, the coarse detection metrics saturated early: mAP50 (mean Average Precision at an Intersection over Union (IoU) threshold of 0.5) reached 0.995 by epoch 50, and precision peaked near 0.998 around epoch 53. The stricter mAP50–95 metric (mAP averaged over IoU thresholds from 0.50 to 0.95) continued to show marginal but consistent improvement, attaining its maximum at epoch 95. Notably, mAP50–95 first came within 0.005 (absolute) of its final best value at epoch 57, indicating that the gains in localization accuracy between epochs 57 and 95 were small but persistent.

Figure 7. Key performance metrics across epochs.

To optimize the model’s fine-grained localization accuracy, we extended training to 95 epochs, the point at which the validation mAP50–95 metric peaked. Although the later epochs yielded diminishing returns, this procedure selects the checkpoint with the best validation performance, which justifies the additional computational cost [37,38]. Figure 8 shows the detection results of the YOLOv11 model.

Figure 8. Visualization of the model’s detection results.
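The checkpoint selection logic described in this subsection (pick the epoch with the highest validation mAP50–95, and note when the metric first comes within a tolerance of that best value) can be sketched as follows. The function names and the per-epoch history are hypothetical; the numbers are placeholders and are not the values plotted in Figure 7.

```python
def best_checkpoint(history, key="map50_95"):
    """Return (epoch, value) with the highest validation metric.

    Ties are resolved in favor of the earliest epoch.
    """
    best_epoch, best_value = None, float("-inf")
    for epoch, metrics in sorted(history.items()):
        if metrics[key] > best_value:
            best_epoch, best_value = epoch, metrics[key]
    return best_epoch, best_value

def first_epoch_within(history, tolerance, key="map50_95"):
    """Return the first epoch whose metric is within `tolerance` of the best."""
    _, best_value = best_checkpoint(history, key)
    for epoch, metrics in sorted(history.items()):
        if best_value - metrics[key] <= tolerance:
            return epoch
    return None

# Placeholder validation history (illustrative values only).
history = {
    1: {"map50_95": 0.620},
    2: {"map50_95": 0.710},
    3: {"map50_95": 0.746},
    4: {"map50_95": 0.750},
    5: {"map50_95": 0.748},
}
chosen_epoch, chosen_value = best_checkpoint(history)
near_best_epoch = first_epoch_within(history, tolerance=0.005)
```

The gap between `near_best_epoch` and `chosen_epoch` gives a rough sense of how much extra training was spent on the last increment of localization accuracy.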

3.3. Gauge ROI Outputs Filter

Upon successful training, the YOLOv11s model functions as a high-speed gauge detector. The final step in this stage is to filter the model’s predictions to extract the ROI for the second-stage analysis. For each input frame, the model outputs a set of predictions, each consisting of a bounding box, a confidence score, and a class label.

These outputs are processed by first filtering detections that fall into the “gauge” category. Lightweight post-processing rules tailored to gauge geometry were applied to model outputs to reduce false detection and to prioritize high-quality detections for downstream reading modules. Specifically, these procedures are illustrated in Figure 9.

Figure 9. Gauge candidate filtering process.
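A minimal sketch of such a filtering stage is given below. The exact rules are those of Figure 9 and are not reproduced here; the `Detection` structure, the confidence threshold, the aspect-ratio limit, and the minimum box size are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float
    label: str

def filter_gauge_candidates(detections, min_conf=0.5, max_aspect=1.3, min_side=32):
    """Keep confident, roughly square 'gauge' boxes, best first.

    All thresholds are hypothetical defaults, not values from the paper.
    """
    kept = []
    for det in detections:
        width, height = det.x2 - det.x1, det.y2 - det.y1
        if det.label != "gauge" or det.confidence < min_conf:
            continue  # wrong class or low confidence
        if min(width, height) < min_side:
            continue  # too small (or degenerate) to read reliably
        if max(width, height) / min(width, height) > max_aspect:
            continue  # elongated box: unlikely to be a circular dial
        kept.append(det)
    return sorted(kept, key=lambda d: d.confidence, reverse=True)

detections = [
    Detection(10, 10, 110, 112, 0.93, "gauge"),   # kept
    Detection(200, 50, 400, 110, 0.88, "gauge"),  # rejected: aspect ratio
    Detection(300, 300, 380, 380, 0.31, "gauge"), # rejected: confidence
    Detection(500, 20, 620, 130, 0.97, "gauge"),  # kept, ranked first
]
candidates = filter_gauge_candidates(detections)
```

Sorting by confidence lets the downstream reading stage process the most reliable ROIs first when the per-frame time budget is limited.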

Our hybrid approach combines deep learning and classical computer vision to optimize for data and computational efficiency. Unlike an end-to-end model that requires a prohibitively expensive and time-consuming dataset with fine-grained keypoint annotations, our system first uses a YOLOv11 model for robust localization with simple bounding boxes. It then extracts ROI and applies traditional, training-free geometric algorithms for precise measurement, effectively leveraging the strengths of both techniques.

4. Gauge Value Reading

Following the successful detection of gauge ROIs by the YOLOv11s model, a multi-stage pipeline is employed to read each instrument precisely. This section describes the entire workflow, including dial area detection, pointer (pinline) detection, pointer value calculation, and result visualization with post-processing.

4.1. Dial Area Detection

The initial bounding boxes provided by the YOLOv11 detector are filtered according to their aspect ratio, since most gauges are approximately circular or square. To achieve the precision required for the subsequent analysis, the circular boundary of the dial must be precisely identified within this ROI, which is achieved through a multi-phase application of the Hough Circle Transform.

First, as illustrated in Figure 10, the ROI undergoes a series of preprocessing steps to enhance features critical for circle detection. The image is converted to grayscale, and CLAHE (Contrast Limited Adaptive Histogram Equalization) is applied. CLAHE is particularly effective in the variable and often low-light conditions of a submarine cabin, as it improves local contrast and reveals edges that might otherwise be obscured. Subsequently, Median and Gaussian filtering are employed to diminish image noise, decreasing false circle detections.

Figure 10. The workflow of the preprocessing: (a) Grayscale; (b) CLAHE; (c) Median filtering; (d) Gaussian filtering.

Next, our algorithm generates a comprehensive library of candidate circles by running the transform over multiple parameter settings, a paradigm common in randomized approaches [39]. Edges near the image boundaries are discarded, and the image dimensions are used to exclude abnormally sized circles [40]. Each candidate is then subjected to a rigorous validation process based on an “edge energy” evaluation [41]. For a given candidate circle, points are sampled along its circumference, and the gradient magnitude at these points is computed. The mean of these gradient magnitudes serves as the edge energy score [41]. A high score indicates that the candidate circle is well aligned with strong edges, a fundamental attribute of a physical gauge dial. After duplicates are eliminated, the candidate with the highest edge energy score is designated as the definitive dial area. This validation approach ensures that the detected circle corresponds to the actual gauge boundary rather than other circular artifacts in the background [40,41].
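The edge energy score can be sketched in a few lines of dependency-free Python. This is an illustrative reading of the description above, not the authors’ implementation: the central-difference gradient, the number of samples, and the synthetic test disc are assumptions.

```python
import math

def gradient_magnitude(image, x, y):
    """Central-difference gradient magnitude at integer pixel (x, y)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return math.hypot(gx, gy)

def edge_energy(image, cx, cy, radius, samples=72):
    """Mean gradient magnitude sampled along a candidate circle."""
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(samples):
        angle = 2.0 * math.pi * i / samples
        x = int(round(cx + radius * math.cos(angle)))
        y = int(round(cy + radius * math.sin(angle)))
        # Skip samples whose neighbors would fall outside the image.
        if 1 <= x < width - 1 and 1 <= y < height - 1:
            total += gradient_magnitude(image, x, y)
            count += 1
    return total / count if count else 0.0

def make_disc(size, cx, cy, radius, inside=200, outside=30):
    """Synthetic bright disc on a dark background (stand-in for a dial)."""
    return [[inside if math.hypot(x - cx, y - cy) <= radius else outside
             for x in range(size)] for y in range(size)]

image = make_disc(64, 32, 32, 20)
# Candidate circles as (cx, cy, radius): true dial, too small, off-center.
candidates = [(32, 32, 20), (32, 32, 12), (40, 30, 20)]
best = max(candidates, key=lambda c: edge_energy(image, *c))
```

The undersized candidate lies entirely inside the uniform disc and therefore scores zero, while the off-center candidate crosses the true boundary at only a few points, so the correctly aligned circle wins.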

4.2. Pinline Detection

Once the dial area is localized, the next step is to extract the pointer. Gauge face designs vary: some feature dark pointers on a light background, while others have light pointers on a dark background. Our approach therefore includes a dial flip assist function, as shown in Figure 11. The mean pixel intensity within the dial region is examined first. If the mean brightness falls below a predefined threshold, the system classifies the dial as dark-faced and performs a color inversion on the ROI. Because inversion turns a light pointer on a dark face into a dark pointer on a light face, this ensures that in the subsequent processing steps, the pointer is consistently darker than its immediate background.
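The dial flip step amounts to a mean-intensity test followed by an optional inversion, which gives every dial the same pointer/background polarity. A minimal sketch follows; the threshold value of 100 and the use of the whole ROI instead of a circular dial mask are simplifying assumptions.

```python
def mean_intensity(pixels):
    """Mean of all pixel values in a grayscale ROI (nested lists)."""
    values = [p for row in pixels for p in row]
    return sum(values) / len(values)

def normalize_polarity(pixels, dark_threshold=100):
    """Invert dark-faced dials; return (image, was_flipped).

    Inversion swaps which of pointer and face is brighter, so all dials
    reach the later stages with the same polarity.
    """
    if mean_intensity(pixels) < dark_threshold:
        return [[255 - p for p in row] for row in pixels], True
    return pixels, False

dark_faced = [[20, 30], [25, 240]]    # mostly dark face, one bright pointer pixel
light_faced = [[220, 230], [225, 10]] # mostly light face, one dark pointer pixel
flipped, was_flipped = normalize_polarity(dark_faced)
unchanged, was_flipped_light = normalize_polarity(light_faced)
```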

As shown in Figure 12, the normalized ROI is then converted to a binary image using an adaptive threshold [42,43]. To further refine this binary representation, morphological operations, specifically opening or closing, are applied [43]. These operations effectively remove noise pixels and connect any minor breaks in the pointer’s structure that may have resulted from binarization. To focus the search, a mask is applied to exclude the central pivot area and the outer bezel of the dial.

Figure 11. The schematic diagram of dial flip: (a) raw dark-faced dial; (b) flipped dark-faced dial.

Figure 12. Images in process: (a,d) grayscale; (b,e) binarization; (c) opening; (f) closing.

A Progressive Probabilistic Hough Transform is then applied to detect all potential line segments within the masked binary image. The candidate lines are ranked by a scoring function based on two criteria: (1) line length relative to the dial radius, and (2) proximity of the line to the dial center. The line with the highest score is selected as the most probable pointer. This scoring mechanism effectively filters out spurious edges caused by dial markings or reflections.
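One possible form of such a ranking function is sketched below. The two criteria come from the text, but the way they are normalized and combined (equal weights, linear terms) is an assumption, as are the function names and the example segments.

```python
import math

def point_to_segment_distance(px, py, x1, y1, x2, y2):
    """Shortest distance from point (px, py) to segment (x1, y1)-(x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - x1, py - y1)
    # Projection parameter clamped to the segment.
    t = max(0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / length_sq))
    return math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))

def pointer_score(segment, center, radius, w_length=0.5, w_center=0.5):
    """Score a segment by relative length and closeness to the dial center."""
    x1, y1, x2, y2 = segment
    length_term = min(math.hypot(x2 - x1, y2 - y1) / radius, 1.0)
    distance = point_to_segment_distance(center[0], center[1], x1, y1, x2, y2)
    center_term = max(0.0, 1.0 - distance / radius)
    return w_length * length_term + w_center * center_term

def select_pointer(segments, center, radius):
    """Return the highest-scoring segment."""
    return max(segments, key=lambda s: pointer_score(s, center, radius))

center, radius = (100, 100), 80
segments = [
    (170, 100, 178, 100),  # short scale mark near the rim
    (108, 95, 160, 60),    # long segment starting near the pivot
    (40, 150, 90, 160),    # reflection far from the center
]
pointer = select_pointer(segments, center, radius)
```

A short scale mark scores poorly on both terms, and a long reflection far from the pivot is penalized by the proximity term, so the segment that starts near the center and spans most of the radius is selected.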

4.3. Pinline Value Calculation

The detected pointer line segment is translated into a numerical gauge reading based on its geometric orientation. The first step is to determine the pointer’s tip, defined as the endpoint of the line segment farthest from the dial’s center. A vector is then formed from the center of the dial to this tip.

The angle of this vector is calculated using the two-argument arctangent function. This function can correctly resolve the angle into the proper quadrant, returning an unambiguous value in the range from −180° to 180°. This angle is then normalized to a standard range 0° to 360° for consistent mapping.
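The tip selection and angle computation can be sketched as follows. The sketch assumes image coordinates with the y-axis pointing downward, so the resulting angle increases clockwise on screen starting from the positive x-axis; the function names are illustrative.

```python
import math

def pointer_tip(segment, center):
    """Return the segment endpoint farthest from the dial center."""
    x1, y1, x2, y2 = segment
    d1 = math.hypot(x1 - center[0], y1 - center[1])
    d2 = math.hypot(x2 - center[0], y2 - center[1])
    return (x1, y1) if d1 >= d2 else (x2, y2)

def pointer_angle_deg(segment, center):
    """Angle of the center-to-tip vector, normalized to [0, 360)."""
    tip_x, tip_y = pointer_tip(segment, center)
    # atan2 resolves the quadrant and returns a value in (-180, 180] degrees.
    angle = math.degrees(math.atan2(tip_y - center[1], tip_x - center[0]))
    return angle % 360.0

center = (100, 100)
right = pointer_angle_deg((100, 100, 150, 100), center)  # tip to the right
up = pointer_angle_deg((100, 100, 100, 50), center)      # tip toward image top
down = pointer_angle_deg((100, 150, 100, 105), center)   # tip toward image bottom
```

With the y-axis pointing down, a pointer aimed at the top of the image yields 270° rather than 90°; the calibration angles in the mapping step must use the same convention.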

Finally, a linear mapping function transforms the calculated angle into the corresponding physical unit of the gauge. This transformation requires a one-time pre-calibration for each type of gauge to determine four parameters: the angle corresponding to the minimum scale value θ_{min}, the angle for the maximum scale value θ_{max}, the minimum scale value itself V_{min}, and the maximum scale value V_{max}. The final gauge reading V is then calculated from the recognized pointer angle θ using the linear transformation in Equation (1).

V = \frac{θ - θ_{min}}{θ_{max} - θ_{min}} \times (V_{max} - V_{min}) + V_{min}

(1)

Additionally, outlier filtering was implemented to ensure that the estimated values remained within the physically permissible range of the gauge. The pinline value calculation and visualization is presented in Figure 13a.

Figure 13. Visualization results of the gauge reader. (a) Algorithm run on a PC. (b) Algorithm run on a HoloLens 2 HMD.
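Equation (1) and the range check can be combined in a short sketch. Two points are assumptions beyond the text: angles are taken modulo 360° so that a scale arc crossing the 0°/360° boundary is handled, and readings that fall into the dead zone between the maximum and minimum marks by more than a small tolerance are rejected (returned as `None`). The calibration values below describe a hypothetical gauge.

```python
def angle_to_value(theta, theta_min, theta_max, v_min, v_max, tolerance_deg=2.0):
    """Map a pointer angle to a gauge value via Equation (1).

    Angles are in degrees and assumed to increase in the direction of
    increasing scale value. Returns None for angles outside the scale arc.
    """
    sweep = (theta_max - theta_min) % 360.0
    if sweep == 0:
        raise ValueError("theta_min and theta_max must differ")
    offset = (theta - theta_min) % 360.0
    if offset > sweep:
        past_max = offset - sweep     # degrees beyond the maximum mark
        before_min = 360.0 - offset   # degrees short of the minimum mark
        if min(past_max, before_min) > tolerance_deg:
            return None               # outlier: pointer in the dead zone
        offset = sweep if past_max <= before_min else 0.0
    return v_min + offset / sweep * (v_max - v_min)

# Hypothetical calibration: scale from 135 deg to 45 deg (270 deg sweep), 0 to 1.6.
calib = dict(theta_min=135.0, theta_max=45.0, v_min=0.0, v_max=1.6)
mid_scale = angle_to_value(270.0, **calib)  # pointer straight up
at_min = angle_to_value(134.0, **calib)     # 1 deg below the minimum mark
rejected = angle_to_value(90.0, **calib)    # pointer straight down
```

When the scale arc does not cross 0°/360° and the pointer lies within the arc, the function reduces exactly to Equation (1).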

4.4. Result Visualization and Post Processing

For practical application and operator verification, the system generates a visualized output by overlaying key information onto the original video feed, helping the crew inspector verify that all gauges have been read automatically. This augmented view includes the initial YOLOv11 bounding box, the precisely located circular dial and its center point, the identified pointer line, and the final calculated numerical value rendered as text above the gauge. This provides immediate and intuitive feedback to the submarine crew, as displayed in Figure 13b.

When processing continuous video streams, raw readings can exhibit minor fluctuations due to subtle frame-to-frame variations in lighting or camera angle. To ensure a stable and reliable output, a temporal post-processing step is implemented [44]. A moving median filter can effectively smooth out transient noise and outliers, presenting a stable, coherent reading to the operator, which is crucial for both usability and data reading accuracy. In our implementation, based on a frame rate of 25 fps, we set the window size to 7 frames, which provides a good balance between temporal smoothing and responsiveness.
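A moving median of this kind can be sketched with the Python standard library. The class name and the sample readings are illustrative; only the window size of 7 frames comes from the text. At 25 fps, a 7-frame window spans 7/25 = 0.28 s.

```python
from collections import deque
from statistics import median

class MovingMedian:
    """Median of the most recent `window` readings."""

    def __init__(self, window=7):
        self.values = deque(maxlen=window)  # oldest reading drops out automatically

    def update(self, reading):
        self.values.append(reading)
        return median(self.values)

# Hypothetical per-frame readings with one transient outlier (1.45).
raw = [0.80, 0.81, 0.80, 1.45, 0.80, 0.79, 0.80]
smoother = MovingMedian(window=7)
smoothed = [smoother.update(r) for r in raw]
```

Unlike a moving average, the median ignores a single spurious frame entirely instead of spreading its error over the whole window.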

5. Results and Analysis

5.1. System Prototype

The system prototype of the proposed YORO is shown in Figure 14. The system was designed with an edge-computing architecture to ensure low latency and operational independence. Each edge terminal was connected to a central control and logging station via a high-speed local network.

The core software framework comprises two integrated modules. The gauge detection module uses the YOLOv11 object detection model, optimized for deployment on edge devices, to robustly identify and localize various analog gauge types against the cluttered submarine background. The gauge reading module then processes the cropped gauge image, employing techniques including the Hough line transform for precise pointer angle calculation and a predefined scaling function to compute the final measurement value. All processed data, including raw images, detection results, and calculated readings, are timestamped and stored locally on the edge device, with key results transmitted to the central station.

Figure 14. System prototype of YORO.

5.2. Gauge Detection Metrics

The gauge detector is evaluated using recall, precision, and F1 score. These metrics are calculated from the numbers of true positives (TP), which are correct detections of a ground-truth instance; false positives (FP), which are incorrect detections; and false negatives (FN), which are missed ground-truth instances [45].

Recall R and precision P are defined in Equations (2) and (3), respectively:

R = \frac{TP}{TP + FN}

(2)

P = \frac{TP}{TP + FP}

(3)

The F1 score is the harmonic mean of precision and recall as in Equation (4):

F1 = \frac{2 P R}{P + R}

(4)
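As a brief worked example of Equations (2) to (4), the following sketch computes the three metrics from raw counts. The counts are hypothetical and are not taken from the experiments; the zero-denominator guards are an added convention.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 90 correct detections, 10 false alarms, 30 misses.
precision, recall, f1 = detection_metrics(tp=90, fp=10, fn=30)
```

Here precision is 90/100 = 0.90 and recall is 90/120 = 0.75; their harmonic mean (about 0.818) lies closer to the lower of the two values, which is why F1 penalizes an imbalance between them.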

As depicted in Figure 15, the area under the Precision–Recall curve is 0.985. The curve shows that the detector maintains very high precision up to a recall of approximately 0.995; beyond this point, precision drops sharply, first to ≈0.83 and then to ≈0.28, before falling to 0 at full recall. This behavior indicates that the model produces highly reliable high-confidence detections, while lowering the confidence threshold to recover the last few missed ground-truth instances introduces a disproportionate number of false positives.

The F1-Confidence curve is presented in Figure 15, which plots the F1 score against varying confidence thresholds. The curve demonstrates strong performance, achieving a maximum F1 score of 0.98. This peak performance, indicating the best balance between precision and recall, is reached at an optimal confidence threshold of 0.792.

Figure 15. (a) Precision–Recall Curve; (b) F1-Confidence Curve.

5.3. Gauge Reading Performance

A quantitative evaluation of the proposed gauge reading algorithm was conducted to validate its performance and robustness under different operational scenarios. The assessment focused on two key performance metrics: Reading Success Rate (RSR), which quantifies the proportion of all gauges in a scene that are successfully detected and read by the algorithm, and Average Inference Time (AIT), which quantifies processing speed. The AIT is a particularly critical indicator for ensuring low latency in the target HMD application.

To assess robustness under varying illumination, the YORO system was evaluated on 50 randomly sampled images for each of two representative lighting conditions: low light and normal light. In a minority of cases, severe specular reflection, extreme pointer occlusion, or very low-contrast pointer–dial combinations from certain camera perspectives caused YORO to miss the correct gauge ROI or to calculate an incorrect pinline value. As shown in Figure 16 and Table 1, YORO achieved an RSR of 92.7% in low light and 96.8% in normal light, confirming its high detection and reading capability in both environments.

Table 1. Reading Success Rate (RSR) and Average Inference Time (AIT) of the YORO system under low and normal lighting conditions.

Lighting Condition	RSR (%)	AIT (ms)
Low light	92.7	83.3
Normal light	96.8	85.4

In terms of efficiency, YORO demonstrated strong real-time capabilities. The AIT was measured at 83.3 ms in low-light conditions and 85.4 ms in normal-light conditions. These results demonstrate YORO’s high operational efficiency and minimal processing overhead, ensuring a responsive interface and empowering the user with the capability for real-time gauge interaction and data acquisition.

Figure 16. Gauge reading performance under different lighting conditions: (a) normal light; (b) low light.

5.4. Application Performance

The performance of the proposed system is evaluated primarily from an application-centric perspective, focusing on its efficacy in creating a seamless, interactive, and automated inspection workflow. Accordingly, experiments were carried out in a real submarine environment that presents the challenges of an actual cabin. The key performance metrics are defined by the system’s real-time capability, interactive alignment verification, and efficiency in data management. Table 2 summarizes the measured metrics.

Table 2. Summary statistics for task efficiency, reading accuracy, and NASA-TLX workload (MR: manual recording; YAR: YORO-aided recording).

Metric	MR Mean (SD)	MR 95% CI	YAR Mean (SD)	YAR 95% CI	p-Value
Task Efficiency
Task Completion Time (s)	48.13 (4.90)	[45.42, 50.85]	3.58 (0.62)	[3.23, 3.93]	<0.001
Reading Accuracy
Angular Error (°)	1.01 (0.19)	[0.91, 1.12]	0.23 (0.07)	[0.19, 0.27]	<0.001
NASA-TLX Workload
Mental Demand	69.27 (9.71)	[63.89, 74.65]	5.73 (4.11)	[3.46, 8.01]	<0.001
Physical Demand	80.33 (5.95)	[77.04, 83.63]	47.27 (3.56)	[45.30, 49.24]	<0.001
Temporal Demand	91.20 (2.24)	[89.96, 92.44]	0.00 (0.00)	[0.00, 0.00]	<0.001
Performance	2.87 (2.80)	[1.32, 4.42]	0.27 (0.59)	[−0.06, 0.60]	<0.003
Effort	93.40 (6.22)	[89.96, 96.84]	14.33 (5.09)	[11.51, 17.15]	<0.001
Frustration	97.60 (1.96)	[96.52, 98.68]	0.00 (0.00)	[0.00, 0.00]	<0.001
Overall Workload	72.44 (2.44)	[71.09, 73.79]	11.27 (1.06)	[10.68, 11.85]	<0.001
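The confidence intervals in Table 2 appear consistent with a Student-t interval over the fifteen participants of the study. The text does not state how the intervals were computed, so the sketch below is an assumption; it reproduces the tabulated bounds only to within rounding of the published means and standard deviations. The function name `ci95` is illustrative.

```python
import math

# Two-sided 95% critical value of Student's t for 14 degrees of freedom (n = 15).
T_CRIT_DF14 = 2.145


def ci95(mean: float, sd: float, n: int = 15, t_crit: float = T_CRIT_DF14):
    """Return (lower, upper) bounds of mean +/- t * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Mean and SD values taken from Table 2.
rows = {
    "MR task completion time (s)": (48.13, 4.90),
    "MR angular error (deg)": (1.01, 0.19),
    "MR overall workload": (72.44, 2.44),
}
for name, (m, sd) in rows.items():
    low, high = ci95(m, sd)
    print(f"{name}: [{low:.2f}, {high:.2f}]")
```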

5.4.1. Experimental Setup

The experiments were conducted within the cramped confines of an authentic submarine cabin measuring approximately 2 × 1.6 × 1 m. As illustrated in Figure 17, a gauge panel containing 16 distinct industrial instruments was used as the target. The experimental submarine environment typically features a narrow operational space and dim lighting conditions. This setup ensured that the evaluation accounted for both spatial constraints and poor visibility, factors that typically hinder human visual performance.

Figure 17. Target gauge panel in the submarine cabin.

5.4.2. Participants and Task Design

Fifteen participants with backgrounds in engineering were recruited for the system evaluation experiment. The evaluation followed a within-subjects design, where each participant performed two tasks in a randomized order:

Manual Recording (MR): Participants identified and recorded the readings of the 16 gauges using only their eyes and a physical logbook under the dim lighting conditions.

YORO-Aided Recording (YAR): Participants used the HMD device equipped with YORO to automatically detect and digitize the gauge readings. Figure 18 shows both the external posture of a tester wearing the HMD and the actual interface seen through the HoloLens 2, in which the YORO overlay blends with the real scene.

Figure 18. YORO gauge reading experiment: (a) third-person perspective of experiment; (b) first-person perspective of experiment.

To minimize experimental bias, the order in which participants performed the manual and system-aided tasks was counterbalanced. All participants were asked to complete the National Aeronautics and Space Administration Task Load Index (NASA-TLX) [46] questionnaire following the experiment, and the results were used to assess the participants’ perceived workload.

5.4.3. Task Efficiency Analysis

When deployed on the optical-see-through HMD, the system is responsive enough to support a natural user experience. The entire process (capturing the gauge image, detecting the dial, recognizing the pointer, computing the value, and rendering the visualization) is completed with minimal latency. In the manual recording (MR) mode, participants took an average of 48.13 s to complete the recording. This was largely due to the challenge of focusing on small dials in low light and the physical strain of moving within the constrained space. In contrast, the YORO-aided recording (YAR) method achieved an average completion time of 3.58 s, a 92.5% reduction in task completion time. The experimental results are visualized in Figure 19.
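The reported reduction follows directly from the two mean completion times. A minimal sketch of the arithmetic, with an illustrative helper name, is given below; computed from the rounded means it gives about 92.6%, which matches the reported 92.5% to within rounding.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - improved) / baseline


MR_TIME_S = 48.13   # mean manual recording time
YAR_TIME_S = 3.58   # mean YORO-aided recording time

reduction = relative_reduction(MR_TIME_S, YAR_TIME_S)
speedup = MR_TIME_S / YAR_TIME_S
print(f"time reduction = {reduction:.1f}%, speed-up = {speedup:.1f}x")
```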

Figure 19. Average task completion time of MR and YAR.

5.4.4. Reading Accuracy Analysis

Because each gauge has a different value range and unit, the average angular error of the pointer was used to represent the reading error, providing an indirect but comparable measure of reading accuracy. For manual recording (MR), the average angular error across the 16 gauges was 1.01°. Reading errors occurred frequently, primarily due to difficulty in discerning gauge dials under low-contrast conditions or eye strain caused by prolonged exposure to poorly illuminated work environments. In contrast, as shown in Figure 20, YORO-aided recording (YAR) achieved a low average error of 0.23°, a 77% reduction compared to the MR mode.
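Expressing errors as angles makes gauges with different ranges and units comparable. The sketch below assumes a linear scale described by the calibration parameters θ_min, θ_max, V_min, and V_max; the class name, the two example gauges, and the sample readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GaugeScale:
    """Linear dial scale: angles in degrees, values in the gauge's own unit."""
    theta_min: float  # pointer angle at v_min
    theta_max: float  # pointer angle at v_max
    v_min: float
    v_max: float

    def value_to_angle(self, value: float) -> float:
        """Map a reading to the pointer angle that would display it."""
        fraction = (value - self.v_min) / (self.v_max - self.v_min)
        return self.theta_min + fraction * (self.theta_max - self.theta_min)


def angular_error_deg(scale: GaugeScale, recorded: float, reference: float) -> float:
    """Absolute pointer-angle difference between a recorded and a reference value."""
    return abs(scale.value_to_angle(recorded) - scale.value_to_angle(reference))


def mean_angular_error(samples) -> float:
    """Average angular error over (scale, recorded, reference) triples."""
    errors = [angular_error_deg(s, rec, ref) for s, rec, ref in samples]
    return sum(errors) / len(errors)


# Hypothetical gauges, both with a 270-degree sweep.
pressure = GaugeScale(theta_min=-45.0, theta_max=225.0, v_min=0.0, v_max=1.6)
temperature = GaugeScale(theta_min=-45.0, theta_max=225.0, v_min=0.0, v_max=100.0)

samples = [(pressure, 0.81, 0.80), (temperature, 50.5, 50.0)]
print(f"mean angular error = {mean_angular_error(samples):.3f} deg")
```

A 0.01 error on the 0 to 1.6 scale and a 0.5 error on the 0 to 100 scale are very different numbers, yet both correspond to pointer deviations of between one and two degrees.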

Figure 20. Recognition accuracy comparison.

5.4.5. Subjective Workload Evaluation

As shown in Figure 21, the NASA-TLX results indicate a consistent score reduction for the YAR condition compared with the MR condition in all dimensions. Specifically, lower scores were observed in Mental Demand, Temporal Demand, Performance, and Frustration, suggesting that the proposed system reduced cognitive burden and perceived operational strain during gauge-reading tasks. A notable decrease was also observed in Physical Demand, indicating that the assisted workflow lets testers read gauges more comfortably.

Figure 21. NASA-TLX score comparison.

The overall workload imposed by the MR and YAR methods was then compared. The mean overall NASA-TLX score was 72.44 for the MR method and 11.27 for the YAR method, a statistically significant difference (p < 0.001). This finding provides strong evidence that the YAR method effectively reduces user operational workload.
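The overall scores are numerically consistent with an unweighted mean of the six subscale means in Table 2. This is an inference from the reported numbers, not a procedure stated in the text, and the sketch below only checks that arithmetic.

```python
# Subscale means from Table 2.
MR = {
    "Mental Demand": 69.27,
    "Physical Demand": 80.33,
    "Temporal Demand": 91.20,
    "Performance": 2.87,
    "Effort": 93.40,
    "Frustration": 97.60,
}
YAR = {
    "Mental Demand": 5.73,
    "Physical Demand": 47.27,
    "Temporal Demand": 0.00,
    "Performance": 0.27,
    "Effort": 14.33,
    "Frustration": 0.00,
}


def overall_workload(subscales: dict) -> float:
    """Unweighted mean of the subscale scores (assumed aggregation)."""
    return sum(subscales.values()) / len(subscales)


mr_overall = overall_workload(MR)
yar_overall = overall_workload(YAR)
print(f"MR overall = {mr_overall:.2f}, YAR overall = {yar_overall:.2f}")
```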

An examination of the individual subscales offers further insight into these differences. As shown in Figure 21, YAR scored 5.73 on mental demand, far below the 69.27 of MR. This reduction is attributed to YAR’s use of computer vision for automatic recognition, which minimizes cognitive requirements and makes the operational process more intuitive. Physical demand also fell, with YAR scoring 47.27 compared to 80.33 for MR. YAR eliminates manual, paper-based recording and the need to repeatedly look down at notes and up at gauges; users wearing the head-mounted display (HMD) need only gaze at a designated area for system recognition and then confirm the result, which leads to more natural interaction. The disparity was most pronounced for temporal demand, where YAR received a score of 0, in stark contrast to MR’s 91.20. This suggests that, by reading multiple instruments simultaneously, YAR imposes no perceived time pressure and leaves users enough time to complete the task. For self-rated performance, both methods yielded low scores, which indicate good perceived performance, but YAR (0.27) was still better than MR (2.87). This may be because users placed greater trust in the machine-calculated results, whereas some expressed doubt about the correctness of their own manual readings and recordings. Effort also differed drastically, with YAR scoring 14.33 versus 93.40 for MR, indicating that YAR users could accomplish the task with substantially less exertion. Finally, YAR scored 0 on frustration, while MR scored a remarkably high 97.60. This indicates that the streamlined YAR process induces almost no user frustration, whereas the repetitive manual operations in MR are a source of boredom and annoyance.

In summary, the YAR method shows a clear advantage over the MR method across all dimensions of the NASA-TLX. It is associated with an extremely low perceived workload, characterized by the absence of time pressure and frustration, alongside better self-rated performance. Conversely, the MR method imposes a high workload, requiring substantial effort and resulting in considerable user frustration. These findings collectively indicate that the YAR method offers a markedly better user experience and is thus better suited to inspection and recording applications.

The experiments and analysis above indicate that the core innovative feature of the YAR method is the shift from full automation to human-in-the-loop verification. The system does not merely present a final number; it renders a synthesized pointer graphic directly onto the real-world gauge. This interactive step is intuitive and requires minimal training. It leverages human pattern recognition for a final quality check, so that misalignment due to extreme occlusion, dirt, or rare model errors can be caught immediately. This design helps ensure that the recorded value faithfully represents the physical reality.
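The geometry behind such an overlay can be sketched in a few lines. The angle convention (degrees, counterclockwise from the positive x-axis, with image y pointing down) and both function names are assumptions made for illustration; they are not taken from the system’s implementation.

```python
import math


def pointer_endpoint(cx: float, cy: float, radius: float, angle_deg: float):
    """Tip of the synthesized pointer in image coordinates.

    Assumes x grows to the right, y grows downward, and the angle is
    measured counterclockwise (as seen on screen) from the positive x-axis.
    """
    theta = math.radians(angle_deg)
    return cx + radius * math.cos(theta), cy - radius * math.sin(theta)


def alignment_error_deg(overlay_deg: float, observed_deg: float) -> float:
    """Signed angular difference wrapped to the range [-180, 180)."""
    return (overlay_deg - observed_deg + 180.0) % 360.0 - 180.0


# Hypothetical dial centred at (100, 100) px with a 50 px pointer.
tip_x, tip_y = pointer_endpoint(100.0, 100.0, 50.0, 90.0)
print(f"pointer tip = ({tip_x:.1f}, {tip_y:.1f})")
print(f"alignment error = {alignment_error_deg(359.0, 1.0):.1f} deg")
```

The wrap-around in `alignment_error_deg` keeps a pointer at 359° and an overlay at 1° two degrees apart instead of 358.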

Upon the user’s confirmation of correct pointer alignment, the system automates the subsequent administrative steps. The validated reading value, along with a timestamp and gauge ID, is automatically packaged and transmitted to a cloud-based database in real time. This functionality completely removes the error-prone and tedious steps of manual data transcription, thereby eliminating a significant source of human error and freeing the crew to focus on the higher-value task of situational assessment and anomaly detection.
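A confirmed reading could be packaged along the following lines. The record layout, field names, and function name are hypothetical; the text specifies only that the value, a timestamp, and the gauge ID are sent, and the actual transport to the cloud database is omitted here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GaugeReading:
    gauge_id: str    # hypothetical identifier format
    value: float     # validated reading value
    timestamp: str   # ISO 8601, UTC


def build_record(gauge_id: str, value: float, confirmed: bool,
                 now: Optional[datetime] = None) -> Optional[str]:
    """Serialize a reading to JSON, but only after operator confirmation."""
    if not confirmed:
        return None  # unconfirmed readings are never recorded
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(asdict(GaugeReading(gauge_id, value, stamp)))


record = build_record("G-07", 0.82, confirmed=True,
                      now=datetime(2026, 1, 1, tzinfo=timezone.utc))
print(record)
```

Gating serialization on the confirmation flag mirrors the human-in-the-loop design: nothing is stored until the operator has accepted the overlay.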

6. Conclusions

In this paper, an automated gauge reading system has been proposed and validated to assist operators in submarine maintenance environments. The proposed YORO method integrates a YOLOv11-based object detection model for robust gauge localization with a dedicated image processing module. This module employs techniques such as the Hough Circle Transform and Hough Line Transform to accurately detect dial regions, identify pointers, and compute measurement values.

Experimental evaluations conducted in a real submarine cabin environment demonstrate that the system achieves high accuracy in both gauge detection and value reading. In terms of operational efficiency, the YORO method reduced the average task completion time from 48.13 s in the manual recording (MR) mode to 3.58 s, representing a 92.5% improvement. Regarding reading accuracy, the average angular error decreased from 1.01° in MR mode to 0.23° in YORO mode, a 77% improvement. Furthermore, a NASA-TLX-based workload assessment revealed that YORO significantly reduces user workload, with an overall workload score of 11.27 compared to 72.44 for MR mode (p < 0.001). These results indicate that YORO not only enhances operational efficiency and reading accuracy but also substantially alleviates user workload.

The findings of this study provide a practical and effective foundation for developing semi-autonomous inspection solutions in confined and critical maritime environments. Nevertheless, the current system requires a one-time pre-calibration for each gauge type (including parameters such as θ_min, θ_max, V_min, V_max), and the number of gauge variants evaluated in our experiments is limited. In real submarine settings where numerous gauge variants with potentially non-linear or non-standard scales may be present, the scalability of the proposed approach remains an open challenge. Future work will therefore focus on systematically handling a wide variety of gauge variants and non-linear scales, optimizing the system’s performance for HMDs, and extending its functionality to support a broader range of instrument types, thereby further improving operator situational awareness.

Author Contributions

Conceptualization, X.Y.; methodology, X.Y. and X.D.; software, X.D., L.Z. and S.Z.; validation, X.D., L.Z. and C.J.; formal analysis, L.Z.; investigation, X.D., S.Z. and C.J.; resources, Q.G.; data curation, Q.G.; writing—original draft preparation, X.D.; writing—review and editing, X.Y.; visualization, X.D.; supervision, X.Y.; project administration, X.Y.; funding acquisition, X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

During the preparation of this work, the author(s) used Gemini 3 for the purposes of reference formatting and grammar review. After using this tool, the author(s) thoroughly reviewed and edited the content as needed and take full responsibility for the final publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

YORO	You Only Read Once
YOLO	You Only Look Once
HMD	Head-Mounted Display
ROI	Region of Interest
SOTA	State of the Art
AR	Augmented Reality
LLIE	Low-Light Image Enhancement
ONNX	Open Neural Network Exchange
C2f	Cross-Stage Partial bottleneck with 2 convolutions
C3k2	Cross-Stage Partial with kernel size 2
mAP50	mean Average Precision at Intersection over Union (IoU) = 0.5
mAP50-95	mAP averaged over IOU thresholds from 0.50 to 0.95
CLAHE	Contrast Limited Adaptive Histogram Equalization
RSR	Reading Success Rate
AIT	Average Inference Time
MR	Manual Recording
YAR	YORO-Aided Recording

References

1. Lee, J.; Wu, F.; Zhao, W.; Ghaffari, M.; Liao, L.; Siegel, D. Prognostics and Health Management Design for Rotary Machinery Systems: Reviews, Methodology and Applications. Mech. Syst. Signal Process. 2014, 42, 314–334.
2. Drury, C.G. Human Factors and Automation in Test and Inspection. In Handbook of Industrial Engineering; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2001; pp. 1887–1920. ISBN 978-0-470-17233-9.
3. Rožanec, J.M.; Zajec, P.; Trajkova, E.; Šircelj, B.; Brecelj, B.; Novalija, I.; Dam, P.; Fortuna, B.; Mladenić, D. Towards a Comprehensive Visual Quality Inspection for Industry 4.0. IFAC-PapersOnLine 2022, 55, 690–695.
4. Lu, Y. Industry 4.0: A Survey on Technologies, Applications and Open Research Issues. J. Ind. Inf. Integr. 2017, 6, 1–10.
5. Egger, J.; Masood, T. Augmented Reality in Support of Intelligent Manufacturing: A Systematic Literature Review. Comput. Ind. Eng. 2020, 140, 106195.
6. Ghita, O.M.; Grigorescu, S.D.; Andrei, H.; Calin, N. Solution for Inspection of Power Energy Equipment Using Augmented Reality. Sci. Bull. Electr. Eng. Fac. 2017, 17.
7. Poggi, L.; Gaggero, T.; Gaiotti, M.; Ravina, E.; Rizzo, C.M. Recent Developments in Remote Inspections of Ship Structures. Int. J. Nav. Archit. Ocean Eng. 2020, 12, 881–891.
8. Ameri, R.; Hsu, C.-C.; Band, S.S. A Systematic Review of Deep Learning Approaches for Surface Defect Detection in Industrial Applications. Eng. Appl. Artif. Intell. 2024, 130, 107717.
9. Ma, Y.; Yin, J.; Huang, F.; Li, Q. Surface Defect Inspection of Industrial Products with Object Detection Deep Networks: A Systematic Review. Artif. Intell. Rev. 2024, 57, 333.
10. Chen, Z.; Feng, X.; Liu, L.; Jia, Z. Surface Defect Detection of Industrial Components Based on Vision. Sci. Rep. 2023, 13, 22136.
11. Mazzetto, M.; Teixeira, M.; Rodrigues, É.O.; Casanova, D. Deep Learning Models for Visual Inspection on Automotive Assembling Line. Int. J. Adv. Eng. Res. Sci. 2020, 7, 473–494.
12. Mao, W.-L.; Wang, C.-C.; Chou, P.-H.; Liu, Y.-T. Automated Defect Detection for Mass-Produced Electronic Components Based on YOLO Object Detection Models. IEEE Sens. J. 2024, 24, 26877–26888.
13. Wang, L.; Song, C.; Wan, G.; Cui, S. A Surface Defect Detection Method for Steel Pipe Based on Improved YOLO. Math. Biosci. Eng. 2024, 21, 3016–3036.
14. Gonzalez, R.C. Digital Image Processing; Pearson Education: Karnataka, India, 2009; ISBN 81-317-2695-9. Available online: https://api.pageplace.de/preview/DT0400.9781292223070_A37747583/preview-9781292223070_A37747583.pdf (accessed on 13 January 2026).
15. Land, E.H.; McCann, J.J. Lightness and Retinex Theory. J. Opt. Soc. Am. 1971, 61, 1–11.
16. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An Underwater Image Enhancement Benchmark Dataset and Beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389.
17. Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Yang, J.; Zhou, P.; Wang, Z. EnlightenGAN: Deep Light Enhancement without Paired Supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349.
18. Wang, W.; Wei, C.; Yang, W.; Liu, J. GladNet: Low-Light Enhancement Network with Global Awareness. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018); IEEE: Piscataway, NJ, USA, 2018; pp. 751–755.
19. Cai, J.; Gu, S.; Zhang, L. Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images. IEEE Trans. Image Process. 2018, 27, 2049–2062.
20. Yang, W.; Wang, S.; Fang, Y.; Wang, Y.; Liu, J. From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2020; pp. 3063–3072.
21. Loh, Y.P.; Chan, C.S. Getting to Know Low-Light Images with the Exclusively Dark Dataset. Comput. Vis. Image Underst. 2019, 178, 30–42.
22. Liu, J.; Xu, D.; Yang, W.; Fan, M.; Huang, H. Benchmarking Low-Light Image Enhancement and Beyond. Int. J. Comput. Vis. 2021, 129, 1153–1184.
23. Li, B.; Yang, J.; Zeng, X.; Yue, H.; Xiang, W. Automatic Gauge Detection via Geometric Fitting for Safety Inspection. IEEE Access 2019, 7, 87042–87048.
24. Sablatnig, R.; Kropatsch, W.G. Automatic Reading of Analog Display Instruments. In Proceedings of the 12th International Conference on Pattern Recognition; IEEE: Piscataway, NJ, USA, 1994; Volume 1, pp. 794–797.
25. Zhou, D.; Yang, Y.; Zhu, J.; Wang, K. Intelligent Reading Recognition Method of a Pointer Meter Based on Deep Learning in a Real Environment. Meas. Sci. Technol. 2022, 33, 055021.
26. Li, Z.; Zhou, Y.; Sheng, Q.; Chen, K.; Huang, J. A High-Robust Automatic Reading Algorithm of Pointer Meters Based on Text Detection. Sensors 2020, 20, 5946.
27. Jiao, W.; Zhao, D.; Mei, X.; Yang, S.; Zhang, X.; Li, C.; Li, L. Multiresolution Deep Feature Learning for Pointer Meters Reading Recognition. J. Manuf. Process. 2024, 114, 168–177.
28. Liu, M.; Ding, X.; Du, W. Continuous, Real-Time Object Detection on Mobile Devices without Offloading. In Proceedings of the 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS); IEEE: Piscataway, NJ, USA, 2020; pp. 976–986.
29. Cheng, Y.; Wang, D.; Zhou, P.; Zhang, T. Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges. IEEE Signal Process. Mag. 2018, 35, 126–136.
30. Choudhary, A. Cross-Platform AI Model Optimization and Deployment with ONNX. Available online: https://johal.in/cross-platform-ai-model-optimization-and-deployment-with-onnx/ (accessed on 2 March 2026).
31. Awadallah, O.; Sadhu, A. Automated Multiclass Structural Damage Detection and Quantification Using Augmented Reality. J. Infrastruct. Intell. Resil. 2023, 2, 100024.
32. Ciaglia, F.; Zuppichini, F.S.; Guerrie, P.; McQuade, M.; Solawetz, J. Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark. arXiv 2022, arXiv:2211.13523.
33. Wang, W. X-AnyLabeling: Effortless Data Labeling with AI Support from Segment Anything and Other Awesome Models. (Version 2.3.3). 2023. Available online: https://github.com/CVHub520/X-AnyLabeling (accessed on 24 April 2026).
34. Khanam, R.; Hussain, M. YOLOv11: An Overview of the Key Architectural Enhancements. arXiv 2024, arXiv:2410.17725.
35. Ghahremani, A.; Adams, S.D.; Norton, M.; Khoo, S.Y.; Kouzani, A.Z. Detecting Defects in Solar Panels Using the YOLO V10 and V11 Algorithms. Electronics 2025, 14, 344.
36. Mao, M.; Hong, M. YOLO Object Detection for Real-Time Fabric Defect Inspection in the Textile Industry: A Review of YOLOv1 to YOLOv11. Sensors 2025, 25, 2270.
37. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2016; pp. 779–788.
38. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755.
39. Huang, J.; Wang, J.; Tan, Y.; Wu, D.; Cao, Y. An Automatic Analog Instrument Reading System Using Computer Vision and Inspection Robot. IEEE Trans. Instrum. Meas. 2020, 69, 6322–6335.
40. Hung, M.-H.; Hsieh, C.-H. Automatic Pointer Meter Reading Based on Machine Vision. In Proceedings of the 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC); IEEE: Piscataway, NJ, USA, 2019; pp. 32–35.
41. Salomon, G.; Laroca, R.; Menotti, D. Image-Based Automatic Dial Meter Reading in Unconstrained Scenarios. Measurement 2022, 204, 112025.
42. Sauvola, J.; Pietikäinen, M. Adaptive Document Image Binarization. Pattern Recognit. 2000, 33, 225–236.
43. Zheng, C.; Wang, S.; Zhang, Y.; Zhang, P.; Zhao, Y. A Robust and Automatic Recognition System of Analog Instruments in Power System by Using Computer Vision. Measurement 2016, 92, 413–420.
44. Deng, G.; Huang, T.; Lin, B.; Liu, H.; Yang, R.; Jing, W. Automatic Meter Reading from UAV Inspection Photos in the Substation by Combining YOLOv5s and DeepLabv3+. Sensors 2022, 22, 7090.
45. Powers, D.M. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. arXiv 2020, arXiv:2010.16061.
46. Hart, S.G.; Staveland, L.E. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. In Human Mental Workload; Hancock, P.A., Meshkati, N., Eds.; Advances in Psychology; North-Holland: Amsterdam, The Netherlands, 1988; Volume 52, pp. 139–183.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
