Glare-Aware Resi-YOLO: Tiny-Vessel Detection with Dual-Brain Edge Deployment for Maritime UAVs
Highlights
- Resi-YOLO raises APsmall by 13.1 percentage points over YOLOv8n on the high-glare test split.
- On Jetson Orin Nano, the deployed pipeline runs at 12.8 FPS end-to-end, while TensorRT inference exceeds 30 FPS.
- Robust tiny-vessel perception can be executed onboard maritime UAVs without cloud dependence.
- Glare Severity Score (GSS)-stratified evaluation and the dual-brain design offer a practical blueprint for safety-oriented deployment under link variability.
Abstract
1. Introduction
1.1. Background and Motivation
1.2. Key Challenges
1.3. Contributions
- Core Detector Architecture: A YOLO11n-based detector is augmented with a P2 detection head, CBAM attention, and NWD loss to improve tiny-vessel recall and glare robustness under maritime sea clutter. Optional modules including SAHI slicing and RGF/BLSF geometric filtering are evaluated separately as deployment-specific extensions.
- MCU Safety-Island Dual-Brain Architecture: A heterogeneous perception–navigation pipeline is implemented where Jetson Orin Nano (NVIDIA, Santa Clara, CA, USA) performs deep perception while an MCU safety island maintains deterministic tracking and minimal navigation cues (e.g., IMM-UKF state estimation and lightweight planning). This design explicitly targets marine engineering constraints such as intermittent links, abrupt exposure changes, and GPU workload spikes by decoupling high-throughput, non-deterministic vision processing from safety-critical control loops.
- System-Level Implementation: A waterproof UAV platform (Pixhawk-class autopilot + stabilized 4K gimbal camera), TensorRT deployment on Jetson Orin Nano, and a latency-aware data bus (e.g., RTSP/MQTT/WebSocket) for edge-to-ground integration are implemented.
- Evaluation Blueprint for Marine Operations: Stratified evaluation across glare severity (GSS), target-size bins, and video I/O impairments (delay/jitter/dropout) is carried out, accompanied by reproducibility templates, command/parameter logs, and deployment checklists.
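To make the dual-brain decoupling in the second contribution concrete, the sketch below shows how a safety-island tracker can keep a deterministic state estimate alive when GPU perception stalls. This is a minimal illustration with hypothetical names: a constant-velocity predictor stands in for the paper's IMM-UKF, and the staleness threshold is an assumed parameter, not a value from the system.

```python
class SafetyIslandTracker:
    """Minimal stand-in for the MCU safety island: keeps a deterministic
    constant-velocity estimate alive when deep perception stalls.
    (A constant-velocity predictor stands in for the paper's IMM-UKF.)"""

    def __init__(self, stale_after_s=0.5):
        self.stale_after_s = stale_after_s  # perception considered lost after this gap
        self.pos = None        # (x, y) last fused position
        self.vel = (0.0, 0.0)  # finite-difference velocity estimate
        self.last_fix = None   # time of the last perception update

    def on_detection(self, x, y, t):
        """Fuse a fresh detection coming from the GPU perception brain."""
        if self.pos is not None and self.last_fix is not None and t > self.last_fix:
            dt = t - self.last_fix
            self.vel = ((x - self.pos[0]) / dt, (y - self.pos[1]) / dt)
        self.pos, self.last_fix = (x, y), t

    def estimate(self, t):
        """Return (position, mode): live tracking while perception is fresh,
        dead-reckoning prediction once it goes stale."""
        if self.pos is None:
            return None, "no_fix"
        dt = t - self.last_fix
        if dt <= self.stale_after_s:
            return self.pos, "tracking"
        return (self.pos[0] + self.vel[0] * dt,
                self.pos[1] + self.vel[1] * dt), "dead_reckoning"
```

The key design point is that `estimate()` never blocks on the perception side: the control loop on the MCU can call it at a fixed rate regardless of whether the Jetson is mid-inference, dropped a frame, or lost the link.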
2. Related Work
2.1. Vision-Based Maritime UAV Perception Under Sea Glare
2.2. Attention and Loss Mechanisms for Clutter Suppression
2.3. Reliability-Aware Perception and Geometric Filtering in Maritime Vision
2.4. Heterogeneous Architectures and Edge-Cloud Systems
3. Proposed Method: Enhanced Resi-YOLO with Dual-Brain Integration
3.1. Overview
Stress-Test Corpus Versus Benchmark-Scale Validation
3.2. P2 Detection Head and NWD Loss for Tiny Vessels
3.3. CBAM for Glare Suppression
3.4. Core Architecture and Optional Modules
3.5. SAHI for High-Resolution Inference
3.6. Reliability-Guided Fusion (RGF) and Binary Line Segment Filter (BLSF)
3.7. Sensitivity Analysis and Geometric Filtering
3.8. MCU Safety-Island RRT Planning and MAVLink Commanding
4. System Implementation: Maritime UAV–Edge–Cloud Pipeline
4.1. Custom Waterproof UAV Platform
4.2. Dual-Brain Link Rate and Packet Definition
4.3. Power, Signal, and Time Synchronization
4.4. Low-Bandwidth Messaging
5. Experimental Protocol
5.1. Datasets and Splits
5.2. Training Recipe and Glare-Oriented Augmentations
5.3. Metrics: Accuracy, Robustness, and Efficiency
5.4. Baselines, Ablations, and Resolution Study
6. Results and Discussion
6.1. Tiny-Vessel Detection Performance
6.2. Glare Robustness with GSS-Stratified Evaluation
6.3. Dual-Brain Tracking Robustness Under Delay/Jitter
6.4. Embedded Feasibility on Jetson Orin Nano
6.4.1. Definition of Throughput and Latency Metrics
| Stage | Mean (ms) | p95 (ms) | Measurement Notes |
|---|---|---|---|
| Capture + encoding | 40 | 50 | On-camera ISP + H.265 encoder latency (SIYI A8 mini gimbal camera). |
| Network transfer | 5 | 10 | Gimbal-to-Jetson Ethernet streaming (wired LAN, negligible jitter). |
| Video decoding (NVDEC) | 15 | 25 | Hardware decoding via nvv4l2decoder (DeepStream optimized). |
| Preprocessing (VIC) | 8 | 12 | Resizing and color space conversion (NV12→RGBA) on VIC hardware. |
| Inference (TensorRT) | ~15 | ~20 | TensorRT FP16, batch = 1 (Resi-YOLO model). (INT8 could further reduce latency.) |
| Postprocessing (NMS) | 4 | 8 | NMS and formatting on CPU/GPU. |
| Publishing/serializing | 2 | 5 | JSON serialization and MQTT publishing overhead. |
| End-to-end (total) | ~90 | ~130 | Total pipeline latency (frame capture to alert); target < 200 ms for reliable teleoperation. |
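The totals in the latency table above can be checked by summing the per-stage figures. Note one caveat built into the table: summing per-stage p95 values assumes worst-case alignment of all stages at once, so the 130 ms figure is a conservative budget rather than a true end-to-end p95.

```python
# Stage latencies (ms) transcribed from the pipeline breakdown table.
stages_mean = {"capture+encode": 40, "network": 5, "decode": 15,
               "preprocess": 8, "inference": 15, "postprocess": 4, "publish": 2}
stages_p95 = {"capture+encode": 50, "network": 10, "decode": 25,
              "preprocess": 12, "inference": 20, "postprocess": 8, "publish": 5}

mean_total = sum(stages_mean.values())  # 89 ms -> the table's "~90 ms" row
p95_total = sum(stages_p95.values())    # 130 ms (conservative: p95s are not additive)

# Teleoperation target from the table: keep the tail under 200 ms.
assert p95_total < 200, "teleoperation latency budget exceeded"
print(mean_total, p95_total)  # 89 130
```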
| Model Variant | Parameters (M) | GFLOPS | FPS (Standard) | p95 Latency (ms) | Computational Stability (Sim) |
|---|---|---|---|---|---|
| YOLOv8n (Baseline) | 3.2 | 8.7 | ~27.0 | 45 | High |
| YOLO11n (Vanilla) | 2.6 | 6.5 | ~22.5 | 52 | High |
| YOLO11n + P2 | 3.4 | 10.5 | ~14.5 | 78 | Medium |
| YOLO11n + CBAM | 2.8 | 7.1 | 20.8 | 58 | Very High |
| Resi-YOLO (P2 + CBAM + NWD) | 3.5 | 11.2 | 12.8 | 85 | High |
| Metric | Standard Mode (15 W Max-P) | Super Mode (25 W MAXN) | Improvement | Physical Interpretation |
|---|---|---|---|---|
| Engine Throughput (FPS, TensorRT-only) | 30.2 | 55.4 | +83.4% | Significantly higher perception frequency |
| Average Power (W) | 12.1 | 18.2 | +50.4% | Increased power draw, within the platform's thermal and supply limits |
| Energy per Frame (mJ/frame) | 400.6 | 328.5 | −18.0% | Lower energy cost per processed frame |
| Efficiency (FPS/W) | 2.50 | 3.04 | +21.6% | Higher compute utilization |
6.4.2. Embedded Computational Efficiency and Energy-Aware Analysis: Advantages of Super Mode
6.4.3. Energy Efficiency Analysis Under Jetson Orin Nano Super Mode
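The efficiency figures in the Standard/Super Mode table follow directly from the two formulas defined in the metrics table, mJ/frame = 1000·P_avg/FPS and FPS/W = FPS/P_avg. A quick check (power and FPS values transcribed from the table; results agree with the table to rounding):

```python
def energy_metrics(power_w, fps):
    """Derive per-frame energy and efficiency from average power and throughput."""
    mj_per_frame = 1000.0 * power_w / fps  # mJ/frame = 1000 * P_avg / FPS
    fps_per_watt = fps / power_w           # compute efficiency
    return mj_per_frame, fps_per_watt

standard = energy_metrics(power_w=12.1, fps=30.2)  # 15 W Max-P mode
super_ = energy_metrics(power_w=18.2, fps=55.4)    # 25 W MAXN (Super) mode

saving = 1.0 - super_[0] / standard[0]  # ~18% less energy per processed frame,
                                        # despite the higher absolute power draw
```

This is the core of the Super Mode argument: power rises by ~50%, but throughput rises by ~83%, so both energy per frame and FPS/W improve.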
6.5. Discussion and Limitations
Advantages of the Proposed Method
7. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lee, Y.H.; Meng, Y.S. Near Sea-Surface Mobile Radiowave Propagation at 5 GHz. Radioengineering 2014, 23, 824–830. [Google Scholar]
- Kiefer, B.; Žust, L.; Kristan, M.; Perš, J.; Teršek, M.; Wiliem, A.; Messmer, M.; Yang, C.-Y.; Huang, H.-W.; Jiang, Z.; et al. 2nd Workshop on Maritime Computer Vision (MaCVi) 2024: Challenge Results. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, Waikoloa, HI, USA, 3–8 January 2024; pp. 869–891. [Google Scholar]
- Schulzrinne, H.; Rao, A.; Lanphier, R. RFC 2326; Real-Time Streaming Protocol (RTSP); RFC Editor: Marina del Rey, CA, USA, 1998. [Google Scholar] [CrossRef]
- Schulzrinne, H.; Casner, S.; Frederick, R.; Jacobson, V. RFC 3550; RTP: A Transport Protocol for Real-Time Applications; RFC Editor: Marina del Rey, CA, USA, 2003. [Google Scholar] [CrossRef]
- Gettys, J.; Nichols, K. Bufferbloat: Dark buffers in the Internet. Commun. ACM 2012, 55, 57–65. [Google Scholar] [CrossRef]
- Varga, L.A.; Kiefer, B.; Messmer, M.; Zell, A. SeaDronesSee: A Maritime Benchmark for Detecting Humans in Open Water. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022; pp. 2260–2270. [Google Scholar]
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790. [Google Scholar]
- Zhao, X.; Liu, Q.; Li, M.; Li, J.; Zhang, Y.; Huang, Y.; Zhou, J.; Chen, C. YOLOv7-sea: A lightweight and accurate object detection model for maritime environments. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), Waikoloa, HI, USA, 2–7 January 2023; pp. 1–10. [Google Scholar]
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
- Wang, Y.; Liu, J.; Zhao, J.; Li, Z.; Yan, Y.; Yan, X.; Xu, F.; Li, F. LCSC-UAVNet: A High-Precision and Lightweight Model for Small-Object Identification and Detection in Maritime UAV Perspective. Drones 2025, 9, 100. [Google Scholar] [CrossRef]
- Qin, J.; Li, M.; Zhao, J.; Zhong, J.; Zhang, H. Revolutionize the Oceanic Drone RGB Imagery with Pioneering Sun Glint Detection and Removal Techniques. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2024; pp. 8326–8335. [Google Scholar]
- Zhao, F.; Chen, Y.; Xi, D.; Liu, Y.; Wang, J. Enhanced hermit crabs detection using super-resolution reconstruction and improved YOLOv8 on UAV-captured imagery. Mar. Environ. Res. 2025, 210, 107313. [Google Scholar] [CrossRef]
- Rusyn, B.; Lutsyk, O.; Kosarevych, R.; Maksymyuk, T. Features extraction from multi-spectral remote sensing images based on multi-threshold binarization. Sci. Rep. 2023, 13, 19655. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
- Wang, Z.; Wang, X. Normalized Gaussian Wasserstein distance for tiny object detection. ISPRS J. Photogramm. Remote Sens. 2022, 190, 119–134. [Google Scholar]
- Akyon, F.C.; Altinuc, S.O.; Temizel, A. Slicing aided hyper inference and fine-tuning for small object detection. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 966–970. [Google Scholar]
- Li, L.; Zhang, Y.; Chen, H.; Wang, J.; Xu, K. Spotlight on Small-Scale Ship Detection: Empowering YOLO with Advanced Techniques and a Novel Dataset. In Proceedings of the Asian Conference on Computer Vision (ACCV), Hanoi, Vietnam, 8–12 December 2024; pp. 1–15. [Google Scholar]
- Li, H.; Zhao, F.; Xue, F.; Wang, J. Succulent-YOLO: Smart UAV-assisted succulent farmland monitoring with CLIP-based YOLOv10 and Mamba computer vision. Remote Sens. 2025, 17, 2219. [Google Scholar] [CrossRef]
- Zhao, F.; Ren, Z.; Wang, J. Smart UAV-assisted rose growth monitoring with improved YOLOv10 and Mamba restoration techniques. Smart Agric. Technol. 2025, 10, 100730. [Google Scholar] [CrossRef]
- Tsai, S.-E.; Yang, S.-M.; Hsieh, C.-H. Real-Time Deterministic Lane Detection on CPU-Only Embedded Systems via Binary Line Segment Filtering. Electronics 2026, 15, 351. [Google Scholar] [CrossRef]
- Tsai, S.-E.; Hsieh, C.-H. A Real-Time Collision Warning System for Autonomous Vehicles Based on YOLOv8n and SGBM Stereo Vision. Electronics 2025, 14, 4275. [Google Scholar] [CrossRef]
- Zhang, Y.; Sun, P.; Jiang, Y.; Yu, D.; Weng, F.; Yuan, Z.; Luo, P.; Liu, W.; Wang, X. ByteTrack: Multi-object tracking by associating every detection box. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 1–14. [Google Scholar]
- Aharon, N.; Orfaig, R.; Bobrovsky, B.-Z. BoT-SORT: Robust associations multi-pedestrian tracking. arXiv 2022, arXiv:2206.14651. [Google Scholar]
- Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple online and realtime tracking. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468. [Google Scholar]
- Ciaparrone, G.; Sánchez, F.L.; Tabik, S.; Troiano, L.; Tagliaferri, R.; Herrera, F. Deep learning in video multi-object tracking: A survey. Neurocomputing 2020, 381, 61–88. [Google Scholar] [CrossRef]
- Milan, A.; Leal-Taixé, L.; Reid, I.; Roth, S.; Schindler, K. MOT16: A benchmark for multi-object tracking. arXiv 2016, arXiv:1603.00831. [Google Scholar] [CrossRef]
- Luiten, J.; Osep, A.; Dendorfer, P.; Torr, P.; Geiger, A.; Leal-Taixé, L.; Leibe, B. HOTA: A higher order metric for evaluating multi-object tracking. Int. J. Comput. Vis. 2021, 129, 548–578. [Google Scholar] [CrossRef] [PubMed]
- Jocher, G.; Chaurasia, A.; Qiu, J. YOLOv8: Ultralytics Next-Generation Real-Time Object Detector. arXiv 2023, arXiv:2305.09972. [Google Scholar]
- Satore, J.L.; Jao, J.; Castilla, R.; Vallar, E.; Galvez, M.C. Comparative Study of YOLOv10, YOLO11 and YOLOv12 Lightweight Models for Multi-Class Maritime Search and Rescue Using UAV Imagery. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, XLVIII-1/W6, 199–204. [Google Scholar] [CrossRef]
- Ultralytics. YOLO11N Documentation. Ultralytics Official Documentation 2026. Available online: https://docs.ultralytics.com/models/yolo11/ (accessed on 16 March 2026).
- NVIDIA Corporation. Jetson Orin Nano Developer Kit Carrier Board Specification; SP-11324-001; NVIDIA: Santa Clara, CA, USA, 2024; Available online: https://developer.nvidia.com/downloads/assets/embedded/secure/jetson/orin_nano/docs/jetson_orin_nano_devkit_carrier_board_specification_sp.pdf (accessed on 1 February 2026).
- NVIDIA. Jetson Orin Nano Developer Kit User Guide. NVIDIA Developer. Available online: https://developer.nvidia.com/embedded/learn/jetson-orin-nano-devkit-user-guide/index.html (accessed on 30 January 2026).
- NVIDIA. Solving Entry-Level Edge AI Challenges with NVIDIA Jetson Orin Nano; NVIDIA Technical Blog; NVIDIA: Santa Clara, CA, USA, 2022; Available online: https://developer.nvidia.com/blog/solving-entry-level-edge-ai-challenges-with-nvidia-jetson-orin-nano/ (accessed on 30 January 2026).
- Bilous, N.; Malko, V.; Ahekian, I.; Korobiichuk, I.; Ivanichev, V. Comparative Evaluation of YOLO Models for Human Position Recognition with UAVs During a Flood. Appl. Syst. Innov. 2026, 9, 6. [Google Scholar] [CrossRef]
- Ultralytics. Ultralytics YOLO GitHub Repository. GitHub Repository. Available online: https://github.com/ultralytics/ultralytics (accessed on 31 January 2026).
- Bernardin, K.; Stiefelhagen, R. Evaluating multiple object tracking performance: The CLEAR MOT metrics. EURASIP J. Image Video Process. 2008, 2008, 246309. [Google Scholar] [CrossRef]
- Jocher, G.; Qiu, J.; Chaurasia, A. Ultralytics YOLO, v8.4.6; Zenodo: Geneva, Switzerland, 2026. [CrossRef]
- NVIDIA. Jetson Orin Nano Series Data Sheet; DS-11105-001; NVIDIA: Santa Clara, CA, USA, 2023; Available online: https://forums.developer.nvidia.com/uploads/short-url/mHytGSlaBUsKUAKOtHHjldblsX8.pdf (accessed on 30 January 2026).
- NVIDIA. Jetson Orin Nano Technical Specifications. NVIDIA Developer Documentation 2023. Available online: https://developer.nvidia.com/embedded/jetson-modules (accessed on 30 January 2026).
- NVIDIA. Jetson Orin Nano Super Developer Kit. Available online: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/ (accessed on 30 January 2026).
- Exploring NVIDIA Jetson Orin Nano Super Mode Performance Using Generative AI. Available online: https://www.ridgerun.com/post/exploring-nvidia-jetson-orin-nano-super-mode-performance-using-generative-ai (accessed on 30 January 2026).
- Yu, C.; Li, Y.; Zhang, Z.; Wang, X.; Liu, H. SMEP-DETR: Transformer-Based Ship Detection for SAR Imagery with Multi-Edge Enhancement and Parallel Dilated Convolutions. Remote Sens. 2025, 17, 953. [Google Scholar] [CrossRef]
- Wang, Z.; Li, C.; Xu, H.; Zhu, X.; Li, H. Mamba YOLO: A Simple Baseline for Object Detection with State Space Model. Proc. AAAI Conf. Artif. Intell. 2025, 39, 8205–8213. [Google Scholar] [CrossRef]
- Kurmashev, I.; Semenyuk, V.; Lupidi, A.; Alyoshin, D.; Kurmasheva, L.; Cantelli-Forti, A. Study of the Optimal YOLO Visual Detector Model for Enhancing UAV Detection and Classification in Optoelectronic Channels of Sensor Fusion Systems. Drones 2025, 9, 732. [Google Scholar] [CrossRef]
| Interface | Topic/Endpoint | Key Fields | Notes (Rate/QoS/Payload) |
|---|---|---|---|
| MQTT | uav/alert | timestamp (UTC), uav_id, class (swimmer/boat), conf, bbox [x, y, w, h], geo [lat, lon, alt] | Event-driven (max 5 Hz); QoS 1; JSON payload ~300 bytes; ultra-low bandwidth. |
| MQTT | uav/status | batt_volt, link_quality, glare_idx, system_temp | 1 Hz; QoS 0; health/status monitoring; asynchronous publish (non-blocking). |
| WebSocket | /ws/keyframe | image_base64 (JPEG), detection_id | 0.2–1 Hz; transmit keyframes only when a detection requires operator confirmation (~50–100 KB per image). |
| MAVLink | OBSTACLE_DISTANCE (custom) | distance, angle, sensor_type | 2 Hz; UAV publishes basic obstacle distances (if needed for AP). |
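A minimal sketch of a `uav/alert` payload following the schema in the table above. All field values are illustrative, and the paho-mqtt publish call is shown only as a comment rather than assumed as part of the system:

```python
import json
from datetime import datetime, timezone

# Illustrative alert matching the uav/alert schema (values are made up).
alert = {
    "timestamp": datetime(2026, 1, 30, 12, 0, 0, tzinfo=timezone.utc).isoformat(),
    "uav_id": "uav-01",
    "class": "boat",
    "conf": 0.87,
    "bbox": [0.42, 0.31, 0.05, 0.04],  # normalized [x, y, w, h]
    "geo": [25.0330, 121.5654, 80.0],  # [lat, lon, alt]
}

# Compact separators keep the serialized message well under the ~300-byte budget.
payload = json.dumps(alert, separators=(",", ":")).encode()
assert len(payload) < 300

# client.publish("uav/alert", payload, qos=1)  # event-driven, max 5 Hz (QoS 1)
```

Keeping the alert this small is what makes the event channel viable over a degraded maritime link; the heavier keyframe images go over the separate WebSocket endpoint only on operator request.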
| Dataset | Source | Resolution | Count | Split (train/val/test) | Tiny Ratio | Glare |
|---|---|---|---|---|---|---|
| SeaDronesSee v2 | Public | ~4K | ~14,227 (Tr 8930/Va 1547/Te 3750) | 63/11/26% | Very high (~91%) | Moderate (natural) |
| In-house UAV | Ours | 4K (3840 × 2160) | ~2500 | 70/20/10% | Medium (~60%) | Severe (glare + whitecap) |
| Metric | Recommended Definition | Operational Meaning |
|---|---|---|
| APsmall/Recallsmall | Compute on objects with bbox area <32 × 32 after resize (or define bins). | Tiny-vessel sensitivity. |
| FPIglare | False positives per image on glare-heavy subset. | Operator workload/false-alarm control. |
| Latency (mean/p95) | Per-frame time: decode + preprocess + infer + postprocess. | Real-time feasibility. |
| FPS | Steady-state FPS at batch = 1 after warm-up. | Throughput trade-off. |
| Latency jitter (σL, p99L) | Compute standard deviation (σ) and tail (p99) of end-to-end latency over ≥10k frames; report frame-drop rate under wireless congestion and high-glare segments. | Real-time reliability. |
| Energy per frame (mJ/frame), FPS/W | Measure average power (W) during steady-state inference; derive mJ/frame = 1000·P_avg/FPS and FPS/W = FPS/P_avg for fair embedded comparisons. | Efficiency (SWaP). |
| IDF1 | ID F1 score measuring identity-preserving association over time. | Higher is better; complements MOTA by emphasizing identity continuity. |
| HOTA | Higher Order Tracking Accuracy balancing detection and association errors. | Reported with TrackEval to avoid overemphasis on detection-only improvements. |
| IDSW (IDS) | Number of identity switches during tracking. | Lower indicates more stable tracking and data association. |
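The jitter and tail metrics defined above reduce to simple statistics over a long run of per-frame latencies. The sketch below uses synthetic samples and a simple empirical-percentile helper; it illustrates the definitions, not the paper's measurement tooling:

```python
import random
import statistics

def latency_stats(samples_ms):
    """Jitter (sigma) and tail (p99) of end-to-end latency, per the metric table."""
    xs = sorted(samples_ms)
    sigma = statistics.pstdev(xs)                    # latency jitter (sigma_L)
    p99 = xs[min(len(xs) - 1, int(0.99 * len(xs)))]  # empirical p99 (p99_L)
    return sigma, p99

random.seed(0)
# Synthetic steady-state latencies: ~90 ms mean, plus occasional congestion spikes
# standing in for wireless congestion / high-glare segments.
samples = [random.gauss(90, 8) for _ in range(10_000)]
samples += [random.uniform(150, 220) for _ in range(100)]

sigma, p99 = latency_stats(samples)
# Frame-drop rate: fraction of frames missing the 200 ms teleoperation budget.
drop_rate = sum(x > 200 for x in samples) / len(samples)
```

Reporting σ and p99 together matters because a pipeline can have an acceptable mean while its tail still breaks tracking continuity; the drop rate then captures how often the budget is missed outright.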
| Model Variant | P2 | CBAM | NWD | SAHI | RGF | BLSF | mAP@0.5 (%) | APsmall (%) | Recallsmall (%) | FPIglare |
|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv8n (Baseline) | – | – | – | – | – | – | 58.4 | 18.4 | 24.5 | 3.5 |
| YOLO11n (Vanilla) | – | – | – | – | – | – | 61.2 | 21.3 | 28.1 | 3.2 |
| YOLO11n + P2 | ✓ | – | – | – | – | – | 64.5 | 32.8 | 41.2 | 3.4 |
| YOLO11n + CBAM | – | ✓ | – | – | – | – | 61.8 | 21.9 | 28.5 | 1.8 |
| YOLO11n + NWD | – | – | ✓ | – | – | – | 62.4 | 25.6 | 31.4 | 3.0 |
| Resi-YOLO (Core) | ✓ | ✓ | ✓ | – | – | – | 65.1 | 31.5 | 39.8 | 1.9 |
| Resi-YOLO + SAHI | ✓ | ✓ | ✓ | ✓ | – | – | 67.8 | 36.2 | 44.5 | 2.1 |
| Resi-YOLO + RGF | ✓ | ✓ | ✓ | – | ✓ | – | 65.0 | 31.4 | 39.4 | 1.5 |
| Resi-YOLO + BLSF | ✓ | ✓ | ✓ | – | – | ✓ | 65.0 | 31.3 | 39.3 | 1.6 |
| Resi-YOLO + Geom Filters | ✓ | ✓ | ✓ | – | ✓ | ✓ | 64.9 | 31.1 | 38.9 | 1.2 |
| Resi-YOLO (All-in) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 67.5 | 35.8 | 43.6 | 1.4 |
| GSS Range (Score) | Environmental Description | Sample Ratio | Baseline Recall | Resi-YOLO Recall | Recall Gain | Baseline mAP@0.5 | Resi-YOLO mAP@0.5 |
|---|---|---|---|---|---|---|---|
| Low (0.0–0.3) | Soft illumination, no direct reflections | 55% | 0.58 | 0.66 | +0.08 | 61.2% | 65.1% |
| Medium (0.3–0.6) | Moderate sea-surface glitter, afternoon sunlight | 30% | 0.51 | 0.61 | +0.10 | 54.5% | 61.8% |
| High (0.6–1.0) | Extreme specular reflections, intense glare | 15% | 0.30 | 0.45 | +0.15 | 41.2% | 53.7% |
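GSS-stratified recall, as reported in the table above, reduces to binning per-image detection counts by glare score and computing recall within each bin. A minimal sketch; the `(gss, n_gt, n_tp)` per-image tuple format is an assumption for illustration, not the paper's data layout:

```python
def stratified_recall(samples, bins=((0.0, 0.3), (0.3, 0.6), (0.6, 1.0))):
    """Recall per glare-severity bin.

    `samples` holds one (gss, n_gt, n_tp) tuple per image: the image's Glare
    Severity Score, its ground-truth object count, and its true-positive count.
    The last bin is closed on the right so GSS = 1.0 is not dropped.
    """
    def in_bin(g, lo, hi):
        return lo <= g < hi or (hi == 1.0 and g == 1.0)

    out = {}
    for lo, hi in bins:
        gt = sum(n_gt for g, n_gt, _ in samples if in_bin(g, lo, hi))
        tp = sum(n_tp for g, _, n_tp in samples if in_bin(g, lo, hi))
        out[(lo, hi)] = tp / gt if gt else float("nan")
    return out

# Toy example: recall degrades as glare severity rises.
toy = [(0.1, 10, 6), (0.5, 10, 5), (0.9, 10, 3)]
per_bin = stratified_recall(toy)  # {(0.0, 0.3): 0.6, (0.3, 0.6): 0.5, (0.6, 1.0): 0.3}
```

Aggregating counts before dividing (rather than averaging per-image recalls) weights each bin by its object population, which is the usual convention for detection recall.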
| Deployment Condition | Detector | Tracker | MOTA | IDF1 | IDS | Perception Latency (ms) |
|---|---|---|---|---|---|---|
| LIVE-RTSP (no impairment) | YOLO11n | ByteTrack | 61.5 | 66.8 | 198 | 52 |
| LIVE-RTSP (no impairment) | Resi-YOLO | ByteTrack | 66.8 | 71.5 | 142 | 90 |
| LAG-50 ms (fixed delay) | Resi-YOLO | ByteTrack | 64.3 | 69.1 | 173 | >100 |
| LAG-50 ms (fixed delay) | Resi-YOLO | MCU + TSMR | 66.1 | 70.8 | 155 | >100 |
| JITTER-20 ms (+drop) | Resi-YOLO | ByteTrack | 61.9 | 65.7 | 221 | Variable |
| JITTER-20 ms (+drop) | Resi-YOLO | MCU + TSMR | 63.4 | 67.2 | 189 | Variable |
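The LAG/JITTER conditions in the table above can be reproduced offline by replaying timestamped frames through a link-impairment model. The sketch below is illustrative (parameter names and the replay interface are assumptions, not the paper's harness), but it captures the two effects the tracker must survive: delayed arrivals and out-of-order/missing frames:

```python
import random

def impair_stream(frames, fixed_delay_ms=0.0, jitter_ms=0.0, drop_prob=0.0, seed=0):
    """Replay (t_ms, frame) pairs under a LAG/JITTER-style link impairment:
    a fixed delay, uniform jitter, and random frame drops. Under jitter,
    arrival order can invert, which is what stresses the tracker's
    data-association step."""
    rng = random.Random(seed)  # seeded for reproducible experiments
    out = []
    for t, f in frames:
        if rng.random() < drop_prob:
            continue  # dropped frame never reaches the tracker
        arrival = t + fixed_delay_ms + rng.uniform(-jitter_ms, jitter_ms)
        out.append((arrival, f))
    return sorted(out)  # the tracker consumes frames in arrival order

# LAG-50 ms: every frame shifted by a constant 50 ms, order preserved.
frames = [(i * 33.0, i) for i in range(100)]  # ~30 FPS capture timestamps
lagged = impair_stream(frames, fixed_delay_ms=50.0)
# JITTER-20 ms (+drop): per-frame jitter plus 10% random loss.
jittered = impair_stream(frames, jitter_ms=20.0, drop_prob=0.1, seed=1)
```

Feeding the same detection sequence through both conditions and comparing MOTA/IDF1/IDS against the clean stream is how rows like LAG-50 ms and JITTER-20 ms in the table can be produced repeatably.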
| Model | Core Technique | mAP@0.5 | APsmall | FPS (Orin Nano) | Glare Robustness | Fault-Tolerance Design |
|---|---|---|---|---|---|---|
| S3Det | Feedback Cut-and-Paste Augmentation | 73.9% | 39.4% | ~10.2 | Medium | None |
| YOLOv12n | Area Attention | 62.4% | 24.1% | ~18.5 | Low | None |
| YOLO11n-Pico | Context Transformer | 54.8% | 21.5% | ~25.0 | Medium | None |
| MambaYOLO | Linear State-Space Model (SSM) | 59.2% | 23.8% | ~15.5 | Medium | None |
| Resi-YOLO (Ours) | P2 + CBAM + Heterogeneous Dual-Brain | 65.1% | 31.5% | 12.8 † | High | TSMR + MCU |

† End-to-end pipeline FPS; TensorRT-only inference exceeds 30 FPS (Section 6.4).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Tsai, S.-E.; Hsieh, C.-H. Glare-Aware Resi-YOLO: Tiny-Vessel Detection with Dual-Brain Edge Deployment for Maritime UAVs. Drones 2026, 10, 226. https://doi.org/10.3390/drones10030226