Edge-Enhanced YOLOV8 for Spacecraft Instance Segmentation in Cloud-Edge IoT Environments
Abstract
1. Introduction
- (1) The backbone of the YOLOV8 model [4] is enhanced with the CSPPF module, which combines pyramid pooling operations with a cross-stage merging strategy to fuse multi-scale feature maps. This strengthens the backbone network’s ability to extract information from feature maps and reduces missed and false detections in multi-target environments.
- (2) The head of the improved YOLOV8 model uses the WDIOU loss function as its primary box regression loss. Its non-linear, dynamic focusing strategy improves the precision with which the model locates target objects and the quality of the masks in instance segmentation tasks, which addresses limitations of the original YOLOV8 model and speeds up convergence.
- (3) To the best of our knowledge, this is the first lightweight instance segmentation model designed for spacecraft imagery within a cloud-edge IoT architecture, and it balances accuracy, speed, and deployability.
2. Related Works
2.1. Cloud-Edge IoT and Edge Intelligence
2.2. Instance Segmentation Methods
2.2.1. Two-Stage Methods
2.2.2. One-Stage Methods
3. Proposed Model
3.1. Cloud-Edge IoT Architecture for Spacecraft Monitoring
3.2. Algorithm Details
3.2.1. CSPPF Module
- (1) Let the input image be X. X passes through convolution, batch normalization, and SiLU activation to obtain Z1. This step extracts complex, high-order features and provides the basis for the subsequent feature fusion. The process is given in Equation (1).
- (2) Z1 is convolved, batch normalized, and SiLU activated, and three down-sampling operations then produce H1, H2, and H3. Finally, H1, H2, and H3 are concatenated to obtain H4. The process is given in Equation (2).
- (3) H1, H2, H3, and Z1 are concatenated and then passed through convolution, batch normalization, and SiLU activation to obtain H5. The process is given in Equation (3).
- (4) The input X passes through convolution, batch normalization, and SiLU activation and is concatenated with H5. The result is then convolved, batch normalized, and SiLU activated to obtain H6. The process is given in Equation (4).
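The four steps above can be written compactly as follows. This is only one possible reading of the prose description, not the paper's own equations: the intermediate symbol Z_2, the operator name Pool, the chaining of the three pooling operations (as in the standard SPPF block), and the use of a separate convolution branch on X in step 4 are all assumptions. Because H1, H2, and H3 are concatenated with Z1, the pooling is also assumed to preserve spatial size (stride 1 with padding).

```latex
\begin{align*}
Z_1 &= \mathrm{SiLU}\big(\mathrm{BN}(\mathrm{Conv}(X))\big) && \text{(step 1)}\\
Z_2 &= \mathrm{SiLU}\big(\mathrm{BN}(\mathrm{Conv}(Z_1))\big) && \text{(step 2, assumed intermediate)}\\
H_1 &= \mathrm{Pool}(Z_2),\quad H_2 = \mathrm{Pool}(H_1),\quad H_3 = \mathrm{Pool}(H_2) && \text{(step 2, assumed chaining)}\\
H_4 &= \mathrm{Concat}(H_1, H_2, H_3) && \text{(step 2)}\\
H_5 &= \mathrm{SiLU}\big(\mathrm{BN}(\mathrm{Conv}(\mathrm{Concat}(H_1, H_2, H_3, Z_1)))\big) && \text{(step 3)}\\
H_6 &= \mathrm{SiLU}\big(\mathrm{BN}(\mathrm{Conv}(\mathrm{Concat}(\mathrm{SiLU}(\mathrm{BN}(\mathrm{Conv}(X))),\, H_5)))\big) && \text{(step 4)}
\end{align*}
```

Under this reading, the concatenation in step 3 is equivalent to Concat(H_4, Z_1), and the branch taken directly from X in step 4 plays the role of the cross-stage shortcut.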
3.2.2. WDIOU Loss Function
4. Experiment
4.1. Experimental Parameters
4.2. Evaluation Metrics
4.3. Experimental Dataset
4.4. Ablation Study
| Model | Epochs | P_mask (%) | AR (%) | mAP_mask@0.5:0.95 (%) |
|---|---|---|---|---|
| YOLOV8 + CIOU | 150 | 91.8 | 88.8 | 91.9 |
| YOLOV8 + WDIOU | 150 | 92.4 | 89.5 | 92.3 |
| YOLOV8 + WDIOU + CSPPF | 150 | 93.9 | 90.1 | 93.6 |
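To make the contribution of each component explicit, the short sketch below computes the differences between the rows of the ablation table. The numbers are copied from the table; the dictionary layout and the helper name `gains` are illustrative choices only.

```python
# Ablation rows copied from the table above:
# (mask precision %, average recall %, mask mAP@0.5:0.95 %)
ablation = {
    "YOLOV8 + CIOU": (91.8, 88.8, 91.9),
    "YOLOV8 + WDIOU": (92.4, 89.5, 92.3),
    "YOLOV8 + WDIOU + CSPPF": (93.9, 90.1, 93.6),
}


def gains(before: str, after: str) -> tuple:
    """Percentage-point difference (after - before) for each metric."""
    return tuple(
        round(a - b, 1) for b, a in zip(ablation[before], ablation[after])
    )


# Effect of replacing CIOU with WDIOU.
wdiou_gain = gains("YOLOV8 + CIOU", "YOLOV8 + WDIOU")
# Additional effect of adding the CSPPF module.
csppf_gain = gains("YOLOV8 + WDIOU", "YOLOV8 + WDIOU + CSPPF")
# Combined effect of both changes over the baseline.
total_gain = gains("YOLOV8 + CIOU", "YOLOV8 + WDIOU + CSPPF")

print("WDIOU:", wdiou_gain)
print("CSPPF:", csppf_gain)
print("Total:", total_gain)
```

Read this way, the loss change accounts for 0.4 points of mask mAP and the CSPPF module for a further 1.3 points, giving 1.7 points in total over the baseline.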
4.5. Comparison of Different Models
| Model | Epochs | P_mask (%) | AR (%) | mAP_mask@0.5:0.95 (%) |
|---|---|---|---|---|
| Yolact | 150 | 71.8 | 70.0 | 70.5 |
| Yolact++ | 150 | 79.5 | 72.5 | 75.9 |
| YOLOV5 | 150 | 90.5 | 80.0 | 66.3 |
| YOLOV9 | 150 | 93.0 | 87.2 | 74.4 |
| YOLOV12 | 150 | 93.2 | 88.4 | 92.7 |
| YOLOV8 (baseline) | 150 | 91.8 | 88.8 | 91.9 |
| Ours | 150 | 93.9 | 90.1 | 93.6 |

| Model | Layers | Parameters (M) | GFLOPs | Inference Time (ms) |
|---|---|---|---|---|
| YOLOV8 | 151 | 3.41 | 12.1 | 38.6 |
| YOLOV9 | 380 | 2.78 | 14.9 | 29.3 |
| YOLOV12 | 294 | 2.86 | 9.9 | 22.7 |
| Ours | 165 | 4.05 | 11.7 | 26.3 |
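The inference times in the efficiency table can be converted to throughput and to a relative latency change against the YOLOV8 baseline. The sketch below does only that arithmetic; the values are copied from the table, and the helper names `fps` and `latency_reduction_pct` are illustrative.

```python
# Inference time per image in milliseconds, copied from the table above.
inference_ms = {
    "YOLOV8": 38.6,
    "YOLOV9": 29.3,
    "YOLOV12": 22.7,
    "Ours": 26.3,
}


def fps(model: str) -> float:
    """Frames per second implied by the per-image inference time."""
    return 1000.0 / inference_ms[model]


def latency_reduction_pct(model: str, baseline: str = "YOLOV8") -> float:
    """Relative reduction in inference time versus the baseline, in percent."""
    base = inference_ms[baseline]
    return (base - inference_ms[model]) / base * 100.0


for name in inference_ms:
    print(f"{name}: {fps(name):.1f} FPS, "
          f"{latency_reduction_pct(name):.1f}% faster than YOLOV8")
```

For the proposed model this gives roughly 38 frames per second and a latency reduction of about 32% relative to the baseline, while YOLOV12 remains the fastest model in the table.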
4.6. Comparison of Model Prediction Effects
4.7. Training Process Analysis
4.8. Simulation Results and Analysis Based on EdgeCloudSim
4.8.1. Simulation Environment Configuration
- (1) Key computational characteristics of the algorithm presented in this paper were measured during actual inference and quantified: the computational load of a single inference (in millions of instructions, MI), memory usage (in megabytes, MB), and inference time (in milliseconds, ms).
- (2) The task generator and related configuration files of the EdgeCloudSim simulation platform were modified so that the characteristics measured in the previous step were injected as parameters into the simulation task model. This ensured that the computational load of each task in the simulation environment was numerically aligned with the actual behavior of the algorithm.
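The relationship between the measured quantities in the two steps above can be sketched as follows. This is a minimal illustration and not EdgeCloudSim code: the class `TaskProfile`, both function names, the MIPS rating of 10,000, and the memory figure of 512 MB are hypothetical; only the 26.3 ms inference time is taken from the efficiency table. The sketch assumes the usual convention that task length in MI equals the device's MIPS rating multiplied by the execution time in seconds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Per-inference characteristics to inject into a simulated task model."""
    length_mi: float     # computational load, millions of instructions
    memory_mb: float     # memory usage, megabytes
    inference_ms: float  # measured inference time, milliseconds


def profile_from_measurement(inference_ms: float, memory_mb: float,
                             device_mips: float) -> TaskProfile:
    """Derive task length from measured time: MI = MIPS * seconds."""
    if inference_ms <= 0 or device_mips <= 0:
        raise ValueError("inference_ms and device_mips must be positive")
    length_mi = device_mips * inference_ms / 1000.0
    return TaskProfile(length_mi, memory_mb, inference_ms)


def simulated_time_ms(profile: TaskProfile, vm_mips: float) -> float:
    """Execution time of the task on a simulated VM with the given MIPS."""
    return profile.length_mi / vm_mips * 1000.0


# Hypothetical device rating and memory footprint; measured time from the table.
profile = profile_from_measurement(inference_ms=26.3, memory_mb=512.0,
                                   device_mips=10_000.0)
same_vm = simulated_time_ms(profile, vm_mips=10_000.0)
faster_vm = simulated_time_ms(profile, vm_mips=20_000.0)
print(profile.length_mi, same_vm, faster_vm)
```

On a simulated VM with the same MIPS rating as the measurement device, the simulated execution time reproduces the measured inference time, which is the sense in which the simulated load is numerically aligned with the real algorithm.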
4.8.2. Core Metrics Analysis
- (1) Impact of the Number of Devices on the Simulation Results
- (2) Performance Comparison of Scheduling Strategies
| Strategy | Failure Rate (%) | Service Time (s) | Processing Time (s) | Average Network Latency (s) | Server Utilization |
|---|---|---|---|---|---|
| WORST_FIT | 0.072 | 0.112 | 0.043 | 0.0681 | 0.098 |
| NEXT_FIT | 0.105 | 0.158 | 0.081 | 0.0679 | 0.156 |
| RANDOM_FIT | 0.098 | 0.153 | 0.080 | 0.0683 | 0.148 |
| FIRST_FIT | 0.365 | 0.574 | 0.510 | 0.0682 | 1.387 |
| BEST_FIT | 0.378 | 0.582 | 0.522 | 0.0680 | 1.453 |
- (3) Verification of Key Configuration Rationality
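The five strategy names compared in the scheduling table above correspond to classic bin-packing placement rules. The sketch below shows the textbook form of each rule for choosing a VM by remaining capacity. It is an illustration under that assumption, not EdgeCloudSim's implementation, whose details may differ; the function names and the example capacities are hypothetical.

```python
import random
from typing import List, Optional


def first_fit(free: List[float], demand: float) -> Optional[int]:
    """Pick the first VM with enough free capacity."""
    for i, capacity in enumerate(free):
        if capacity >= demand:
            return i
    return None


def best_fit(free: List[float], demand: float) -> Optional[int]:
    """Pick the feasible VM with the least free capacity (tightest fit)."""
    feasible = [i for i, capacity in enumerate(free) if capacity >= demand]
    return min(feasible, key=lambda i: free[i]) if feasible else None


def worst_fit(free: List[float], demand: float) -> Optional[int]:
    """Pick the feasible VM with the most free capacity (spreads the load)."""
    feasible = [i for i, capacity in enumerate(free) if capacity >= demand]
    return max(feasible, key=lambda i: free[i]) if feasible else None


def next_fit(free: List[float], demand: float, start: int) -> Optional[int]:
    """Scan circularly from `start`, e.g. the slot after the last placement."""
    n = len(free)
    for step in range(n):
        i = (start + step) % n
        if free[i] >= demand:
            return i
    return None


def random_fit(free: List[float], demand: float,
               rng: random.Random) -> Optional[int]:
    """Pick uniformly at random among the feasible VMs."""
    feasible = [i for i, capacity in enumerate(free) if capacity >= demand]
    return rng.choice(feasible) if feasible else None


# Hypothetical free capacity (in percent) of three VMs and one task demand.
free_capacity = [30.0, 80.0, 50.0]
demand = 40.0
print(first_fit(free_capacity, demand),
      best_fit(free_capacity, demand),
      worst_fit(free_capacity, demand),
      next_fit(free_capacity, demand, start=2),
      random_fit(free_capacity, demand, random.Random(0)))
```

First-fit and best-fit tend to concentrate tasks on a few VMs, whereas worst-fit spreads them out. That difference is one plausible reading of why FIRST_FIT and BEST_FIT show much higher processing times and utilization in the table than WORST_FIT.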
4.8.3. Bottleneck Analysis and Mitigation Strategies for the Three-Layer Architecture
- (1) Edge Node Resource Saturation Bottleneck
- (2) Mobile-Edge Network Congestion Bottleneck
- (3) Cloud Fallback Latency Bottleneck
- (4) Control Flow Latency and Reliability Bottleneck. Issue: The control flow, which transmits updated models and command signals from the cloud to edge nodes, can suffer from high latency, packet loss, or security vulnerabilities, especially over long-distance space communication links. Delays or failures in receiving updated models or control commands can hinder the system’s ability to adapt to new scenarios, execute time-sensitive maneuvers, or respond to anomalies in a timely manner. Mitigation: To ensure robust and timely control flow, several strategies can be adopted:
  - (a) Prioritized and Predictable Scheduling: Implement priority-based transmission scheduling for control messages so that they are delivered ahead of routine data traffic.
  - (b) Reliable Transmission Protocols: Use reliable transport protocols (e.g., CCSDS with retransmission mechanisms) or forward error correction (FEC) to improve delivery reliability over lossy channels.
  - (c) Incremental and Compressed Updates: Instead of transmitting full model weights, employ incremental updates, model patching, or delta compression to reduce the size of control packets, thereby lowering transmission time and bandwidth consumption.
  - (d) Edge-side Model Validation and Rollback: Implement versioning and validation mechanisms at the edge to safely integrate new models or commands, with the ability to roll back to a previous stable state if an update causes instability.
- (5) Memory Resource Fragmentation Bottleneck
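The edge-side validation and rollback strategy listed under item (4)(d) above can be sketched as a small version store. This is a minimal illustration under stated assumptions: the class `EdgeModelStore`, its method names, and the use of a SHA-256 checksum as the validation step are hypothetical choices, and a real system would add functional checks (for example, accuracy on a held-out set) before activating a model.

```python
import hashlib
from typing import List, Tuple


class EdgeModelStore:
    """Keeps the history of accepted model versions on an edge node."""

    def __init__(self, version: str, weights: bytes) -> None:
        self._history: List[Tuple[str, bytes]] = [(version, weights)]

    @property
    def active_version(self) -> str:
        """Version label of the model currently in use."""
        return self._history[-1][0]

    def apply_update(self, version: str, weights: bytes,
                     expected_sha256: str) -> bool:
        """Activate the update only if its checksum matches the announced one."""
        if hashlib.sha256(weights).hexdigest() != expected_sha256:
            return False  # corrupted or tampered payload: keep current model
        self._history.append((version, weights))
        return True

    def rollback(self) -> bool:
        """Return to the previous version; the initial one is never dropped."""
        if len(self._history) < 2:
            return False
        self._history.pop()
        return True


store = EdgeModelStore("v1", b"baseline-weights")
new_weights = b"updated-weights"
digest = hashlib.sha256(new_weights).hexdigest()

# A payload damaged in transit fails validation and is ignored.
rejected = store.apply_update("v2", b"corrupted-payload", digest)
# The intact payload passes validation and becomes the active model.
accepted = store.apply_update("v2", new_weights, digest)
print(rejected, accepted, store.active_version)
```

If the new model later proves unstable, calling `store.rollback()` restores the previous version without another transmission over the space link.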
4.8.4. Summary
4.9. Limitations, Open Questions and Future Directions
- (1) This limitation reveals several open questions for the research community:
  - (a) Dynamic Feature Extraction: How can we design lightweight, adaptive neural modules that dynamically adjust their receptive fields and fusion strategies based on real-time scene complexity (e.g., target count, background clutter)?
  - (b) Edge-Scene Awareness: Is it feasible to deploy an ultra-lightweight scene classifier on edge devices to guide the selection or reconfiguration of downstream vision models, optimizing the accuracy-efficiency balance on a per-input basis?
  - (c) Quantifying Scene Complexity for Space: How can we rigorously define and quantify “scene complexity” in orbital imagery? Metrics based on target density, texture entropy, or semantic clutter could form the basis for adaptive algorithms.
  - (d) Task-Aware Loss Functions: Can loss functions be designed to dynamically weight learning objectives based on inferred scene characteristics, focusing more on localization in simple scenes and on discrimination in complex ones?
- (2) Towards Deployment on Space-Qualified Hardware. The promising efficiency metrics and accuracy obtained on a constrained edge GPU (RTX 4060) establish a strong foundation for the next logical step: assessing the feasibility of deployment on lightweight, space-qualified embedded hardware. Future work must move from algorithmic validation to system-level implementation studies. This involves:
  - (a) Hardware Selection and Profiling: Benchmarking the model on representative aerospace processors (e.g., radiation-hardened GPUs, FPGAs, or SoCs like NVIDIA Jetson Orin variants) to measure real-time throughput, power consumption, and thermal profiles under simulated orbital conditions.
  - (b) Model Co-optimization: Applying further deployment-oriented optimizations such as int8 quantization, pruning, and knowledge distillation to meet stringent memory and latency budgets without compromising critical accuracy.
  - (c) System Integration and Testing: Evaluating the model within a hardware-in-the-loop (HIL) simulation framework that includes sensor inputs (e.g., camera feeds), downstream tasks (e.g., pose estimation), and the spaceborne edge computing stack.

Success in this endeavor would bridge the gap between a high-performance algorithm and a field-deployable, intelligent component for autonomous on-orbit services, ultimately testing the core hypothesis of cloud-edge IoT in the most demanding operational environment.
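As an illustration of the int8 quantization mentioned under Model Co-optimization above, the sketch below applies symmetric per-tensor quantization to a short list of weights. The scheme, the function names, and the example values are assumptions made for illustration; a real deployment would rely on the quantization tooling of the target hardware and on calibration data.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to the range [-127, 127].

    Each value is mapped to round(value / scale), where the scale is
    chosen so that the largest magnitude maps to 127.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to show the mapping.
weights = [0.50, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of four, and the rounding error per weight is bounded by half the scale. Whether that error is acceptable for mask quality would have to be checked empirically on the spacecraft dataset.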
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Chu, G.L. Study on the Key Technologies of Automatic Identification for Cooperative Target on Spacecraft. Ph.D. Thesis, University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics), Changchun, China, 2015.
- Cui, N.G.; Wang, P.; Guo, J.F. A Review of On-Orbit Servicing. J. Astronaut. 2007, 28, 805–811.
- Ling, L.X. Development of Space Rendezvous and Docking Technology in Past 40 Years. Spacecr. Eng. 2007, 16, 70–77.
- Yaseen, M. What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector. arXiv 2023, arXiv:2409.07813.
- Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge computing: Vision and challenges. IEEE Internet Things J. 2016, 3, 637–646.
- Mao, Y.; You, C.; Zhang, J.; Huang, K.; Letaief, K.B. A survey on mobile edge computing: The communication perspective. IEEE Commun. Surv. Tutor. 2017, 19, 2322–2358.
- Zhou, Z.; Chen, X.; Li, E.; Zeng, L.; Luo, K.; Zhang, J. Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proc. IEEE 2019, 107, 1738–1762.
- Huang, T.; Li, H.; Zhou, G.; Li, S.B.; Wang, Y. Survey of Research on Instance Segmentation Methods. J. Front. Comput. Sci. Technol. 2023, 17, 810–825.
- Wu, T.; Yang, X.; Song, B.; Wang, N.; Gao, X.; Kuang, L.; Nan, X.; Chen, Y.; Yang, D. T-SCNN: A two-stage convolutional neural network for space target recognition. In IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium; IEEE: New York, NY, USA, 2019; pp. 1334–1337.
- Armstrong, W.; Draktontaidis, S.; Lui, N. Semantic Image Segmentation of Imagery of Unmanned Spacecraft Using Synthetic Data; Technical Report; IEEE: New York, NY, USA, 2021.
- Hariharan, B.; Arbeláez, P.; Girshick, R.; Malik, J. Simultaneous detection and segmentation. In Computer Vision–ECCV 2014: 13th European Conference, Proceedings, Part VII 13, Zurich, Switzerland, 6–12 September 2014; Springer International Publishing: Berlin/Heidelberg, Germany, 2014; pp. 297–312.
- Arbeláez, P.; Pont-Tuset, J.; Barron, J.T.; Marques, F.; Malik, J. Multiscale combinatorial grouping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 328–335.
- Dai, J.; He, K.; Li, Y.; Ren, S.; Sun, J. Instance-sensitive fully convolutional networks. In Computer Vision–ECCV 2016: 14th European Conference, Proceedings, Part VI 14, Amsterdam, The Netherlands, 11–14 October 2016; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 534–549.
- Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 1–27.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In IEEE International Conference on Computer Vision (ICCV); IEEE: Piscataway, NJ, USA, 2017; pp. 2961–2969.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems (NIPS); Neural Information Processing Systems Foundation, Inc.: San Diego, CA, USA, 2015; p. 28.
- Gao, N.; Shan, Y.; Wang, Y.; Zhao, X.; Yu, Y.; Yang, M.; Huang, K. Ssap: Single-shot instance segmentation with affinity pyramid. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 642–651.
- Ke, L.; Danelljan, M.; Li, X.; Tai, Y.W.; Tang, C.K.; Yu, F. Mask transfiner for high-quality instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 4412–4421.
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. Yolact: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9157–9166.
- Hurtik, P.; Molek, V.; Hula, J.; Vajgl, M.; Vlasanek, P.; Nejezchleba, T. Poly-YOLO: Higher speed, more precise detection and instance segmentation for YOLOv3. Neural Comput. Appl. 2022, 34, 8275–8290.
- He, J.; Li, P.; Geng, Y.; Xie, X. Fastinst: A simple query-based model for real-time instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 23663–23672.
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. In Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 9–15 December 2024; pp. 107984–108011.
- Zhao, F.; Shao, X.L.; Wang, J.Q.; Chen, Y.J.; Xi, D.H.; Liu, Y.Y.; Chen, J.D.; Sasaki, J.; Mizuno, K. A novel underwater Holothurians monitoring system using consumer-grade amphibious UAV with Mamba-based Super-Resolution Reconstruction and enhanced YOLOv10. Mar. Environ. Res. 2025, 212, 107510.
- Wang, S. Automated non-PPE detection on construction sites using YOLOv10 and transformer architectures for surveillance and body worn cameras with benchmark datasets. Sci. Rep. 2025, 15, 27043.
- Ammar, M. Enhancing real-time instance segmentation for plant disease detection with improved YOLOv8-Seg algorithm. Int. J. Inf. Technol. Secur. 2024, 16, 27–38.
- Ma, J.; Li, F.F.; Wang, B. U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation. arXiv 2024, arXiv:2401.04722.
- Zhao, F.; Xu, D.; Ren, Z.; Shao, X.; Wu, Q.; Liu, Y.; Mizuno, K. Mamba-based super-resolution and semi-supervised YOLOv10 for freshwater mussel detection using acoustic video camera: A case study at Lake Izunuma, Japan. Ecol. Inform. 2025, 90, 103324.
- You, S.; Li, B.; Chen, Y.; Ren, Z.; Liu, Y.; Wu, Q.; Zhao, F. Rose-Mamba-YOLO: An enhanced framework for efficient and accurate greenhouse rose monitoring. Front. Plant Sci. 2025, 16, 1607582.
- Chen, M.; Chen, W.J.; Niu, Y.F.; Qi, P.; Wang, F.C. Yolov8_Pro_Cssp. [Computer Software, GitHub Repository]. 2025. Available online: https://github.com/cehndashuai/yolov8_pro_cssp.git (accessed on 13 January 2026).
- Dung, H.A.; Chen, B.; Chin, T.J. A spacecraft dataset for detection, segmentation and parts recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2012–2019.
- Bolya, D.; Zhou, C.; Xiao, F.Y. YOLACT++: Better Real-time Instance Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 1108–1121.
- Ultralytics. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 13 January 2026).
- Wang, C.Y.; Yeh, I.H.; Mark Liao, H.Y. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2024; pp. 1–21.
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Chen, M.; Chen, W.; Niu, Y.; Qi, P.; Wang, F. Edge-Enhanced YOLOV8 for Spacecraft Instance Segmentation in Cloud-Edge IoT Environments. Future Internet 2026, 18, 59. https://doi.org/10.3390/fi18010059
