A Shape-Aware Lightweight Framework for Real-Time Object Detection in Nuclear Medicine Imaging Equipment
Abstract
1. Introduction
1.1. Challenges and Motivation
- Real-Time Detection in Dynamic Environments: In the context of robotic-assisted calibration tasks, the robot must continuously capture images during its movement and identify the target equipment in real time. This requires a highly efficient detection algorithm that balances detection speed and accuracy. However, existing object detection methods often struggle to maintain high performance under real-time constraints, particularly on resource-constrained embedded devices [11,12].
- Limited Training Data Availability: Acquiring labeled images of nuclear medicine imaging equipment is challenging due to the specialized nature of imaging environments and the substantial costs of data collection [13,14]. Current datasets contain approximately 800 images, far below the requirements of modern deep learning-based detection models [15]. This shortage of high-quality training data remains a fundamental obstacle in the field, and the lack of representative samples ultimately degrades model performance in real-world settings.
- Lightweight Deployment for Embedded Systems: Robotic systems used in medical environments often operate under strict computational and power constraints. Deploying a high-performance detection algorithm therefore requires lightweight models with reduced memory and processing demands [16,17]. While many models have been optimized for general embedded applications, balancing lightweight deployment with high detection performance remains a significant challenge [18,19].
1.2. Research Objectives and Contributions
- Enhancing Detection Accuracy with Shape-Aware Modules: Develop and integrate a Shape-Aware module to improve detection of nuclear medicine imaging equipment characterized by circular PET/CT gantries, and optimize the detection head with techniques such as deformable convolution and anchor-free structures to better capture large, shape-specific targets.
- Addressing Data Scarcity with Few-Shot Learning: Apply GAN-based data augmentation to synthesize additional training images, enlarging the dataset without compromising the appearance or structural details of the target equipment [20,21], and employ transfer learning with pre-trained weights to improve performance and generalization on the small-scale dataset.
- Achieving Real-Time Performance in Embedded Systems: Optimize inference speed through model compression strategies, including pruning and quantization, to lower the computational burden, and apply TensorRT acceleration so that the framework meets the real-time requirements of robotic-assisted tasks in resource-constrained environments.
- Validating the Framework Through Comprehensive Experiments: Assess the proposed framework on a benchmark dataset of nuclear medicine imaging equipment, evaluating detection accuracy (mAP), speed (FPS), and resource efficiency (model size and memory usage), and examine its efficacy in real-world embedded applications.
2. Related Works
2.1. Object Detection and Lightweight Models
2.2. Shape-Aware Attention Mechanisms in Detection
3. Methods
3.1. Overall Pipeline and Data Flow
3.2. Backbone and Head Structure Optimization
3.3. Shape-Aware Module Design
1. Edge extraction.
2. Shape scoring.
3. Attention fusion.
Algorithm 1: Shape-Aware Module Processing Pipeline
Input: image I, threshold τ
Output: attention mask M
1   // Edge extraction
2   Kx = I ∗ Sobel_x; Ky = I ∗ Sobel_y
3   for each pixel (x, y) do
4       E(x, y) = √(Kx(x, y)² + Ky(x, y)²)
5   end
6   normalize E to [0, 1]
7   for each pixel (x, y) do
8       if E(x, y) < τ then E(x, y) = 0
9   end
10  // Shape scoring
11  identify connected components C from the thresholded E
12  for each component c ∈ C do
13      A_c = area(c); P_c = perimeter(c)
14      C_c = (4 · π · A_c) / P_c²   // circularity score
15      for each pixel (x, y) in c do
16          S(x, y) = C_c
17      end
18  end
19  normalize S to [0, 1]
20  // Attention fusion
21  for each pixel (x, y) do
22      M(x, y) = σ(W_E · E(x, y) + W_S · S(x, y))
23  end
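A minimal NumPy/OpenCV sketch of Algorithm 1 is given below, assuming σ is the logistic sigmoid and W_E, W_S are fixed scalar fusion weights; the threshold and weight values shown are illustrative defaults, not the trained settings.

```python
import cv2
import numpy as np

def shape_aware_mask(image, tau=0.2, w_e=0.5, w_s=0.5):
    """Sketch of Algorithm 1: edge extraction, circularity scoring, attention fusion.

    `image` is an H x W x 3 BGR uint8 array; `tau`, `w_e`, `w_s` are illustrative.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Edge extraction: Sobel gradient magnitude, normalised to [0, 1], then thresholded
    kx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    ky = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    edge = np.sqrt(kx ** 2 + ky ** 2)
    edge /= edge.max() + 1e-8
    edge[edge < tau] = 0.0

    # Shape scoring: circularity 4*pi*A / P^2 assigned to every pixel of each component
    binary = (edge > 0).astype(np.uint8)
    num_labels, labels = cv2.connectedComponents(binary)
    shape_score = np.zeros_like(edge)
    for label in range(1, num_labels):
        component = (labels == label).astype(np.uint8)
        area = float(component.sum())
        contours, _ = cv2.findContours(component, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        perimeter = sum(cv2.arcLength(c, True) for c in contours)
        if perimeter > 0:
            shape_score[labels == label] = 4.0 * np.pi * area / (perimeter ** 2)
    shape_score /= shape_score.max() + 1e-8

    # Attention fusion: sigmoid of the weighted sum of the edge and shape maps
    return 1.0 / (1.0 + np.exp(-(w_e * edge + w_s * shape_score)))
```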
3.4. Data Augmentation and Transfer Learning Strategies
- Mosaic Augmentation and Horizontal Flip: We applied Mosaic augmentation, which combines four different training images into one during training. In implementation, a canvas is divided into four quadrants, and four images are scaled and placed into the quadrants to form a single mosaic image. The bounding box coordinates from each constituent image are scaled and translated into the mosaic's coordinate space to preserve correct localization labels. Mosaic augmentation lets the network learn to recognize objects against a wide range of surroundings, even outside their usual context, and has been shown to improve detection robustness and reduce sensitivity to background clutter. Because each mosaic exposes the model to more diverse contexts per image, training effectively benefits as if a larger batch of images were used. For additional geometric variety, we also applied random horizontal flipping: with 50% probability, an image was mirrored along the vertical axis. This simple augmentation doubles the apparent dataset size and helps the model learn orientation-invariant features, making the detector less biased toward any particular view; horizontal flipping is widely recognized as one of the most effective augmentations in data-limited settings [51].
- Random Occlusion: To simulate occlusions, a random erasing augmentation was applied. By occluding portions of both the object and its background, the network must learn to infer object characteristics from incomplete visual cues. This masking acts as a regularizer, discouraging the network from fixating on any single region or feature and thereby helping to suppress overfitting. The occlusion augmentation improves robustness to real-world scenarios where the object of interest may be partially obscured by other structures or imaging artifacts.
- Brightness Adjustment: Random brightness adjustments were included as a photometric augmentation. During preprocessing, we applied random brightness jitter to each image, scaling pixel intensities by a factor drawn uniformly from 0.8 to 1.2. By training on both brighter and darker versions of images, the detector becomes more invariant to lighting changes, improving generalization to images with different exposure levels or contrasts, which is important given the limited number of training examples. A minimal sketch of the flip, occlusion, and brightness augmentations is given after this list.
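The following sketch illustrates the flip, brightness, and occlusion augmentations described above (Mosaic is omitted here since it combines four samples and their labels). The probabilities, jitter range, and erasing sizes are illustrative defaults, not the exact training configuration.

```python
import random
import numpy as np

def augment(image, boxes):
    """Flip, brightness-jitter, and random-erasing sketch.

    `image` is an H x W x 3 uint8 array; `boxes` is an (N, 4) array of
    [x_min, y_min, x_max, y_max] pixel coordinates.
    """
    h, w = image.shape[:2]
    image = image.astype(np.float32)
    boxes = boxes.copy().astype(np.float32)

    # Horizontal flip with 50% probability: mirror pixels and remap x coordinates
    if random.random() < 0.5:
        image = image[:, ::-1, :]
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]

    # Brightness jitter: scale intensities by a factor drawn uniformly from [0.8, 1.2]
    image = np.clip(image * random.uniform(0.8, 1.2), 0, 255)

    # Random erasing: occlude one rectangle with the image mean colour
    if random.random() < 0.5:
        eh = int(h * random.uniform(0.05, 0.2))
        ew = int(w * random.uniform(0.05, 0.2))
        y0, x0 = random.randint(0, h - eh), random.randint(0, w - ew)
        image[y0:y0 + eh, x0:x0 + ew, :] = image.mean(axis=(0, 1))

    return image.astype(np.uint8), boxes
```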
3.5. Deployment and Model Optimization on Target Hardware
3.5.1. Post-Training Quantization
3.5.2. Magnitude-Based Channel Pruning
3.5.3. TensorRT Conversion and GPU Acceleration
- Conversion of the PyTorch-trained YOLOv8n model to ONNX, including post-processing (NMS).
- Engine optimization via TensorRT, allowing mixed precision (FP16 and INT8), operation fusion, kernel auto-tuning, and application of INT8 calibration scales.
- Serialization of the optimized TensorRT engine, tailored for RTX 3060 tensor cores supporting INT8 arithmetic.
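The listing below is a hedged sketch of this conversion path, assuming the Ultralytics ONNX export API and the TensorRT Python builder; file names are placeholders, and full INT8 calibration would additionally require an IInt8EntropyCalibrator2 implementation fed with representative images to supply the per-tensor scales.

```python
import tensorrt as trt
from ultralytics import YOLO

# 1. Export the trained YOLOv8n checkpoint to ONNX (post-processing such as NMS
#    can be appended to the graph or kept on the host side).
YOLO("sam_yolov8n.pt").export(format="onnx", opset=12)  # hypothetical checkpoint name

# 2. Parse the ONNX graph and build an optimized engine with mixed precision.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("sam_yolov8n.onnx", "rb") as f:
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace
config.set_flag(trt.BuilderFlag.FP16)   # allow FP16 kernels
# INT8 would also need: config.set_flag(trt.BuilderFlag.INT8) plus a calibrator.

# 3. Serialize the optimized engine for deployment on the target GPU.
engine_bytes = builder.build_serialized_network(network, config)
with open("sam_yolov8n.engine", "wb") as f:
    f.write(engine_bytes)
```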
4. Experiments and Results
4.1. Experimental Environment and Setup
4.2. Dataset Construction and GAN-Based Augmentation
4.3. Evaluation Metrics and Implementation Details
4.3.1. Detection Accuracy Metrics
4.3.2. Efficiency Metrics
4.3.3. Training and Inference Details
4.4. Ablation Studies
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Yao, Y.; Tian, J.; Zhang, R.; Liu, P.; Gao, X.; Zhou, W.; Luo, L. Nanobody-Based Radiotracers in Nuclear Medicine: Advances, Challenges, and Future Perspectives. Chin. Chem. Lett. 2025. [Google Scholar] [CrossRef]
- Lee, J.; Kim, T. Current Status and Future Perspectives of Nuclear Medicine in Prostate Cancer from Imaging to Therapy: A Comprehensive Review. Biomedicines 2025, 13, 1132. [Google Scholar] [CrossRef]
- Petryakova, A.V.; Chipiga, L.A.; Vodovatov, A.V.; Smolyarchuk, M.Ya. Equipment Quality Control during Patient Radiation Protection Optimisation in Radionuclide Diagnostics. Radiatsionnaya Gygiena Radiat. Hyg. 2023, 16, 81–90. [Google Scholar] [CrossRef]
- Yang, W.-D.; Kang, F.; Chen, Y.; Zhu, Z.; Wang, F.; Qin, C.; Du, J.; Lan, X.; Wang, J. Landscape of Nuclear Medicine in China and Its Progress on Theranostics. J. Nucl. Med. 2024, 65, 29S–37S. [Google Scholar] [CrossRef] [PubMed]
- Bayareh-Mancilla, R.; Medina-Ramos, L.A.; Toriz-Vázquez, A.; Hernández-Rodríguez, Y.M.; Cigarroa-Mayorga, O.E. Automated Computer-Assisted Medical Decision-Making System Based on Morphological Shape and Skin Thickness Analysis for Asymmetry Detection in Mammographic Images. Diagnostics 2023, 13, 3440. [Google Scholar] [CrossRef]
- Hallab, R.; Eddaoui, K.; Aouad, N.B.R. The Quality Assurance for the PET/CT in Nuclear Medicine—Evaluation of the Daily Quality Control of The Positron Emission Tomography. Biomed. Pharmacol. J. 2022, 15, 1589–1595. [Google Scholar] [CrossRef]
- Ansari, S.; Ansari, A.; Shaikh, F.A.A.; Patne, S.; Sheikh, S.; Joseph, S.; Malik, M.A.A.S.; Khan, N. From Legacy Systems to Intelligent Automation: Bridging the Gap in Industrial Applications (A Survey). In Proceedings of the International Conference on Industrial Engineering and Operations Management, Hyderabad, India, 7–9 November 2024. [Google Scholar] [CrossRef]
- Noman, A.A.; Eva, A.N.; Yeahyea, T.B.; Khan, R. Computer Vision-Based Robotic Arm for Object Color, Shape, and Size Detection. J. Robot. Control JRC 2022, 3, 180–186. [Google Scholar] [CrossRef]
- Liu, J.; Liu, Z. The Vision-Based Target Recognition, Localization, and Control for Harvesting Robots: A Review. Int. J. Precis. Eng. Manuf. 2023, 25, 409–428. [Google Scholar] [CrossRef]
- Li, Y.; Liu, W.; Li, L.; Zhang, W.; Xu, J.; Jiao, H. Vision-Based Target Detection and Positioning Approach for Underwater Robots. IEEE Photonics J. 2023, 15, 1–12. [Google Scholar] [CrossRef]
- Wang, S. Real-Time Object Detection Using a Lightweight Two-Stage Detection Network with Efficient Data Representation. IECE Trans. Emerg. Top. Artif. Intell. 2024, 1, 17–30. [Google Scholar] [CrossRef]
- Singh, B.; Kumar, N.; Ahmed, I.; Yadav, K. Real-Time Object Detection Using Deep Learning. Int. J. Res. Appl. Sci. Eng. Technol. 2022, 10, 3159–3160. [Google Scholar] [CrossRef]
- Sommer, L.; Schumann, A.; Bouma, H.; Stokes, R.J.; Yitzhaky, Y.; Prabhu, R. Deep Learning-Based Drone Detection in Infrared Imagery with Limited Training Data. Secur. Def. 2020, 11542, 1154204. [Google Scholar] [CrossRef]
- Cao, S.; Konz, N.; Duncan, J.; Mazurowski, M.A. Deep Learning for Breast MRI Style Transfer with Limited Training Data. J. Digit. Imaging 2022, 36, 666–678. [Google Scholar] [CrossRef]
- Candemir, S.; Nguyen, X.V.; Folio, L.R.; Prevedello, L.M. Training Strategies for Radiology Deep Learning Models in Data-Limited Scenarios. Radiol. Artif. Intell. 2021, 3, e210014. [Google Scholar] [CrossRef]
- Liu, X.; Chen, Y.; Li, J.; Cangelosi, A. Real-Time Robotic Mirrored Behavior of Facial Expressions and Head Motions Based on Lightweight Networks. IEEE Internet Things J. 2022, 10, 1401–1413. [Google Scholar] [CrossRef]
- Mwitta, C.; Rains, G.C.; Prostko, E. Evaluation of Inference Performance of Deep Learning Models for Real-Time Weed Detection in an Embedded Computer. Ital. Natl. Conf. Sens. 2024, 2, 514. [Google Scholar] [CrossRef]
- Fathurrahman, A.; Bejo, A.; Ardiyanto, I. Lightweight Convolution Neural Network for Image-Based Malware Classification on Embedded Systems. In Proceedings of the 2021 International Seminar on Machine Learning, Optimization, and Data Science (ISMODE), Jakarta, Indonesia, 29–30 January 2022. [Google Scholar] [CrossRef]
- Sun, Q.; Li, P.; He, C.; Song, Q.; Chen, J.; Kong, X.; Luo, Z. A Lightweight and High-Precision Passion Fruit YOLO Detection Model for Deployment in Embedded Devices. Sensors 2024, 24, 4942. [Google Scholar] [CrossRef]
- Beckham, C.; Laradji, I.; López, P.R.; Vázquez, D.; Nowrouzezahrai, D.; Pal, C. Overcoming Challenges in Leveraging GANs for Few-Shot Data Augmentation. arXiv 2022, arXiv:2203.16662. [Google Scholar] [CrossRef]
- Li, S.; Yue, C.; Zhou, H. Few-Shot Face Recognition: Leveraging GAN for Effective Data Augmentation. Electronics 2025, 14, 2003. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- Jocher, G.R.; Liu, C.; Hogan, A.; Yu, L.; Rai, P.; Sullivan, T. Ultralytics/Yolov5: Initial Release; Zenodo: Genève, Switzerland, 2020. [Google Scholar] [CrossRef]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 213–229. [Google Scholar] [CrossRef]
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar] [CrossRef]
- Aboyomi, D.D.; Daniel, C. A Comparative Analysis of Modern Object Detection Algorithms: YOLO vs. SSD vs. Faster R-CNN. ITEJ Inf. Technol. Eng. J. 2023, 8, 96–106. [Google Scholar] [CrossRef]
- Dahirou, Z.; Zheng, M. Motion Detection and Object Detection: Yolo (You Only Look Once). In Proceedings of the 2021 7th Annual International Conference on Network and Information Systems for Computers (ICNISC), Guiyang, China, 23–25 July 2021; pp. 250–257. [Google Scholar] [CrossRef]
- Wang, X.; Li, H.; Yue, X.; Meng, L. A Comprehensive Survey on Object Detection YOLO. Int. Symp. Adv. Technol. Appl. Internet Things 2023, 1613, 0073. [Google Scholar]
- Ryspayeva, M.; Nishan, A. Enhancing Grayscale Image Synthesis with Deep Conditional GAN and Transfer Learning. In Proceedings of the 2024 IEEE AITU: Digital Generation, Astana, Kazakhstan, 3–4 April 2024. [Google Scholar] [CrossRef]
- Güçlü, E.; Akın, E.; Aydın, İ.; Topkaya, A.; Onan, M.; Şener, T.K. Real-Time Detection of Terminal Burn Defects Using YOLOv7 and TensorRT. In Proceedings of the 2024 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), Sakhir, Bahrain, 17–19 November 2024. [Google Scholar] [CrossRef]
- Belhaoua, A.; Kimpe, T.R.; Crul, S.; Rettmann, M.E.; Siewerdsen, J.H. TensorRT-Based Surgical Instrument Detection Assessment for Deep Learning on Edge Computing. In Proceedings of the Image-Guided Procedures, Robotic Interventions, and Modeling, San Diego, CA, USA, 18–22 February 2024. [Google Scholar] [CrossRef]
- B, G.P.; G, R.M.L.; Rishekeeshan, A.; Deekshitha. Accelerating Native Inference Model Performance in Edge Devices Using TensorRT. In Proceedings of the 2024 IEEE Recent Advances in Intelligent Computational Systems (RAICS), Kothamangalam, India, 16–18 May 2024. [Google Scholar] [CrossRef]
- Kandel, M.A.; Rizk, F.H.; Hongou, L.; Zaki, A.M.; Khan, H.; El-Kenawy, E.-S.M. Evaluating the Efficacy of Deep Learning Architectures in Predicting Traffic Patterns for Smart City Development. J. Artif. Intell. Metaheuristics 2023, 6, 26–35. [Google Scholar] [CrossRef]
- Oruganti, R.; Kumar N, S. Efficacy of Deep Learning Models in the Prognosis Task. ECS Trans. 2022, 107, 15207–15220. [Google Scholar] [CrossRef]
- Wang, W.; Chen, W.; Luo, Y.; Long, Y.; Lin, Z.; Zhang, L.; Lin, B.; Cai, D.; He, X. Model Compression and Efficient Inference for Large Language Models: A Survey. arXiv 2024. [Google Scholar] [CrossRef]
- Guo, J.; Xu, D.; Ouyang, W. Multidimensional Pruning and Its Extension: A Unified Framework for Model Compression. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 13056–13070. [Google Scholar] [CrossRef]
- Malihi, L.; Heidemann, G. Efficient and Controllable Model Compression through Sequential Knowledge Distillation and Pruning. Big Data Cogn. Comput. 2023, 7, 154. [Google Scholar] [CrossRef]
- Zheng, X.; Guan, Z.; Chen, Q.; Wen, G.; Lu, X. A Lightweight Road Traffic Sign Detection Algorithm Based on Adaptive Sparse Channel Pruning. Meas. Sci. Technol. 2024, 36, 016176. [Google Scholar] [CrossRef]
- Wu, Y.; Chen, J. A Lightweight Real-Time System for Object Detection in Enterprise Information Systems for Frequency-Based Feature Separation. Int. J. Semantic Web Inf. Syst. IJSWIS 2023, 19, 1–18. [Google Scholar] [CrossRef]
- Yun, Y.K.; Lin, W. Towards a Complete and Detail-Preserved Salient Object Detection. IEEE Trans. Multimed. 2023, 36, 016176. [Google Scholar] [CrossRef]
- Zeng, X.; Xu, M.; Hu, Y.; Tang, H.; Hu, Y.; Nie, L. Adaptive Edge-Aware Semantic Interaction Network for Salient Object Detection in Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16. [Google Scholar] [CrossRef]
- Cai, J.; Lin, S. A Novel Hybrid Model for Video Salient Object Detection. In Proceedings of the 2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC), Chongqing, China, 6–8 November 2020. [Google Scholar] [CrossRef]
- Chen, G.; Wang, Q.; Dong, B.; Ma, R.; Liu, N.; Fu, H.; Xia, Y. EM-Trans: Edge-Aware Multimodal Transformer for RGB-D Salient Object Detection. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 3175–3188. [Google Scholar] [CrossRef]
- Thomas, H.; Qi, C.R.; Deschaud, J.-E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. KPConv: Flexible and Deformable Convolution for Point Clouds. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6411–6420. [Google Scholar] [CrossRef]
- Wang, S.; Liu, X.; Zhu, X.; Zhang, P.; Zhang, Y.; Gao, F.; Zhu, E. Fast Parameter-Free Multi-View Subspace Clustering With Consensus Anchor Guidance. IEEE Trans. Image Process. 2022, 31, 556–568. [Google Scholar] [CrossRef]
- Li, Z.; Liu, W.; Xie, Z.; Kang, X.; Duan, P.; Li, S. FAA-Det: Feature Augmentation and Alignment for Anchor-Free Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–11. [Google Scholar] [CrossRef]
- Tong, L.; Fan, C.; Peng, Z.; Wei, C.; Sun, S.; Han, J. WTBD-YOLOv8: An Improved Method for Wind Turbine Generator Defect Detection. Sustainability 2024, 16, 4467. [Google Scholar] [CrossRef]
- Vanhove, C.; Koole, M.; Costa, P.F.; Schottelius, M.; Mannheim, J.; Kuntner, C.; Warnock, G.; McDougald, W.; Tavares, A.; Bernsen, M. Preclinical SPECT and PET: Joint EANM and ESMI Procedure Guideline for Implementing an Efficient Quality Control Programme. Eur. J. Nucl. Med. Mol. Imaging 2024, 51, 3822–3839. [Google Scholar] [CrossRef]
- Zhang, T.; Jia, X.; Cui, Y.; Zhang, H. GGD-YOLOv8n: A Lightweight Architecture for Edge-Computing-Optimized Allergenic Pollen Recognition with Cross-Scale Feature Fusion. Symmetry 2025, 17, 849. [Google Scholar] [CrossRef]
- Patil, A. Automated Real-Time Sugarcane Node Detection System Using YOLOv8 and Edge Computing. Int. J. Res. Appl. Sci. Eng. Technol. 2025, 13, 580–583. [Google Scholar] [CrossRef]
- Alomar, K.; Aysel, H.I.; Cai, X. Data Augmentation in Classification and Segmentation: A Survey and New Strategies. J. Imaging 2023, 9, 46. [Google Scholar] [CrossRef] [PubMed]
Module | Original Params/M | Optimized Params/M | Reduction/% |
---|---|---|---|
Backbone Stages 1 | 3.16 | 1.83 | 42.1 |
Detection Head | 0.38 | 0.31 | 20.1 |
Total | 3.54 | 2.14 | 39.7 |
Reduction/% | Total Params/M | mAP@0.5/% | Inference FPS |
---|---|---|---|
30 | 2.60 | 91.2 | 90.3 |
42.1 | 2.14 | 89.1 | 101.7 |
50 | 1.89 | 82.5 | 112.0 |
Augmentation Technique | Implementation | Impact on Performance |
---|---|---|
Mosaic Augmentation | Combines four images into one by random quadrant-wise placement. | Diversifies backgrounds and contexts; improves detection of visible objects by exposing varied scenes. |
Horizontal Flip | Mirrors the image horizontally with 50% probability during training. | Introduces orientation variety; helps the model learn viewpoint invariance and doubles the training data. |
Random Occlusion | Masks a random region of the image (e.g., with a black or mean-color patch). | Forces reliance on non-occluded features; increases robustness to occluded objects and acts as regularization to reduce overfitting. |
Brightness Variation | Randomly brightens or darkens the image by a random factor within a defined range. | Improves robustness to illumination changes; ensures stable performance under varying lighting or contrast conditions. |
Configuration | Parameters |
---|---|
Operating system | Ubuntu 20.04 |
CPU | Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz |
GPU | NVIDIA RTX 3060 |
Experimental environment version | Python 3.6, PyTorch 2.0, CUDA 10.1
Dataset | Number of Images | Diversity Index (Mean ± Std) | Inception Score (Mean ± Std) | FID |
---|---|---|---|---|
Original (Real) | 480 | 0.62 ± 0.04 | 1.85 ± 0.07 | - |
Augmented (GAN + Real) | 1440 (480 + 960) | 0.85 ± 0.03 | 2.40 ± 0.05 | 25.3 |
Training Set Size | mAP@0.5/% | Precision/% | Recall/% |
---|---|---|---|
480 (real only) | 71.5 | 79.2 | 75.4 |
960 (1× synthetic) | 82.6 | 85.4 | 84.2 |
1440 (3× synthetic, baseline) | 89.1 | 90.2 | 89.7 |
Model | Precision/% | Recall/% | mAP@0.5/% | mAP@0.5:0.95/% | Parameters/M |
---|---|---|---|---|---|
Base-YOLOv8n | 90.2 | 89.7 | 89.1 | 82.4 | 3.54 |
SAM-YOLOv8n | 93.6 | 92.8 | 91.9 | 84.3 | 2.14 |
Base-YOLOv8n (5-fold CV) | 89.8 ± 1.2 | 88.5 ± 1.4 | 88.3 ± 1.1 | 81.9 ± 0.8 | 3.54 |
SAM-YOLOv8n (5-fold CV) | 93.4 ± 0.9 | 91.9 ± 1.0 | 91.1 ± 0.8 | 84.0 ± 0.6 | 2.14 |
Base-YOLOv8n (Holdout) | 89.3 | 88.2 | 88.0 | 81.4 | 3.54 |
SAM-YOLOv8n (Holdout) | 93.1 | 91.3 | 90.7 | 83.2 | 2.14 |
Model | Parameters/M | mAP@0.5:0.95/% | FPS | Size/MB | Latency/ms |
---|---|---|---|---|---|
Base-YOLOv8n | 3.54 | 82.4 ± 0.4 | 79.2 ± 2.3 | 14.2 | 12.5 ± 0.2 |
YOLOv8n + SAM | 3.86 | 86.5 ± 0.3 | 88.1 ± 1.5 | 15.8 | 11.1 ± 0.2 |
YOLOv8n (Slim) | 2.14 | 81.7 ± 0.5 | 101.7 ± 3.0 | 8.6 | 10.2 ± 0.2 |
SAM-YOLOv8n | 2.63 | 84.3 ± 0.4 | 96.4 ± 2.2 | 9.3 | 9.1 ± 0.2 |
Model | Parameters/M | mAP@0.5:0.95/% | Latency/ms |
---|---|---|---|
YOLOv5n | ~1.9 | 79.8 ± 0.5 | 11.5 ± 0.3 |
MobileNet-SSD | ~4.5 | 76.5 ± 0.6 | 10.8 ± 0.4 |
EfficientDet-Lite0 | ~4.0 | 81.0 ± 0.4 | 13.0 ± 0.5 |
YOLOv7-tiny | ~6.0 | 82.1 ± 0.4 | 11.9 ± 0.3 |
YOLOv8n | ~3.5 | 82.4 ± 0.4 | 12.5 ± 0.2 |
SAM-YOLOv8n | ~2.6 | 84.3 ± 0.4 | 9.1 ± 0.2 |