Deep Learning Computer Vision-Based Automated Localization and Positioning of the ATHENA Parallel Surgical Robot
Abstract
1. Introduction
- Variability—manual estimation is prone to inter-operator and intra-operator inconsistencies [8];
- Lack of real-time feedback—mechanical guides cannot adapt to changes in anatomical configuration or trocar motion;
- Sensitivity to user experience—novice surgeons exhibit higher alignment times and greater deviations from the optimal remote center of motion (RCM) trajectory [9].
- Fiducial-based registration using optical or radiopaque markers provides high accuracy, but it can complicate sterility management and workflow in the operating room because fiducials must be physically introduced and maintained within the sterile field—either attached to instruments, mounted on sterile adapters, or placed near the patient. This introduces additional handling steps (placement, sterile fixation, verification of visibility, and calibration/registration) and may require single-use sterile mounts or repeated re-sterilization of marker holders. During surgery, fiducials can become occluded or contaminated by drapes, hands, instrument motion, or blood/fluid, which can force repositioning or re-registration and interrupt the workflow. For these reasons, markerless approaches are attractive when the goal is to minimize added hardware and procedural steps, while still enabling reliable localization [10].
- Electromagnetic tracking systems eliminate line-of-sight constraints but suffer from field distortions caused by surrounding metallic OR equipment, reducing reliability in laparoscopic environments [11].
- Optical tracking systems (stereo cameras or infrared reflectors) offer high precision, yet their integration requires additional hardware, careful calibration, and unobstructed fields-of-view—conditions not always achievable in crowded OR setups [12].
- Robot encoding-based registration, relying solely on internal kinematics, lacks external environmental awareness and cannot autonomously compensate for trocar displacement or manual instrument repositioning.
- Markerless vision-based registration using geometric cues (e.g., stereo/RGB-D point-cloud alignment or iterative closest point (ICP)-type fitting) avoids physical markers and can be accurate when surfaces are well observed; however, performance can degrade with sparse/low-texture geometry, specular/reflective surfaces, depth dropouts, and partial occlusions, and typically requires careful calibration and sufficient overlap between views [13].
- Learning-based markerless localization/pose estimation (e.g., convolutional networks for keypoint detection or 6-DoF pose regression) reduces reliance on handcrafted features and can generalize across viewpoints, but usually demands larger and more diverse datasets to avoid overfitting and may be sensitive to domain shift and OR artifacts (blood/fluid, smoke, motion blur, harsh shadows), requiring targeted augmentation and external validation [14].
- Non-contact measurement—avoids sterile field contamination and eliminates the need for fiducial attachments;
- Real-time adaptation—updates positioning dynamically as instruments move;
- Improved safety—minimizes excessive force at the trocar site;
- Optimized ergonomics—reduces surgeon effort by automating repetitive alignment actions;
- Enhanced reproducibility—reduces operator-dependent variability in instrument docking.
- A novel markerless, vision-based method for surgical robot localization using YOLO11 and RealSense 3D sensing.
- A complete AI-to-PLC workflow, enabling real-time coordinate extraction, communication, and autonomous closed-loop motion.
- A validated automatic docking framework achieving a submillimeter positioning accuracy (≤0.8 mm) and a significantly reduced alignment time (−42%).
- An integrated OR-oriented system architecture, designed for multi-patient operation, real-time responsiveness, and enhanced surgical safety.
2. Related Works
- For RGB images, PnP, triangulation, and geometric modeling are commonly applied;
- For point clouds, ICP or probabilistic filters are typically used;
- When multimodal data are available, fusion strategies become necessary.
3. Materials and Methods
3.1. ATHENA System Presentation
3.1.1. General Architecture of the Proposed System
- PM—the parallel module of the ATHENA robot;
- Instr.—the tip of the laparoscopic instrument detected by the camera;
- Trocar—the minimally invasive surgical access port, geometrically defined by the axis of its cannula.
3.1.2. The Structure of the ATHENA Surgical Parallel Robot
3.1.3. Design of the Proposed Distributed Software System
- The Python node comprises two scripts developed in Python 3.11.13 using the YOLO11 framework: a training script that generates a file containing the trained model, and a script that uses this model to perform classifications.
- The C# application node runs on the .NET 8 platform and is organized according to a Model–View–Presenter architecture [53]. It communicates with the Python scripts via the TCP/IP protocol.
- The data storage node contains files essential for the functioning of artificial intelligence algorithms and for interaction with the user.
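
The TCP/IP link between the Python scripts and the C# application needs some message framing, because TCP delivers a byte stream rather than discrete messages. The sketch below shows one common option, newline-delimited JSON. The message layout (`detections`, `label`, `xyz_mm`) and both function names are illustrative assumptions; the source does not specify the wire format actually used.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical detection record: {"label": "trocar", "xyz_mm": [x, y, z]}
Detection = Dict[str, object]


def encode_detections(detections: List[Detection]) -> bytes:
    """Serialize one frame's detections as a single newline-terminated JSON line."""
    payload = json.dumps({"detections": detections}, separators=(",", ":"))
    return (payload + "\n").encode("utf-8")


def decode_stream(buffer: bytes) -> Tuple[List[dict], bytes]:
    """Split a receive buffer into complete messages and the unread remainder.

    TCP may deliver half a message or several messages at once, so the
    receiver keeps the trailing partial line for the next read.
    """
    *lines, remainder = buffer.split(b"\n")
    messages = [json.loads(line) for line in lines if line]
    return messages, remainder


if __name__ == "__main__":
    frame = encode_detections([{"label": "trocar", "xyz_mm": [12.5, -3.0, 410.2]}])
    # Simulate one complete message followed by a partial one.
    messages, rest = decode_stream(frame + frame[:5])
    print(messages[0]["detections"][0]["label"], len(rest))
```

A line-based text format is easy to parse on the C# side with a stream reader; a length-prefixed binary format would be an equally valid choice.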
3.2. Implementation and Integration
3.2.1. Development of Software Applications Using .NET 8
- Pressing the “Connect\Disconnect” button (Figure 5, 1) establishes the connection between the C# application and the Python script using the TCP/IP protocol;
- The “Video Start/Stop” button (Figure 5, 2) starts or stops the video stream of the 3D stereoscopic camera;
- Pressing the “Start/Stop Detect” button (Figure 5, 5) activates or deactivates the object detection process with the 3D stereoscopic camera;
- Pressing the “Save Coords” button (Figure 5, 6) saves the coordinates of the detected objects to an Excel file;
- Pressing the “Automatic Positioning” button (Figure 5, 7) makes the surgical robot position itself automatically so that the surgical instrument can be inserted through the trocar;
- The “Object 1” and “Object 2” sliders (Figure 5, 8) let the user extend the ends of the line connecting the two detected objects (trocar and instr.);
- Pressing the “Exit” button (Figure 5, 11) closes the interface.
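
The “Object 1” and “Object 2” sliders extend the line that connects the two detected objects. Geometrically, this amounts to moving each endpoint outward along the line direction. The following is a minimal sketch of that operation; the function name, argument names, and millimetre units are assumptions for illustration, not the application's code.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def extend_segment(
    p1: Sequence[float],
    p2: Sequence[float],
    ext1: float = 0.0,
    ext2: float = 0.0,
) -> Tuple[Vec3, Vec3]:
    """Extend the segment p1-p2 by ext1 beyond p1 and by ext2 beyond p2.

    p1 and p2 are 3D points (for example trocar and instrument tip);
    ext1 and ext2 are lengths in the same unit as the coordinates.
    """
    length = math.dist(p1, p2)
    if length == 0.0:
        raise ValueError("the two points coincide, so the line direction is undefined")
    # Unit vector pointing from p1 towards p2.
    u = [(b - a) / length for a, b in zip(p1, p2)]
    new_p1 = tuple(a - ext1 * c for a, c in zip(p1, u))
    new_p2 = tuple(b + ext2 * c for b, c in zip(p2, u))
    return new_p1, new_p2


if __name__ == "__main__":
    start, end = extend_segment((0.0, 0.0, 0.0), (0.0, 0.0, 10.0), ext1=5.0, ext2=2.0)
    print(start, end)
```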
- Modbus Connect—pressing this button connects the user interface to the PLC;
- Robot Q Values—displays the positions of the robot’s active joints relative to its coordinate system;
- Remote center of motion—displays the position of the remote center of motion (RCM) of the spherical mechanism relative to its coordinate system;
- Tool Center Point—displays the position of the end effector in space relative to the robot’s coordinate system;
- Speed—slider used to adjust the speed at which the end effector moves;
- Acceleration—slider used to adjust the maximum permissible acceleration;
- Robot Status—indicates the status of the robot using green and red colors (green indicates normal operation, while red indicates an error or malfunction);
- Power on—pressing this button powers the actuators;
- Homing—button that initiates the referencing procedure;
- Reset—this button resets the active error;
- Emergency Stop—button that triggers the robot shutdown procedure;
- Control—selects whether the robot or the active tool is controlled; the “Haptic”/“SpaceMouse” option selects the peripheral device with which the user controls the robot’s positioning.
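
The Speed and Acceleration settings bound the commanded motion. For a trapezoidal velocity profile of the kind the robot is reported to follow, the effect of these two limits on the duration of a move can be sketched as below. This is an illustrative single-axis calculation; the function name, the units, and the example values are assumptions and do not describe the PLC implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class TrapezoidTiming:
    t_accel: float   # duration of the acceleration ramp (equal to the deceleration ramp)
    t_cruise: float  # duration at constant speed (0 for a triangular profile)
    v_peak: float    # highest speed actually reached

    @property
    def total(self) -> float:
        return 2.0 * self.t_accel + self.t_cruise


def trapezoid_timing(distance: float, v_max: float, a_max: float) -> TrapezoidTiming:
    """Phase durations of a symmetric trapezoidal velocity profile."""
    if distance < 0 or v_max <= 0 or a_max <= 0:
        raise ValueError("distance must be >= 0 and both limits must be > 0")
    # Distance used by accelerating to v_max and braking back to rest.
    ramp_distance = v_max ** 2 / a_max
    if distance >= ramp_distance:
        return TrapezoidTiming(v_max / a_max, (distance - ramp_distance) / v_max, v_max)
    # Short move: v_max is never reached, so the profile is triangular.
    v_peak = math.sqrt(distance * a_max)
    return TrapezoidTiming(v_peak / a_max, 0.0, v_peak)


if __name__ == "__main__":
    # Hypothetical values: 100 mm move, 20 mm/s speed limit, 10 mm/s^2 acceleration limit.
    timing = trapezoid_timing(100.0, 20.0, 10.0)
    print(timing, timing.total)
```

With these example values, the ramps take 2 s each and the cruise phase 3 s. Lowering the acceleration limit lengthens the ramps, and for short moves the speed limit has no effect because it is never reached.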
3.2.2. Development of the Learning Model
3.2.3. System Integration in the ATHENA Robot
3.2.4. System Data Flow and Experimental Setup
- Data Flow Architecture
- Three-dimensional Camera Acquisition: The Intel RealSense D405 camera captures synchronized RGB and depth streams at 60 fps with a spatial resolution of 1280 × 720 pixels.
- AI Inference Layer: The RGB + depth frames are transmitted to a Python 3.11 environment, where the trained YOLO11m model performs real-time object detection of the trocar, instrument, and parallel mechanism (PM). The resulting 3D coordinates of detected bounding boxes are computed by combining image pixel positions with depth information from the RealSense API.
- Communication Layer (TCP/IP): The processed data (3D coordinates and class labels) are sent through a bidirectional TCP/IP socket to the C#/.NET 8 graphical interface, which provides asynchronous real-time data transfer between the AI module and the robot control interface. The average end-to-end latency of the perception-to-control loop was 67 ms. This value was determined by decomposing the loop into camera acquisition, neural inference, post-processing, network communication, and PLC execution. Over N = 300 consecutive cycles, the mean delay from RGB-D frame reception on the PC to PLC confirmation that the command was applied was 67.0 ± 3.1 ms. It comprised 16.7 ± 1.9 ms for camera acquisition (60 fps operation), 14.7 ± 0.6 ms for YOLO11m inference, 5.4 ± 0.4 ms for 3D coordinate extraction and message formatting, 7.8 ± 1.1 ms for the Modbus/TCP write–ack transaction, and 22.4 ± 2.0 ms for PLC execution until the “applied” flag/cycle counter update was observed; the stage means sum to 67.0 ms.
- Control Interface (C# Application): The interface visualizes detections, logs coordinates, and converts them into robot commands. The coordinate transformation algorithm maps camera-frame coordinates into the robot reference frame.
- PLC Motion Control: The control commands are transmitted through the Modbus TCP protocol to the B&R PLC running a real-time operating system (RTOS). The PLC manages stepper motor drives and enforces safety constraints (velocity limits, emergency stop).
- Robot Execution: The ATHENA parallel robot performs smooth motion following a trapezoidal velocity and acceleration profile, aligning its flange with the detected surgical instrument.
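
Two steps of this data flow are purely geometric: combining a pixel position with its depth value to obtain a 3D point in the camera frame, and mapping that point into the robot reference frame. The sketch below illustrates both under simplifying assumptions: an ideal pinhole camera without lens distortion, and a known rigid transform (rotation `R`, translation `t`) from camera to robot. The intrinsic parameters and the transform values are placeholders, not calibration results from the source. The RealSense SDK offers its own deprojection routine (`rs2_deproject_pixel_to_point`), which also accounts for the distortion model.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def deproject(u: float, v: float, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Pinhole back-projection of pixel (u, v) with depth (metres) to the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def camera_to_robot(p_cam: Sequence[float],
                    rotation: Sequence[Sequence[float]],
                    translation: Sequence[float]) -> Vec3:
    """Apply the rigid transform p_robot = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


if __name__ == "__main__":
    # Placeholder intrinsics for a 1280 x 720 image (assumed, not calibrated values).
    fx = fy = 600.0
    cx, cy = 640.0, 360.0
    # Centre of a detected bounding box and the depth sampled there.
    p_cam = deproject(940.0, 360.0, 0.5, fx, fy, cx, cy)
    # Placeholder extrinsics: 90 degree rotation about z plus an offset along x.
    R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    t = [0.1, 0.0, 0.0]
    print(p_cam, camera_to_robot(p_cam, R, t))
```

In practice, `R` and `t` would come from a camera-to-robot calibration, and the depth value would typically be a robust statistic (for example a median) over the bounding box rather than a single pixel.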
- Experimental Setup
- Number of 3D Samples: A total of 1200 3D frames were recorded, including 1000 frames used for training and 200 frames for validation and testing. Each frame contained annotations for three object classes: trocar, instrument, and PM.
- Lighting Conditions: Experiments were conducted under three illumination levels to simulate realistic operating room variability:
  - Bright (800–1000 lux): standard laboratory lighting;
  - Moderate (400–600 lux): typical endoscopic lighting conditions;
  - Low (150–250 lux): simulated dim operating room scenario.
- Camera–Object Distance and Angles: The camera was mounted at 0.6–0.8 m from the robot workspace at adjustable tilt angles of 20°, 30°, and 45° relative to the instrument axis.
- Hardware Configuration:
  - CPU: AMD Ryzen 7 9700X (16 cores, 5.7 GHz).
  - GPU: NVIDIA RTX 5080 (16 GB VRAM).
  - RAM: 96 GB DDR5 6400 MHz.
  - Operating System: Windows 11 Pro 64-bit.
  - Storage: 2 TB NVMe SSD.
  - Frame Rate: 60 frames per second.
  - Average Inference Time: 14.7 ms per frame (YOLO11m on GPU).
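
The stage timings reported for the perception-to-control loop can be cross-checked against the 60 fps frame rate and the 14.7 ms inference time listed above. The short script below sums the stage means and, as an additional plausibility check, combines the stage standard deviations in quadrature. The quadrature step assumes statistically independent stages, which is an assumption of this sketch and is not stated in the source.

```python
import math

# Stage name -> (mean, standard deviation) in milliseconds, as reported.
STAGES_MS = {
    "camera_acquisition": (16.7, 1.9),
    "yolo11m_inference": (14.7, 0.6),
    "coordinate_extraction": (5.4, 0.4),
    "modbus_tcp_write_ack": (7.8, 1.1),
    "plc_execution": (22.4, 2.0),
}

# Sum of stage means: the end-to-end mean latency.
total_mean_ms = sum(mean for mean, _ in STAGES_MS.values())

# Root-sum-of-squares of the stage SDs (valid only if stages are independent).
total_sd_ms = math.sqrt(sum(sd ** 2 for _, sd in STAGES_MS.values()))

# One frame period at 60 fps, for comparison with the acquisition stage.
frame_period_ms = 1000.0 / 60.0

if __name__ == "__main__":
    print(f"sum of stage means: {total_mean_ms:.1f} ms")
    print(f"combined SD (independence assumed): {total_sd_ms:.2f} ms")
    print(f"frame period at 60 fps: {frame_period_ms:.1f} ms")
```

The stage means add up to the reported 67.0 ms, the combined deviation of about 3.06 ms is consistent with the reported ±3.1 ms, and the 16.7 ms acquisition stage equals one frame period. Because inference takes less than one frame period, it does not limit the frame rate.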
4. Experimental Results
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Tan, A.; Ashrafian, H.; Scott, A.J.; Mason, S.E.; Harling, L.; Athanasiou, T.; Darzi, A. Robotic surgery: Disruptive innovation or unfulfilled promise? A systematic review and meta-analysis of the first 30 years. Surg. Endosc. 2016, 30, 4330–4352.
- George, E.I.; Brand, T.C.; LaPorta, A.; Marescaux, J.; Satava, R.M. Origins of Robotic Surgery: From Skepticism to Standard of Care. JSLS J. Soc. Laparoendosc. Surg. 2018, 22, e2018-00039.
- Morrell, A.L.G.; Morrell, A.C.; Mendes, J.M.F.; Tustumi, F.; Silva, L.G.O.; Morrell, A. The history of robotic surgery and its evolution: When illusion becomes reality. Rev. Do Colégio Bras. De Cir. 2021, 48, e20202798.
- Reddy, K.; Gharde, P.; Tayade, H.; Patil, M.; Reddy, L.S.; Surya, D. Advancements in Robotic Surgery: A Comprehensive Overview of Current Utilizations and Upcoming Frontiers. Cureus 2023, 15, e50415.
- Ashrafian, H.; Clancy, O.; Grover, V.; Darzi, A. The evolution of robotic surgery: Surgical and anaesthetic aspects. BJA Br. J. Anaesth. 2017, 119, 72–84.
- Iftikhar, M.; Saqib, M.; Zareen, M.; Mumtaz, H. Artificial intelligence: Revolutionizing robotic surgery: Review. Ann. Med. Surg. 2024, 86, 5401–5409.
- Kamtam, D.N.; Shrager, J.B.; Malla, S.D.; Lin, N.; Cardona, J.J.; Kim, J.J.; Hu, C. Deep learning approaches to surgical video segmentation and object detection: A scoping review. Comput. Biol. Med. 2025, 194, 110482.
- Aghazadeh, F.; Zheng, B.; Tavakoli, M.; Rouhani, H. Motion Smoothness-Based Assessment of Surgical Expertise: The Importance of Selecting Proper Metrics. Sensors 2023, 23, 3146.
- Sánchez-Margallo, J.A.; Sánchez-Margallo, F.M.; Pagador Carrasco, J.B.; Oropesa García, I.; Gómez Aguilera, E.J.; Moreno del Pozo, J. Usefulness of an Optical Tracking System in Laparoscopic Surgery for Motor Skills Assessment. Cirugía Española 2014, 92, 421–428.
- Taleb, A.; Guigou, C.; Leclerc, S.; Lalande, A.; Bozorg-Grayeli, A. Image-to-Patient Registration in Computer-Assisted Surgery of Head and Neck: State-of-the-Art, Perspectives, and Challenges. J. Clin. Med. 2023, 12, 5398.
- Lugez, E.; Sadjadi, H.; Pichora, D.R.; Ellis, R.E.; Akl, S.G.; Fichtinger, G. Electromagnetic Tracking in Surgical and Interventional Environments: Usability Study. Int. J. Comput. Assist. Radiol. Surg. 2015, 10, 253–262.
- Kral, F.; Puschban, E.J.; Riechelmann, H.; Freysinger, W. Comparison of Optical and Electromagnetic Tracking for Navigated Lateral Skull Base Surgery. Int. J. Med. Robot. 2013, 9, 247–252.
- Wang, J.; Olson, E. AprilTag 2: Efficient and robust fiducial detection. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; pp. 4193–4198.
- Hein, J.; Seibold, M.; Bogo, F.; Farshad, M.; Pollefeys, M.; Fürnstahl, P.; Navab, N. Towards markerless surgical tool and hand pose estimation. Int. J. Comput. Assist. Radiol. Surg. 2021, 16, 799–808.
- Pan, X.; Bi, M.; Wang, H.; Ma, C.; He, X. DBH-YOLO: A Surgical Instrument Detection Method Based on Feature Separation in Laparoscopic Surgery. Int. J. Comput. Assist. Radiol. Surg. 2024, 19, 2215–2225.
- Peng, J.; Chen, Q.; Kang, L.; Jie, H.; Han, Y. Autonomous Recognition of Multiple Surgical Instrument Tips Based on Arrow OBB-YOLO Network. IEEE Trans. Instrum. Meas. 2022, 71, 1–13.
- Picozzi, P.; Nocco, U.; Labate, C.; Gambini, I.; Puleo, G.; Silvi, F.; Pezzillo, A.; Mantione, R.; Cimolin, V. Advances in Robotic Surgery: A Review of New Surgical Platforms. Electronics 2024, 13, 4675.
- Williamson, T.; Song, E.E. Robotic Surgery Techniques to Improve Traditional Laparoscopy. JSLS J. Soc. Laparoendosc. Surg. 2022, 26, e2022-00002.
- Tucan, P.; Vaida, C.; Horvath, D.; Caprariu, A.; Burz, A.; Gherman, B.; Iakab, S.; Pisla, D. Design and Experimental Setup of a Robotic Medical Instrument for Brachytherapy in Non-Resectable Liver Tumors. Cancers 2022, 14, 5841.
- Vaida, C.; Plitea, N.; Carbone, G.; Birlescu, I.; Ulinici, I.; Pisla, A.; Pisla, D. Innovative development of a spherical parallel robot for upper limb rehabilitation. Int. J. Mech. Robot. Syst. 2018, 4, 256.
- Zhang, Z.; Meng, Q.; Cui, Z.; Yao, M.; Shao, Z.; Tao, B. Machine Learning Applications in Parallel Robots: A Brief Review. Machines 2025, 13, 565.
- Paracchini, S.; Taliento, C.; Pellecchia, G.; Tius, V.; Tavares, M.; Borghi, C.; Buda, A.A.; Bartoli, A.; Bourdel, N.; Vizzielli, G. Artificial Intelligence in the Operating Room: A Systematic Review of AI Models for Surgical Phase, Instruments and Anatomical Structure Identification. Acta Obstet. Gynecol. Scand. 2025, 104, 2054–2064.
- Ward, T.M.; Mascagni, P.; Ban, Y.; Rosman, G.; Padoy, N.; Meireles, O.; Hashimoto, D.A. Computer vision in surgery. Surgery 2021, 169, 1253–1256.
- Gumbs, A.A.; Grasso, V.; Bourdel, N.; Croner, R.; Spolverato, G.; Frigerio, I.; Illanes, A.; Hilal, M.A.; Park, A.; Elyan, E. The Advances in Computer Vision That Are Enabling More Autonomous Actions in Surgery: A Systematic Review of the Literature. Sensors 2022, 22, 4918.
- Luongo, F.; Hakim, R.; Nguyen, J.H.; Anandkumar, A.; Hung, A.J. Deep learning-based computer vision to recognize and classify suturing gestures in robot-assisted surgery. Surgery 2021, 169, 1240–1244.
- Zang, C.; Turkcan, M.K.; Narasimhan, S.; Cao, Y.; Yarali, K.; Xiang, Z.; Szot, S.; Ahmad, F.; Choksi, S.; Bitner, D.P.; et al. Surgical Phase Recognition in Inguinal Hernia Repair-AI-Based Confirmatory Baseline and Exploration of Competitive Models. Bioengineering 2023, 10, 654.
- Jiang, K.; Pan, S.W.; Yang, L.; Yu, J.; Lin, Y.; Wang, H.Q. Surgical Instrument Recognition Based on Improved YOLOv5. Appl. Sci. 2023, 13, 11709.
- Jearanai, S.; Wangkulangkul, P.; Sae-Lim, W.; Cheewatanakornkul, S. Development of a deep learning model for safe direct optical trocar insertion in minimally invasive surgery: An innovative method to prevent trocar injuries. Surg. Endosc. 2023, 37, 7295–7304.
- Rus, G.; Andras, I.; Vaida, C.; Crisan, N.; Gherman, B.; Radu, C.; Tucan, P.; Iakab, S.; Hajjar, N.A.; Pisla, D. Artificial Intelligence-Based Hazard Detection in Robotic-Assisted Single-Incision Oncologic Surgery. Cancers 2023, 15, 3387.
- Azizian, M.; Khoshnam, M.; Najmaei, N.; Patel, R.V. Visual Servoing in Medical Robotics: A Survey Part I: Endoscopic Direct Vision Imaging—Techniques and Applications. Int. J. Med. Robot. Comput. Assist. Surg. 2014, 10, 263–274.
- Pandya, A.; Reisner, L.A.; King, B.; Lucas, N.; Composto, A.; Klein, M.; Ellis, R.D. A Review of Camera Viewpoint Automation in Robotic and Laparoscopic Surgery. Robotics 2014, 3, 310–329.
- Maier-Hein, L.; Vedula, S.S.; Speidel, S.; Navab, N.; Kikinis, R.; Park, A.; Eisenmann, M.; Feussner, H.; Forestier, G.; Giannarou, S.; et al. Surgical Data Science for Next-Generation Interventions. Nat. Biomed. Eng. 2017, 1, 691–696.
- Ahmed, F.A.; Yousef, M.; Ahmed, M.A.; Ali, H.O.; Mahboob, A.; Ali, H.; Shah, Z.; Aboumarzouk, O.; Al Ansari, A.; Balakrishnan, S. Deep Learning for Surgical Instrument Recognition and Segmentation in Robotic-Assisted Surgeries: A Systematic Review. Artif. Intell. Rev. 2025, 58, 1.
- Allan, M.; Ourselin, S.; Hawkes, D.J.; Kelly, J.D.; Stoyanov, D. 3-D Pose Estimation of Articulated Instruments in Robotic Minimally Invasive Surgery. IEEE Trans. Med. Imaging 2018, 37, 1204–1213.
- Doignon, C.; Nageotte, F.; Maurin, B.; Krupa, A. Pose Estimation and Feature Tracking for Robot Assisted Surgery with Medical Imaging. In Unifying Perspectives in Computational and Robot Vision; Kragic, D., Kyrki, V., Eds.; Lecture Notes in Electrical Engineering; Springer: Boston, MA, USA, 2008; Volume 8, pp. 79–101.
- Hasan, M.K.; Calvet, L.; Rabbani, N.; Bartoli, A. Detection, Segmentation, and 3D Pose Estimation of Surgical Tools Using Convolutional Neural Networks and Algebraic Geometry. Med. Image Anal. 2021, 70, 101994.
- Habert, S.; Eck, U.; Fallavollita, P.; Parent, S.; Navab, N.; Cheriet, F. Application of an RGBD Augmented C-Arm for Minimally Invasive Scoliosis Surgery Assistance. Healthc. Technol. Lett. 2017, 4, 179–183.
- Simpson, A.L.; Ma, B.; Vasarhelyi, E.M.; Borschneck, D.P.; Ellis, R.E.; Stewart, A.J. Computation and Visualization of Uncertainty in Surgical Navigation. Int. J. Med. Robot. Comput. Assist. Surg. 2014, 10, 332–343.
- Zhao, Z.; Cai, T.; Chang, F.; Cheng, X. Real-Time Surgical Instrument Detection in Robot-Assisted Surgery Using a Convolutional Neural Network Cascade. Healthc. Technol. Lett. 2019, 6, 275–279.
- ISO 14971:2019; Medical Devices—Application of Risk Management to Medical Devices. ISO: Geneva, Switzerland, 2019.
- ISO 13485:2016; Medical Devices—Quality Management Systems—Requirements for Regulatory Purposes. ISO: Geneva, Switzerland, 2016.
- IEC 60601-1:2005+AMD1:2012+AMD2:2020 CSV; Medical Electrical Equipment—Part 1: General Requirements for Basic Safety and Essential Performance. IEC: Geneva, Switzerland, 2020.
- IEC 60601-1-2:2014+AMD1:2020 CSV; Medical Electrical Equipment—Part 1–2: General Requirements for Basic Safety and Essential Performance—Collateral Standard: Electromagnetic Disturbances—Requirements and Tests. IEC: Geneva, Switzerland, 2020.
- IEC 80601-2-77:2019+AMD1:2023 CSV; Medical Electrical Equipment—Part 2–77: Particular Requirements for the Basic Safety and Essential Performance of Robotically Assisted Surgical Equipment. IEC: Geneva, Switzerland, 2023.
- IEC 62304:2006+AMD1:2015 CSV; Medical Device Software—Software Life Cycle Processes. IEC: Geneva, Switzerland, 2015.
- IEC 62366-1:2015+AMD1:2020 CSV; Medical Devices—Part 1: Application of Usability Engineering to Medical Devices. IEC: Geneva, Switzerland, 2020.
- Intel RealSense Camera D405. Available online: www.realsenseai.com/products/stereo-depth-camera-d405/ (accessed on 15 May 2025).
- Vaida, C.; Gherman, B.; Tucan, P.; Birlescu, I.; Chablat, D.; Pisla, D. Parallel Robotic System for Pancreatic Minimally Invasive Surgery. Patent A/00116/20, March 2024.
- Vaida, C.; Birlescu, I.; Gherman, B.; Condurache, D.; Chablat, D.; Pisla, D. An analysis of higher-order kinematics formalisms for an innovative surgical parallel robot. Mech. Mach. Theory 2025, 209, 105986.
- Tucan, P.; Ciocan, A.; Gherman, B.; Radu, C.; Vaida, C.; Hajjar, N.A.; Chablat, D.; Pisla, D. Design Optimization of a Parallel Robot for Laparoscopic Pancreatic Surgery Using a Genetic Algorithm. Appl. Sci. 2025, 15, 4383.
- Iordan, A.E.; Covaciu, F. Improving Design of a Triangle Geometry Computer Application using a Creational Pattern. Acta Tech. Napoc. Appl. Math. Mech. Eng. 2020, 63, 73–78.
- Sukarsa, I.; Piarsa, I.; Putra, I. Application of MVP Architecture in Developing Android-Based Seminar Ticket Booking Applications. J. RESTI 2020, 4, 513–520.
- Tazin, A.; Kokar, M. UML Class Diagram Classification Using Category Theory. J. Softw. Eng. Appl. 2025, 18, 217–248.
- Ramos, L.; Sappa, A. A comprehensive analysis of YOLO architectures for tomato leaf disease identification. Sci. Rep. 2025, 15, 26890.
- Chen, F.; Zhang, Y.; Fu, L.; Hua, R.; Zhang, Q.; Bi, S. A Comparative Review of the Next-Generation YOLO Models: YOLOv10 and YOLO11. J. Comput. Sci. Artif. Intell. 2025, 3, 1–6.
- Ultralytics YOLO11: Real-Time Object Detection Model. Available online: https://docs.ultralytics.com/models/yolo11/ (accessed on 15 May 2025).
- Khanam, R.; Hussain, M. YOLOv11: An Overview of the Key Architectural Enhancements. arXiv 2024, arXiv:2410.17725.
- He, L.H.; Zhou, Y.Z.; Liu, L.; Zhang, Y.Q.; Ma, J.H. Research on the directional bounding box algorithm of YOLO11 in tailings pond identification. Measurement 2025, 253, 117674.
- Lee, Y.-S.; Patil, M.; Kim, J.; Seo, Y.B.; Ahn, D.; Kim, G.-D. Hyperparameter Optimization for Tomato Leaf Disease Recognition Based on YOLOv11m. Plants 2025, 14, 653.
- Teng, H.; Wang, Y.; Li, W.; Chen, T.; Liu, Q. Advancing Rice Disease Detection in Farmland with an Enhanced YOLOv11 Algorithm. Sensors 2025, 25, 3056.
- Muscalagiu, I.; Popa, H.E.; Negru, V. Improving the Performances of Asynchronous Search Algorithms in Scale-Free Networks using the Nogood Processor Technique. Comput. Inform. 2015, 34, 254–274.
- Panoiu, M.; Ivascanu, P.; Panoiu, C. Analysis of Operating Regimes and THD Forecasting in Steelmaking Plant Power Systems using Advanced Neural Architectures. Mathematics 2025, 13, 3692.
- Pisla, D.; Gherman, B.; Tucan, P.; Pisla, A.; Al Hajjar, N.; Cailean, A.; Vaida, C. On the accuracy assessment of a parallel robot for the minimally invasive cancer treatment. J. Eng. Sci. Innov. 2024, 9, 253–264.
- Eggert, D.; Lorusso, A.; Fisher, R. Estimating 3-D rigid body transformations: A comparison of four major algorithms. Mach. Vis. Appl. 1997, 9, 272–290.

| Hyperparameter | Search Domain | Used Value |
|---|---|---|
| epochs | {500, 600, 700, 800, 900, 1000} | 900 |
| batch | {8, 16, 32} | 8 |
| optimizer | {“Adam”, “AdamW”} | “Adam” |
| initial learning rate | {0.001, 0.0015, 0.002, 0.0025, 0.003} | 0.0015 |
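
The search domain in the table above defines a finite grid of candidate configurations. The sketch below enumerates it to show its size and to confirm that the used values lie inside the domain. The source does not state whether the search was exhaustive, so the full Cartesian product is an assumption of this example. The key `lr0` follows the Ultralytics name for the initial learning rate; the remaining names are taken from the table.

```python
from itertools import product
from typing import Dict, List

SEARCH_DOMAIN: Dict[str, list] = {
    "epochs": [500, 600, 700, 800, 900, 1000],
    "batch": [8, 16, 32],
    "optimizer": ["Adam", "AdamW"],
    "lr0": [0.001, 0.0015, 0.002, 0.0025, 0.003],  # initial learning rate
}

# Values reported as used for the final model.
USED = {"epochs": 900, "batch": 8, "optimizer": "Adam", "lr0": 0.0015}


def enumerate_grid(domain: Dict[str, list]) -> List[dict]:
    """Return every combination of the listed hyperparameter values."""
    keys = list(domain)
    return [dict(zip(keys, values)) for values in product(*(domain[k] for k in keys))]


if __name__ == "__main__":
    grid = enumerate_grid(SEARCH_DOMAIN)
    print(len(grid))     # 6 * 3 * 2 * 5 = 180 candidate configurations
    print(USED in grid)  # True
```

With training runs of 500 to 1000 epochs each, evaluating all 180 combinations would be costly, which is why random or staged searches are often preferred for grids of this size.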

| Model | Metric | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 | Average | SD |
|---|---|---|---|---|---|---|---|---|
| YOLO11m | mAP | 0.99500 | 0.99421 | 0.99500 | 0.99500 | 0.99421 | 0.99468 | 0.00043 |
| YOLO11m | Precision | 0.99194 | 0.98288 | 0.97851 | 0.99542 | 0.99048 | 0.98785 | 0.00694 |
| YOLO11m | Recall | 0.99445 | 0.98333 | 0.97220 | 0.99692 | 0.97760 | 0.98490 | 0.01064 |
| YOLO11m | F1-score | 0.99319 | 0.98310 | 0.97534 | 0.99617 | 0.98400 | 0.98636 | 0.00837 |
| YOLO10m | mAP | 0.94313 | 0.92652 | 0.94823 | 0.92714 | 0.91788 | 0.93258 | 0.01264 |
| YOLO10m | Precision | 0.88011 | 0.90442 | 0.86409 | 0.87306 | 0.81917 | 0.86817 | 0.03122 |
| YOLO10m | Recall | 0.93333 | 0.87833 | 0.91862 | 0.90113 | 0.93661 | 0.91360 | 0.02420 |
| YOLO10m | F1-score | 0.90594 | 0.89118 | 0.89052 | 0.88687 | 0.87396 | 0.88970 | 0.01143 |
| YOLO9m | mAP | 0.84884 | 0.85594 | 0.81359 | 0.82295 | 0.80324 | 0.82891 | 0.02268 |
| YOLO9m | Precision | 0.78451 | 0.76561 | 0.74764 | 0.83242 | 0.80768 | 0.78757 | 0.03355 |
| YOLO9m | Recall | 0.76988 | 0.84440 | 0.85007 | 0.80436 | 0.86667 | 0.82708 | 0.03932 |
| YOLO9m | F1-score | 0.77713 | 0.80308 | 0.79557 | 0.81815 | 0.83614 | 0.80601 | 0.02240 |
| YOLO8m | mAP | 0.82057 | 0.76631 | 0.71485 | 0.76554 | 0.75876 | 0.76521 | 0.03756 |
| YOLO8m | Precision | 0.75652 | 0.69234 | 0.77212 | 0.76396 | 0.77687 | 0.75236 | 0.03444 |
| YOLO8m | Recall | 0.78333 | 0.71561 | 0.75759 | 0.78333 | 0.81667 | 0.77131 | 0.03754 |
| YOLO8m | F1-score | 0.76969 | 0.70378 | 0.76479 | 0.77352 | 0.79627 | 0.76161 | 0.03451 |
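
The derived columns of the results table can be reproduced from its raw entries. The sketch below recomputes the F1-score as the harmonic mean of precision and recall, and the Average and SD columns as the mean and sample standard deviation (n − 1 denominator) over the five iterations. The values are copied from the YOLO11m rows; the function and variable names are illustrative.

```python
from statistics import mean, stdev


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# YOLO11m, iteration 1 (from the table).
P_ITER1, R_ITER1 = 0.99194, 0.99445

# YOLO11m mAP over the five iterations (from the table).
MAP_YOLO11M = [0.99500, 0.99421, 0.99500, 0.99500, 0.99421]

if __name__ == "__main__":
    print(f"F1 (iteration 1): {f1_score(P_ITER1, R_ITER1):.5f}")
    print(f"mean mAP: {mean(MAP_YOLO11M):.5f}")
    # statistics.stdev is the sample standard deviation.
    print(f"SD of mAP: {stdev(MAP_YOLO11M):.5f}")
```

These computations agree with the tabulated 0.99319, 0.99468, and 0.00043, which indicates that the SD column reports the sample rather than the population standard deviation.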
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Covaciu, F.; Gherman, B.; Al Hajjar, N.; Zima, I.; Popa, C.; Pusca, A.; Ciocan, A.; Vaida, C.; Iordan, A.-E.; Tucan, P.; Chablat, D.; Pisla, D. Deep Learning Computer Vision-Based Automated Localization and Positioning of the ATHENA Parallel Surgical Robot. Electronics 2026, 15, 474. https://doi.org/10.3390/electronics15020474

