1. Introduction
In recent years, data collection and analysis have become increasingly central to strategic decision-making across all sectors [1,2], and data-driven decision-making has gained particular traction in healthcare [3,4,5]. As a result, data-driven healthcare is emerging as a crucial approach to improve patient care and optimize resource allocation, relying on data and evidence to guide clinical and organizational choices [6,7].
To exploit the full potential of health data, the data must be properly structured and integrated [8,9,10]. Fragmented or poorly organized information risks being underutilized, thus limiting its strategic value. Today, clinical data are no longer generated only by medical records or manually populated administrative systems; an increasing amount comes directly from medical devices and equipment [11,12]. However, to be useful, these data need to be carefully linked to patient information and hospital information systems.
Several studies have looked into combining technical and clinical data in the operating room, mainly to recognize workflows and improve systems. For example, Sharghi et al. [13] suggested a method to identify surgical activities during robotic-assisted procedures using data from time-of-flight sensors and robotic system logs. In 2022, Jamal and Mohareri [14] developed a self-supervised method to merge audio, video, and robotic data to study OR processes.
Despite these advancements, the literature lacks specific examples of matching robotic device logs with hospital operating room management systems. This is especially true in real clinical settings where there are no common unique identifiers between the systems. This gap is important because data from surgical robots, like console start times and durations, often do not connect with administrative OR records. This disconnect limits their usefulness for quality monitoring and improving operations.
This study presents a project carried out by the Clinical Engineering Department of the ASST Grande Ospedale Metropolitano Niguarda, aimed at integrating data from robotic surgical systems (Da Vinci®) with data from the operating room management system. In these two datasets, there is no unique identifier linking them together. For this reason, the main objective was to automatically identify robotic surgical procedures in the operating room (OR) management system that, until recently, could only be tracked through manually entered fields, such as the designation of the type of procedure (e.g., endoscopic, robotic, …) or the use of the code 00.39 to indicate robotic assistance. As this information was entered manually by the operating room staff, it was subject to a significant margin of error and inconsistency.
The proposed integration approach uses objective data extracted from the robotic systems, such as procedure dates, console start times, and total duration of console use, to match and accurately identify robotic procedures among all procedures recorded in the hospital’s OR management system. This allows not only a more reliable identification, but also an enrichment of clinical and operational data through the fusion of technical parameters and administrative records.
In addition, robotic data, being automatically generated by the system itself, provide objective metrics—unaffected by human error—such as the precise duration of console use. In contrast, durations recorded in the hospital system may be less accurate due to manual input. This integration is therefore crucial to improve data quality and enable more accurate monitoring and evaluation of robotic surgical activity.
This project shows how data-driven healthcare can lead to real improvements. By combining robotic system logs with operating room management data, our algorithm automatically and accurately identifies robot-assisted procedures. For example, it helps hospital managers track robot usage and procedure lengths without depending on unreliable manual entries. This better data accuracy helps with scheduling, cuts down on operating room downtime, and offers dependable metrics for assessing surgical performance. Ultimately, this contributes to better resource use and higher quality patient care.
Before this project, verifying the correct identification of the operations performed with the robot required a manual comparison between the data from the equipment and the information entered in the operating theatre management system. This process depended on human intervention, making it time-consuming and prone to errors. An automatic system that identifies the operations performed with the surgical robots and integrates the data from the two databases therefore considerably reduces the working time and makes the entire process more efficient.
Ultimately, this project aims to facilitate the comprehensive integration of technical and clinical data to improve the management and evaluation of robotic surgery. Given the higher costs typically associated with robotic procedures compared to traditional surgical methods, access to detailed and accurate data can support more informed decisions, optimize resource utilization, and potentially contribute to cost containment through increased operational efficiency.
This project introduces a new method that uses time-based matching algorithms to combine device-level robotic data with hospital administrative records, even when there are no shared identifiers. It provides a practical and repeatable solution tested in a real hospital setting. This has direct benefits for improving data quality, efficiency, and management of robotic surgical activities. By merging technical and clinical records, it also allows for new ways to compare performance, analyze costs, and oversee clinical operations, addressing a significant gap between engineering data and hospital information systems.
2. Materials and Methods
To address the problem described above, an algorithm was developed to automatically link robotic procedures recorded by Da Vinci® surgical systems (Intuitive Surgical, Sunnyvale, CA, USA) with the corresponding entries in the hospital’s operating room management system, Ormaweb [15] (Dedalus, Firenze, Italy). The aim was to achieve an accurate and reproducible match, enabling the integration of clinical and technical data between the various systems and ultimately supporting analyses of procedure types, resource utilization, and performance of the robotic platforms.
2.1. Data Source
Two main data sources were used:
Operating room data from Ormaweb, a widely used OR management system in Italy. This database contains structured records of all surgical procedures performed in the hospital. Each row represents one surgical procedure. The dataset includes the following:
The surgery identification number and the patient identification number allow this dataset to be linked to other hospital information systems.
Information about the operating theatre and operating block where the operation is performed.
The surgical specialty and type of surgery, together with the DRG (Diagnosis Related Group) reimbursement code.
Information about the operating theatre staff.
Information about the date of surgery and all operating times (entry into the operating room, start of anesthetic preparation, patient ready, start of surgery, end of surgery, exit from the operating room).
Records of robotic procedures: These were exported from the da Vinci® robotic surgical systems and include the serial number of the system, the date and local time of each procedure, and the total duration of the procedure in minutes.
Thus, the two data sources do not have a common key that would allow them to be matched directly. At the time of the study, creating a unique identifier at the source was not possible because the robotic system and the hospital management system are separate, proprietary platforms that do not share communication protocols. The robotic logs do not include hospital-specific IDs, and the OR system does not directly receive data generated by the device. Implementing a common identifier would have required complex coordination and system changes involving many stakeholders. As a result, integration was performed through post hoc matching based on time and context.
The data extracted from the operating room management system and the surgical robot cover the period from 1 January 2023 to 30 June 2024.
During the data extraction period, some missing or incomplete timestamps were observed in fields such as the start of anesthetic preparation and patient ready. However, these variables were not used in the algorithm development to avoid introducing bias or errors. Instead, the algorithm relied exclusively on consistently available and reliable time fields to ensure robustness in the matching process.
2.2. Data Pre-Processing
Both datasets were imported and processed using the Python 3.10 programming language. For the robotic dataset, a new column was created combining the date and procedure start time to generate a single reference timestamp for each case (Local Procedure Start DateTime). In the Ormaweb dataset, the first step was to convert the data type of the date and time fields to “datetime”, since they were strings in the original dataset.
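The timestamp handling described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the field names (`date`, `start_time`, `start_dt`), the string formats, and the helper names are assumptions made for the example and are not taken from the actual Da Vinci® or Ormaweb exports.

```python
from datetime import datetime

# Hypothetical robotic log row; real export field names and formats may differ.
robot_rows = [
    {"serial": "SK5054", "date": "2023-01-10",
     "start_time": "08:45:00", "duration_min": 95},
]

def add_start_datetime(row):
    """Combine the date and start-time strings into one reference timestamp."""
    combined = f'{row["date"]} {row["start_time"]}'
    return {**row, "start_dt": datetime.strptime(combined, "%Y-%m-%d %H:%M:%S")}

def parse_or_none(text, fmt="%d/%m/%Y %H:%M"):
    """Convert a string field to datetime; blank or malformed values become None."""
    try:
        return datetime.strptime(text.strip(), fmt)
    except (ValueError, AttributeError):
        # ValueError: empty or badly formatted string; AttributeError: None input.
        return None

robot_rows = [add_start_datetime(r) for r in robot_rows]
```

Returning `None` for fields that cannot be parsed keeps blank entries explicit, so that later steps can decide how to handle them rather than failing on conversion.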
Second, the pre-processing accounted for cases in which some fields had been left blank because the operating room staff had not filled them in during the procedure. Auxiliary columns were therefore constructed to define the most reliable available time window during which a procedure could have taken place. These were filled in by prioritizing the available timestamps in a hierarchical order (e.g., entry to the block, entry to the room, start time of the procedure).
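One way to build such auxiliary columns is to take, for each record, the first non-missing timestamp in a fixed priority order. The sketch below assumes hypothetical field names; the start-side order follows the example given in the text, while the end-side order is an assumed mirror of it and is not stated in the source.

```python
from datetime import datetime

# Hypothetical field names. Start-side order follows the example in the text;
# the end-side order is an assumption made for this illustration.
START_PRIORITY = ["block_entry", "room_entry", "surgery_start"]
END_PRIORITY = ["block_exit", "room_exit", "surgery_end"]

def first_available(record, fields):
    """Return the first non-missing value among the given fields, else None."""
    for name in fields:
        value = record.get(name)
        if value is not None:
            return value
    return None

def add_time_window(record):
    """Attach the most reliable available time window to an OR record."""
    return {
        **record,
        "window_start": first_available(record, START_PRIORITY),
        "window_end": first_available(record, END_PRIORITY),
    }

# Example: block entry/exit were left blank, so the room times are used instead.
example = add_time_window({
    "block_entry": None,
    "room_entry": datetime(2023, 1, 10, 8, 5),
    "surgery_start": datetime(2023, 1, 10, 8, 50),
    "room_exit": datetime(2023, 1, 10, 11, 30),
})
```

If every field in a priority list is missing, the corresponding window bound stays `None`, which signals that the record cannot be used in a time-window comparison.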
2.3. Matching Algorithm
The core of the methodology consists of a deterministic algorithm (Figure 1) developed in Python. For each robotic procedure represented by a row in the robotic dataset, the algorithm attempted to identify a matching intervention within Ormaweb by applying a stepwise filtering process. The steps are described below:
Row selection by date: for a given robotic row, only surgical procedures that occurred on the same date as the robotic registry were considered.
Row filter based on the time window: the start time of the robotic procedure on the console had to fall within the estimated entry and exit times recorded in the operating suite.
Row selection based on the duration: the robotic procedure duration had to be shorter than the time window of the candidate hospital procedure.
Row filter based on the operating room number: if more than one potential match was found after these steps, the algorithm applied an additional filter based on known associations between the robotic systems’ serial numbers and the surgical rooms:
System SK5054 was associated with OR 08.
System SK7255 was associated with OR 10.
System SK5389 was associated with OR 12.
For example, if the robotic dataset row under analysis has the serial number SK5054, only the remaining records from the operating room management system with operating room number 08 are considered valid candidates.
In the rare cases where an ambiguity remained after all filters had been applied, fallback logic was used to refine the selection. This involved comparing robotic data with narrower intra-operative time windows, rather than with wider entry/exit times. This means that if multiple candidate matches still remain after the initial filters, the algorithm applies increasingly strict temporal constraints. First, it retains only those records from the operating room management system in which the start time of the robotic procedure falls within the time interval between the patient’s entry into and exit from the operating room. If ambiguities persist, a further restriction is applied by narrowing the window to the time between the recorded start and end of the surgical procedure. This stepwise refinement ensures that the matching criteria become progressively more selective, reducing the likelihood of incorrect associations.
Only if a single unique match was identified and the associated operating theatre was consistent with expectations was the match accepted.
If the algorithm could not find a match using the filters listed above, it inserted a note indicating that no match was found. The operator must then match the remaining procedures manually. This can happen when data entered manually in the OR management system are incorrect and therefore cannot be reconciled with the data recorded by the robot.
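The stepwise filtering and fallback logic described above can be sketched as follows. This is an illustrative reconstruction and not the code used in the study: the record layout, field names, and function names are assumptions, and only the serial-number-to-room associations are taken from the text.

```python
from datetime import date, datetime, timedelta

# Serial-number-to-room associations as stated in the text.
ROBOT_ROOM = {"SK5054": "08", "SK7255": "10", "SK5389": "12"}

def _within(t, lo, hi):
    """True if t lies in [lo, hi]; missing bounds never match."""
    return lo is not None and hi is not None and lo <= t <= hi

def match_procedure(robot, or_records):
    """Return the single matching OR record, or None when no unique match exists."""
    start = robot["start_dt"]
    duration = timedelta(minutes=robot["duration_min"])
    expected_room = ROBOT_ROOM.get(robot["serial"])

    # Step 1: same calendar date.
    cands = [r for r in or_records if r["date"] == start.date()]
    # Step 2: console start inside the estimated time window.
    cands = [r for r in cands if _within(start, r["window_start"], r["window_end"])]
    # Step 3: console duration shorter than the candidate's window.
    cands = [r for r in cands if duration < r["window_end"] - r["window_start"]]
    # Step 4: room filter, applied only if several candidates remain.
    if len(cands) > 1:
        cands = [r for r in cands if r["room"] == expected_room]
    # Fallback: progressively narrower windows while ambiguity persists.
    for lo_key, hi_key in [("room_entry", "room_exit"),
                           ("surgery_start", "surgery_end")]:
        if len(cands) <= 1:
            break
        cands = [r for r in cands if _within(start, r.get(lo_key), r.get(hi_key))]
    # Accept only a unique candidate whose room is consistent with expectations.
    if len(cands) == 1 and cands[0]["room"] == expected_room:
        return cands[0]
    return None  # flagged for manual matching by the operator

def _rec(rec_id, room, w0, w1, e0, e1):
    """Build a toy OR record for 10 January 2023 from 'HH:MM' strings."""
    t = lambda s: datetime.strptime("2023-01-10 " + s, "%Y-%m-%d %H:%M")
    return {"id": rec_id, "date": date(2023, 1, 10), "room": room,
            "window_start": t(w0), "window_end": t(w1),
            "room_entry": t(e0), "room_exit": t(e1)}

or_records = [
    _rec(1, "08", "07:30", "12:00", "08:05", "11:30"),
    _rec(2, "08", "08:30", "16:00", "12:10", "15:30"),
    _rec(3, "10", "07:45", "13:00", "08:10", "12:30"),
]
robot = {"serial": "SK5054", "start_dt": datetime(2023, 1, 10, 8, 45),
         "duration_min": 95}
result = match_procedure(robot, or_records)
```

In this toy example all three records pass the date, window, and duration filters; the room filter leaves records 1 and 2, and the first fallback window resolves the ambiguity in favour of record 1. Returning `None` rather than a best guess reflects the conservative choice of leaving ambiguous or unmatched cases to the operator.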
The developed algorithm was applied to a dataset consisting of 14,500 surgical procedures recorded in the operating room management system (Ormaweb) from 1 January 2023 to 30 June 2024 and 1372 procedure logs extracted from the Da Vinci® robotic systems.
At present, these are the only available data on which the algorithm can be tested; however, in the future, it will be possible to evaluate the algorithm with additional data depending on the number of robotic procedures performed by the hospital.
2.4. Outcome Metrics
To evaluate the performance of the algorithm, two main metrics were considered:
The total number of robotic procedures not matched to an entry in the Ormaweb database.
The accuracy of matching between the two datasets, determined by checking the match between surgeries matched by the algorithm and those manually matched by two people.
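As an illustration of how these two metrics could be computed, the sketch below represents each set of matches as a dictionary from a robotic log identifier to the matched OR record identifier, with `None` marking an unmatched log. This representation, the function names, and the toy values are assumptions made for the example.

```python
def unmatched_count(algorithm_matches):
    """Number of robotic procedures left without an OR record."""
    return sum(1 for or_id in algorithm_matches.values() if or_id is None)

def match_rate(algorithm_matches):
    """Share of robotic procedures that received a match."""
    total = len(algorithm_matches)
    if total == 0:
        return 0.0
    return (total - unmatched_count(algorithm_matches)) / total

def accuracy(algorithm_matches, manual_matches):
    """Share of algorithm matches that agree with the manual reference."""
    matched = {k: v for k, v in algorithm_matches.items() if v is not None}
    if not matched:
        return 0.0
    agree = sum(1 for k, v in matched.items() if manual_matches.get(k) == v)
    return agree / len(matched)

# Toy data: four robotic logs, one unmatched, one matched to the wrong record.
algo = {"r1": 101, "r2": 102, "r3": None, "r4": 104}
manual = {"r1": 101, "r2": 102, "r3": 103, "r4": 999}
```

Accuracy is computed here only over the procedures that the algorithm actually matched, which corresponds to comparing algorithm matches against the manual reference; unmatched logs are counted separately by the first metric.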
3. Results
The matching algorithm successfully matched 1362 robotic procedures with corresponding entries in the Ormaweb database, achieving a 99.27% match rate.
To validate the accuracy of the algorithm, its matches were compared with a manual matching carried out by two independent operators prior to the development of the algorithm. The reviewers were blinded to the algorithm’s outcomes, ensuring that the manual matching was not influenced by its results. All 1362 matched procedures were confirmed as correct, achieving a 100% accuracy rate compared to manual validation. This result demonstrates the robustness and reliability of the matching strategy, which is mainly based on the synchronization of timestamps and the use of common metadata, such as operating theatre identifiers and duration of surgery.
The fallback logic—applying stricter temporal constraints to resolve ambiguous matches—was used in only 2.3% of the cases, indicating that in the vast majority of instances the algorithm was able to confidently identify matches using the primary filtering steps.
Only 10 robotic logs (0.73%) remained unmatched. In most of these cases, the inability to establish a match was attributable to missing or inconsistent data, such as incomplete timestamps or procedural details in the operating room management system. These unmatched cases are flagged for manual review and resolution by clinical engineering staff, ensuring that no robotic procedure is excluded from subsequent analysis. This approach balances automation with a final manual validation step to ensure complete data integration.
The entire matching process was performed in less than 3 s on a standard personal computer (Intel Core i7, 16 GB RAM), highlighting the scalability and practical applicability of the method. This level of performance suggests that the algorithm can be integrated into real-time or near real-time data pipelines for continuous monitoring of robotic surgery. Furthermore, the low computational cost makes it suitable for regular use by hospital staff, without the need for specialized hardware or technical expertise.
4. Discussion
The integration of clinical and technical data represents one of the most significant challenges in the digital transformation of healthcare. The project described in this study addressed a concrete and recurring problem, namely the difficulty of reliably identifying robot-assisted surgical procedures in OR management systems. This problem is common to many clinical settings where data from medical devices are not directly interoperable with hospital administrative and management systems [16,17].
The results obtained demonstrate the effectiveness of a deterministic approach, based on the synchronization of timestamps and metadata, for automatically linking logs generated by Da Vinci® robotic systems with records in the Ormaweb platform. With a match rate of 99.27% and 100% accuracy compared to manual validation, the method proved to be highly robust. These results are in line with the literature [18,19], which highlights that time synchronization and the use of rules based on contextual metadata are effective strategies for record linkage in healthcare.
Another relevant aspect that emerged from the study concerns data quality and reliability. Data generated automatically by surgical robots provide objective metrics, such as the precise duration of console use, which are not subject to human error, unlike manual records in hospital management systems. This allows for a more accurate measurement of surgical activities and provides a stronger basis for operational analysis, performance evaluations and strategic decisions.
The adoption of an automated algorithm for matching procedures also allows for a significant reduction in work time for clinical and technical staff. In the past, such verification required time-consuming and error-prone manual checks, whereas now it can be completed in seconds with simple computational processing. This approach aligns with the principles of “digital health” and “learning health systems”, where automation and intelligent use of data enable improved decision-making in healthcare [20,21].
Moreover, the matching process took under 3 s on a regular personal computer, showing high efficiency. When it comes to scalability, the straightforward and predictable nature of the algorithm indicates that it can manage much larger datasets or be expanded to several hospitals without drastically increasing computational time. However, as the amount of data increases, we may need optimization methods or more powerful hardware to keep near real-time performance, especially if it is used in continuous monitoring systems across different institutions. Future work could look into these points to ensure reliability and efficiency at a larger scale.
However, the study also highlights some limitations. The failure of matching in about 0.73% of the cases was attributed to incomplete or inconsistent data, particularly in the manually entered records in the Ormaweb system. This highlights the need to improve data quality at source and underlines the importance of supplementing the automatic process with a manual review phase in ambiguous cases. To reduce such inconsistencies, especially with missing or incorrectly entered timestamps, future improvements could include specific staff training and updated data entry protocols. Although these actions are not currently planned, they would be essential to improve overall data reliability and the effectiveness of automated integration methods.
A more conservative approach was adopted, avoiding automatic matching of interventions without an exact correspondence, in order to prevent incorrect associations that could result from relaxing the matching criteria. In such cases, the task of manually linking the two datasets is left to the operator. To better assess this trade-off, we tested a version of the algorithm with relaxed temporal constraints, particularly during the fallback logic phase. This led to 24 false positives, corresponding to 1.75% of total robotic procedures. Based on this result, the conservative strategy was confirmed as preferable, ensuring high precision and minimizing the risk of incorrect matches. Although this rate may appear modest, the implications of misclassifying surgical procedures in clinical data integration are significant, especially when such data feed into analyses of resource utilization, performance, or outcome evaluation, or are used for reimbursement claims within the national health system.
A further limitation of this study is that integration does not take place directly between the two systems: the Excel files containing all the data must first be downloaded and then integrated using the code developed in this study. In the future, a direct integration should be pursued, in which the robot sends data directly to the operating theatre management system using a common ID. However, this direct data transmission comes with several challenges. It requires strong interoperability standards and secure communication protocols to protect data integrity and patient privacy. Among the available standards, FHIR (Fast Healthcare Interoperability Resources) is particularly promising due to its flexibility and widespread adoption, which facilitate integration in modern healthcare IT environments [22,23].
Additionally, since the Da Vinci® system is a proprietary hardware and software platform created by Intuitive Surgical, accessing real-time device data may require specific agreements or partnerships with the manufacturer. These agreements might involve clinical or research collaborations to meet regulatory requirements and handle intellectual property issues. While negotiating these collaborations can be complex and take time, they are necessary to achieve seamless, near real-time integration. This integration would greatly improve the value of robotic surgical data in clinical workflows.
In addition, the code should be extended to integrate data from other types of surgical robots, not only Da Vinci® systems.
This project was developed and tested on Ormaweb, the most widely used OR management system in Italy, ensuring broad applicability. The algorithm is based on generalizable matching rules (timestamps, durations, and procedure metadata), making it adaptable to other OR systems with similar scheduling and device data. Minimal adjustments would be needed to map equivalent fields or formats in alternative platforms.
Finally, the possibility of integrating this algorithm into near real-time monitoring pipelines opens interesting perspectives for the continuous analysis of robotic activity, cost evaluation and optimization of resource utilization. Considering the high cost associated with robotic surgery, access to accurate and timely data is crucial to ensure sustainable and efficient management of healthcare services.