1. Introduction
Spinal disorders that require surgery are becoming more widely acknowledged as a global health concern, mainly due to population aging and the increase in degenerative illnesses [
1]. Over the past 15 years, spinal procedures have increased by 2.4 times [
2]. Minimally invasive (MIS) techniques have been included in spinal surgery in many ways, including percutaneous pedicle screws fixation. Since screws are the main anchors in posterior spinal fusion procedures, it is essential to insert them precisely. Despite the relatively few clinical adverse events linked to malposition, misplaced screws can lead to dural tears [
3], neurological/neurovascular injuries [
4], sub-optimal biomechanics [
5], and other visceral involvement [
6]. Additionally, it is believed that up to 15.7% of cases with pedicle malposition go unreported [
3]. In the meantime, the prevalence of pedicle screw problems varies greatly, from 1% to 54%, emphasizing the significance of anatomical features and patient-specific factors, including osteoporosis and deformities, in achieving successful screw insertion [
7]. MIS modalities mainly include computed tomography navigation (CTnav), fluoroscopic guidance, and robot-guided surgery. Multiple systematic reviews have reported substantially growing evidence in all three fields [
8,
9]. While fluoroscopy alone has been reported to have a 91.3% accuracy [
8], it has some drawbacks, like the obvious radiation exposure risk and ease-of-use limitations [
10], as well as the level of surgeons’ experience, which introduces a degree of unpredictability. On the other hand, both CTnav and robotic-guided surgery have been reported to have higher predictability irrespective of the surgeons’ level of experience (7.0% [
11] and 5.1% [
12] vs. 15.7% [
3], respectively) [
3], and robotic-guided surgery is still reported to have higher accuracy compared to fluoroscopic guidance [
11]. Furthermore, several recent meta-analyses and reviews have confirmed that robotic-assisted techniques achieve a far greater percentage of “perfect” (Grade A) and “clinically acceptable” (Grade A + B) pedicle screw insertions than freehand or fluoroscopic-guided methods. This advantage in radiographic accuracy is now well-established [
13]. Less postoperative pain, faster recovery times, and less muscle damage have made robotic guidance more widely used [
14]; however, this selection’s main drivers are accuracy, safety, and cost. Despite these advantages, significant barriers and confounding variables remain. The substantial capital investment and per-procedure costs are juxtaposed against an emerging body of literature arguing for its long-term cost-effectiveness by reducing revision rates and hospital stays [
15,
16]. Furthermore, the steep and highly variable learning curve, which one recent meta-analysis suggests may require 60 cases for many surgeons to overcome, acts as a significant clinical confounder [
17]. Similarly, patient-specific factors, particularly poor bone quality, present a crucial challenge; a 2024 systematic review identified osteoporosis as a significant risk factor for mechanical complications like screw loosening and pseudarthrosis, which can mask the benefits of precise, robot-guided instrumentation [
18]. These factors, along with case-specific challenges like high BMI or instrumentation in the technically demanding thoracic spine, complicate the interpretation of clinical outcome data [
19].
These difficulties have created a crucial research gap. Numerous meta-analyses have demonstrated radiographic accuracy of robotic guiding, but there is still much disagreement about whether this translates into clearly better clinical outcomes. For example, recent systematic reviews and large-scale studies have found that patient-reported outcomes, complication rates, and revision rates do not significantly improve because of this precision [
20]. Other recent meta-analyses, on the other hand, directly oppose this, stating that robotic aid does lead to decreased rates of surgical revision and postoperative complications [
21,
22]. This “accuracy-clinical outcome paradox” implies that the advantages of robotics might be strongly influenced by technology, surgical procedure, patient group, and handling of variables such as bone health and learning curve.
To explain this disparity, the field urgently needs high-quality and region-specific data. This study used robotic guidance to evaluate pedicle screw insertion accuracy, fusion rates, and complications in the Spanish population.
2. Materials and Methods
2.1. Study Design and Ethics
This is a consecutive, multicenter, observational, single-arm retrospective analysis study. During July 2020 to January 2025, adult patients who had robotically assisted pedicle screw implantation were included in this study. Due to its retrospective nature, the study was conducted in accordance with institutional ethical standards and applicable regulations.
All the data was anonymized and managed in compliance with the institutional data protection policies.
2.2. Patient Selection
Inclusion criteria were as follows:
Age ≥ 18 years.
Undergoing spinal fusion surgery involving thoracic, lumbar, or sacral pedicle screw placement.
Use of robotic guidance with the ExcelsiusGPS platform (Globus Medical Inc., Audubon, PA, USA) during screw placement.
Availability of postoperative CT imaging for accurate assessment.
Exclusion Criteria were as follows:
Patients younger than 18 years.
Procedures not involving pedicle screw instrumentation.
Cases in which robotic guidance was not utilized or was converted to freehand or navigation-assisted techniques intraoperatively.
Lack of postoperative CT imaging resulted in exclusion from the screw accuracy analysis only, but such patients were retained for all other clinical and radiological outcome analyses.
Incomplete or missing clinical or radiological data relevant to the analysis.
Revision surgeries.
Data collected included age, sex, clinical diagnosis, and presence of deformity, operative data including the treated anatomical levels, invasiveness, type of implants, preoperative/postoperative CT scans, fusion rates, and complications. Complications included infection, intervertebral cage subsidence, and osteolysis.
2.3. Robotic Workflows and Surgical Technique
The Excelsius GPS (Globus Medical Inc., Audubon, PA, USA) platform was used for all surgeries. Depending on the surgical needs, two workflows were used.
Preoperative workflow: A preoperative CT (
Figure 1A) of the affected area is loaded by robotic computers. At this point, the surgeon plans the position of the implants and cages as well as trajectories and alignment. Once the patient is positioned on the operating table (prone or lateral) and after draping the surgical field, DRB (Dynamic Reference Base) and SM (Surveillance Marker) were positioned in both iliac crests (
Figure 1B,C). A 2D X-ray is taken using a conventional C-arm. These images are merged with the preoperative ones (
Figure 1D). As a result of this procedure, a virtual 3D image is available to start surgery.
Intraoperative workflow: In those cases, in which it is necessary to do any bone work that eventually changes anatomy, or when we are operating on specific anatomical areas such as the upper thoracic or cervical spine, it is mandatory to acquire intraoperative 3D images. In this study and all cases, this was performed using the Ziehmn Vision RFD 3D system (Ziehm Imaging, Nuremberg, Germany). After positioning, the DRB and SM were placed on the left iliac crest. An ICT is placed over the area to be intervened, and a 3D scan is performed. The surgeon plans the position and trajectories of implants, and the procedure is ready to start.
A final real image is taken before the patient leaves the operating room to confirm the correct position of the implants.
2.4. Screw Accuracy and Fusion Assessment
Accuracy of the pedicle screws was assessed using the classification of Gertzbein and Robins (CITA). A four-grade system (A to E) was created to determine the pedicular screws’ position in a postoperative CT scan of the instrumented area (
Figure 2). Two fellowship-trained spinal surgeons who did not participate in the surgical procedures and were blind to all patient data and intraoperative results independently assessed postoperative CT scans. Axial and sagittal CT scans were used to assess screw accuracy using the Gertzbein–Robbins classification method. Consensus discussion was used to settle disagreements among reviewers to ensure consistent interpretation across all cases. Fusion was evaluated using postoperative CT scans obtained at approximately 6 months, defined using a bone trabecula passing through the affected area, consistent with our institutional follow-up protocol. We conducted a patient-level analysis, meaning that fusion status was assessed per patient rather than per individual spinal level.
2.5. Statistical Analysis
All statistical analysis in this study was performed using Jamovi software (Version 2.6). Categorical variables were presented as numbers and percentages, while continuous outcomes were reported as mean ± standard deviation if normally distributed and median (IQR) if the data were skewed. The normality of data was tested using the Shapiro–Wilk test. No inferential or regression analyses were performed, as the dataset lacked a control group and sufficient variability to support meaningful comparative or predictive modeling.
4. Discussion
The primary goal of emerging robotic spinal surgeries is to enhance the accuracy of screw placement compared to conventional freehand techniques, as screw misplacement can lead to significant complications, including revision surgeries and neurological or vascular deficits [
23]. Therefore, robotic surgeries aim to decrease postoperative complications and intraoperative errors, helping reduce revision surgeries and follow-up visits [
24]. However, to create a measurement tool for screw placement accuracy, Gertzbein and Robbins developed a scale that starts with Grade A for optimal positioning with no cortical breach [
25]. Grade B and C correspond to cortical breaches smaller and larger than 2 mm, respectively. Grade D indicates cortical breaches between 4 and 6 mm, while Grade E is for breaches larger than 6 mm.
On this scale, our study of 726 screws and 115 patients found that 584 (99%) screws were graded A, 5 screws (0.8%) were graded B, and 1 (0.2%) was graded C.
Our remarkable accuracy and fusion rates are validated by recent systematic reviews. For instance, a 2024 meta-analysis of 21 TLIF trials [
26] revealed that robot-assisted surgery produced 74% fewer proximal facet joint violations and about 12% greater “perfect” (Grade A) screw accuracy than fluoroscopic freehand (RR≈1.12). Overall, surgical durations were similar to traditional methods, even though robotic cases in that analysis had lengthier setup and operating times in RCTs. As most series claim ≥95% Grade A placement, our 99% Grade A rate is in line with the top outcomes in the current literature. Our fusion rate (89.5%) is also consistent with recent publications; Chang et al., for example, reported approximately 87% interbody fusion at 2-year follow-up in a robot-assisted TLIF group [
27]. These comparisons verify that our accuracy and fusion results are in line with existing standards.
Our findings also align with Khan et al., who studied 75 screws and 20 patients who underwent robotic screw placement for degenerative spinal pathologies, found that 74 (98.7%) of their screws were graded A, while only one screw was grade B [
28]. Additionally, in a prospective analysis, Lonjon et al. found similar accuracy in a cohort of patients with degenerative spinal diseases who underwent robotic spinal surgeries, with 97.3% of screws placed at a grade A and B [
29]. Similar results were also observed in a meta-analytical study of 6041 pedicle screw placements found that the robotic spinal surgery group doubled the rates of grade A optimal positioning compared to the freehand group, with an odds ratio of 2.43 (
p < 0.001) [
30]. This accuracy, attributed to detailed multi-dimensional imaging and continuous optical tracking of screw trajectories of robotic surgeries, reduces not only bony breaches but also damage to adjacent neural structures, such as leakage of cerebrospinal fluids or injury of nerve roots, whereas freehand surgeries, which depend on the surgeon’s tactile feedback and anatomical landmarks, introduce considerable precision variability, especially in complex and challenging areas such as thoracic spines [
29,
31,
32].
In contrast to our results, Ringel et al., in a randomized controlled trial of 60 patients and 298 screw placements, of which 146 were performed using a robotic system, found that 85% of screws were graded A and B [
33]. However, this lower accuracy in their study resulted from improper fixation of the robotic arm and the movement of the cannula during the screw placement [
34]. Similarly, Wang et al. studied 61 patients with degenerative spinal pathologies who underwent minimally invasive, robot-assisted TLIF surgery and found that only 85.4% (234 screws) were inserted without any cortical breach [
34]. They attributed this reduced accuracy to the movement of guide pins or violations of the lateral vertebral wall during the pedicle screws placement. However, these differences in accuracy are often attributed to variations in robotic systems that have distinct accuracy rates, preoperative imaging, Kirschner-wires requirement, or techniques for mounting the robotic arm (to bone, table, or floor) [
35]. In general, reduced screw placement accuracy in robotic assisted spinal surgeries could be explained by various mechanical factors, such as shifting, which is a change in the position of the robotic arm relative to the patient’s position, or skiving, in which vertical force on the surgical instrumentation, such as cannula or drills, deviate from the pre-established bonny trajectories [
36].
Our study found that 432 screws out of 436 screws in TLIF surgeries were placed with no cortical breaches, while only four screws had cortical breaches of less than 2 mm. These misplaced screws were scattered among different patients; no patient had more than one breach, and none of the impacted instances showed clinical signs or needed additional care. These accuracy rates were higher than those of Zhang et al., who studied 100 screws and 50 patients who underwent robotic TLIF surgery. Among these 100 screws, 85 were graded A, while 13 and 2 screws were graded B and C, respectively [
37]. On the other hand, all our PLIF screws achieved an optimal placement of Grade A. These results were also higher than those of Kim et al., who found that only 91% of robotic-assisted TLIF screws received Grade A, while 8% and 1% were graded B and C, respectively [
38].
Effective radiographic monitoring of fusion success rates is crucial for determining patients who may benefit from subsequent or revision surgery. A successful fusion is achieved when bone growth extends across at least 50% of the intended fusion area while maintaining the bone density achieved immediately after surgery [
39]. From a biomechanical perspective, radiographic confirmation of solid fusion across one side of the total fusion area is sufficient to indicate stable fusion, regardless of any radiolucent features on the contralateral side [
34]. In this context, our study demonstrated a successful fusion rate of 89.5% (85 out of 95 patients), with half of the fusion failures (5 out of 10) occurring in patients with degenerative spinal pathologies. Similarly, Chang et al. reported an 87.3% interbody fusion rate among 26 patients with degenerative spondylolisthesis who underwent robotic-assisted TLIF surgery [
27].
Additionally, three patients in our cohort had postoperative infections, and eight showed signs of osteolysis. Osteolysis is a biological reaction that occurs at the bone–implant interface, where implant particles stimulate macrophages, leading to osteoclast activation and bone resorption, which appears as radiolucency around the implant and results in progressive bone loss, periprosthetic fracture, and implant loosening [
40].
When it is non-symptomatic, it needs careful attention. If osteolysis progresses or symptoms appear, revision surgery is mandatory.
In our cohort, osteolysis was detected radiographically as progressively lucent zones around the pedicle screws or implants, a recognized sign of implant-associated bone resorption [
41]. All osteolysis cases were asymptomatic, so we managed them nonoperatively with close imaging follow-up. Only if radiographic lucency had progressed or symptoms of loosening appeared would we recommend revision surgery.
This approach follows standard practice, since isolated radiolucent “clear zones” (pseudarthrosis signs) are monitored unless clinical or imaging deterioration mandates intervention.
4.1. Learning Curve, Operative Time, and Economic Considerations
Robotic spine surgery is expensive and has a steep learning curve. Setup and operation times are more prolonged at the initial stage of the learning curve, but efficiency quickly increases with familiarity. For instance, Paramasivam et al. reported that following the first ~20 cases, robotic setup time decreased from 24 to 17 min [
42]. Similarly, following the initial learning period, operational time gaps between robot and traditional processes tend to diminish. Economically speaking, robotic platforms are expensive to build and maintain (typically >USD500K–USD1.2M) [
43]. There are conflicting published cost studies; some predict net savings from shorter OR times and lower incidence of complications, while others report higher index hospitalization costs because of longer surgeries and higher equipment costs [
43]. When assessing robotic technology, these pragmatic aspects (learning curve and cost) must be taken into account in addition to the clinical advantages.
4.2. Strengths and Limitations
To the best of our knowledge, this is the first study to evaluate the efficacy and safety of robotic spinal surgery in a Spanish population, improving applicability in local clinical practice. Moreover, using the established Gertzbein–Robbins scale allows for a direct comparison between our results and those from other centers. Additionally, the inclusion of diverse spinal pathologies (degenerative conditions, spondylolisthesis, deformities, and pathological/traumatic fractures) alongside both minimally invasive and open surgical approaches strengthens the generalizability of our results across different clinical scenarios. Despite these strengths, the lack of a randomized controlled design limits our ability to attribute our results to robotic surgeries alone. Also, the retrospective nature of our study introduces potential selection bias that could limit our results in other settings. Moreover, the diversity of implant types and surgical approaches, while reflecting real-world evidence, introduces confounding variables that further limit our ability to isolate the direct impact of robotic surgery on our population. Another drawback is that only 95 out of 115 patients (82.6%) had postoperative CT imaging available, which may have introduced some selection bias into the accuracy analysis. However, we think this had little effect on the validity of the accuracy results because CT imaging was carried out in accordance with standard institutional procedures rather than intraoperative concerns. The heterogeneity of surgical approaches (open vs. MIS; anterior vs. posterior) and patient pathologies further confounds our findings. These factors limit causal inferences and underscore the need for future prospective, controlled studies. Regression or correlation analyses were not performed to determine predictors of fusion rates or screw inaccuracy. Such connections should be investigated in future prospective multi-arm studies with larger and more varied cohorts.