Article

Machine Learning Models for Predicting Postoperative Complications and Hospitalization After Percutaneous Nephrolithotomy

by Laura Shalabayeva 1,*, Pilar Bahílo Mateu 2, Marc Romeu Ferras 1, Javier Díaz-Carnicero 1, Alberto Budía 2 and David Vivas-Consuelo 1,*
1 Research Unit for Health Economics and Management, Polytechnic University of Valencia, Camí de Vera, s/n, 46022 Valencia, Spain
2 Department of Lithotripsy, La Fe University and Polytechnic Hospital, 46026 Valencia, Spain
* Authors to whom correspondence should be addressed.
Algorithms 2025, 18(9), 558; https://doi.org/10.3390/a18090558
Submission received: 29 July 2025 / Revised: 26 August 2025 / Accepted: 28 August 2025 / Published: 4 September 2025

Abstract

Percutaneous nephrolithotomy (PCNL) is often associated with complications of hemorrhagic or infectious origin, which can result in prolonged hospitalization. This study aims to develop predictive models using machine learning (ML) techniques to anticipate these outcomes. Multiple ML algorithms (Logistic Regression, Decision Tree, Random Forest, and Extreme Gradient Boosting) were evaluated on separate validation and test datasets. The Random Forest model achieved the highest predictive performance for hospitalization need (validation/test AUC 0.726/0.736) and infectious complications (validation/test AUC 0.799/0.735). Threshold adjustment was applied to increase sensitivity, reducing false negatives. The interpretability of the models was ensured through SHAP analysis, which identified clinically meaningful variables. Risk factors in both the hospitalization and infectious complications models included nephrostomy drainage, a neutrophil percentage higher than 80, a Guy’s score of grade 4, a leukocyte level higher than 15 or lower than 4.5, and balloon dilation, while protective features included tubeless intervention, easy localization of the stone, and negative culture and microorganism results. However, no model achieved acceptable performance for predicting hemorrhagic complications, likely due to limited data. These results suggest that AI-based models can contribute to risk stratification after PCNL. Further experiments with larger, multi-center datasets are needed to confirm these findings and improve the generalizability of the models.

1. Introduction

Nephrolithiasis is the medical term used to describe the presence of kidney stones, which are crystalline mineral formations composed of one or more substances found in urine, such as calcium, oxalate, phosphate, or uric acid [1]. The prevalence of this disease among the Spanish population is 5.06%, corresponding to a total of 2,233,214 cases [2].
Possible treatments for this disease include pharmacological therapy and several types of lithotripsy: extracorporeal shock wave lithotripsy (ESWL), which uses shock waves generated by an external source; retrograde intrarenal surgery (RIRS), in which a catheter is introduced through the urinary tract; and percutaneous nephrolithotomy (PCNL), which is considered the most invasive of these treatments [3,4]. According to the European Association of Urology guidelines, PCNL is indicated for patients with renal stones larger than 2 cm and lower pole stones larger than 1.5 cm. It is also recommended by the American Urological Association (AUA) for patients with staghorn calculi [5]. However, even though PCNL is considered a minimally invasive intervention, it can cause a variety of complications of hemorrhagic or infectious origin. Several studies estimate the complication rate of this surgery to be between 4% and 50.8% [6].
Recently, there has been a significant rise in the digitalization of data within the healthcare field, which has also led to an increased use of artificial intelligence (AI), here referring to computational systems capable of learning from data, recognizing patterns, and making autonomous decisions [7]. One subfield of AI is machine learning (ML), which identifies patterns in categorized datasets and can deal with nonlinear relationships that traditional statistical methods may fail to capture [7,8]. This approach has already been applied, for example, to identify antimicrobial compounds with potential therapeutic properties [7]. Moreover, several ML-based studies in the PCNL field have been conducted to predict outcomes such as transfusion needs, stone-free status, the need for adjuvant treatment, and other postoperative complications [8,9]. Building on these advances, the present study aims to develop a Clinical Decision Support System specifically designed to identify patients at risk of post-surgery complications and hospitalization, using machine learning algorithms.

2. Materials and Methods

2.1. Database

The database contains records of 901 patients treated with PCNL at the Department of Lithotripsy of La Fe University and Polytechnic Hospital (Valencia, Spain) between March 2011 and December 2024. As shown in Figure 1, each patient's record comprises three groups of variables: preoperative variables, gathered during the outpatient consultation before the intervention; intraoperative variables, collected during the PCNL procedure; and postoperative variables, obtained from a blood test performed approximately 4 h after the surgery was completed.
This database is subject to confidentiality restrictions and cannot be shared publicly. Access to the data may be requested from the Department of Lithotripsy of La Fe University and Polytechnic Hospital, subject to institutional approval and compliance with applicable ethical guidelines.

2.2. Exclusion Criteria

By protocol, cancer patients are hospitalized after surgery regardless of whether they present any complication, so they were excluded from the study. Pediatric patients were also excluded. After applying these criteria, 883 patients remained in the study.

2.3. Predictors

The preoperative predictive variables comprise demographic data (sex, age, and Body Mass Index (BMI)); results of microbiology tests (culture and microorganism results); the medical history of the patient (diabetes and the anesthesia risk classification of the American Society of Anesthesiologists (ASA)); the treatment type; and a set of variables that describe the stone and its location (side, major diameter, minor diameter, number, location, Guy’s score, Hounsfield Units (HU), and distance).
During the PCNL intervention, the following variables are gathered: patient position, guidance of access, catheterization, contrast, dilation method, multi-trajectory, sheath caliber, fragmentation source, localization ease, duration, drains, and tubeless.
From the blood test results, levels of procalcitonin, leukocytes, and percentage of neutrophils (%neutrophils) are included in the study. The %neutrophils variable was calculated according to
\[ \%\,\text{neutrophils} = \frac{\text{neutrophils}}{\text{leukocytes}} \times 100 \]

2.4. Variables of Interest

The variables of interest, also called labels, are hospitalization need, presence of a complication of hemorrhagic origin, and presence of a complication of infectious origin. All of these are binary. The hospitalization need variable is defined as positive when a patient presents at least one complication from the following list: hematoma, hematuria, sepsis, infection, perforation, reintervention, bleeding, embolization, or transfusion. Hemorrhagic complication is equal to one when a patient presents one of the following complications: hematoma, hematuria, perforation, bleeding, embolization or transfusion, while the infectious complication category includes sepsis or infection.
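The label definitions above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions: the column names (one 0/1 flag per recorded complication) and the toy rows are hypothetical and are not taken from the study database.

```python
import pandas as pd

# Hypothetical column names: one 0/1 flag per recorded complication.
HEMORRHAGIC = ["hematoma", "hematuria", "perforation",
               "bleeding", "embolization", "transfusion"]
INFECTIOUS = ["sepsis", "infection"]
OTHER = ["reintervention"]  # counts toward hospitalization need only


def derive_labels(flags: pd.DataFrame) -> pd.DataFrame:
    """Build the three binary labels from per-complication 0/1 flags."""
    labels = pd.DataFrame(index=flags.index)
    labels["hemorrhagic"] = flags[HEMORRHAGIC].any(axis=1).astype(int)
    labels["infectious"] = flags[INFECTIOUS].any(axis=1).astype(int)
    all_complications = HEMORRHAGIC + INFECTIOUS + OTHER
    labels["hospitalization"] = flags[all_complications].any(axis=1).astype(int)
    return labels


# Toy data (invented): four patients, three of them with one complication each.
flags = pd.DataFrame(0, index=range(4), columns=HEMORRHAGIC + INFECTIOUS + OTHER)
flags.loc[1, "hematuria"] = 1
flags.loc[2, "sepsis"] = 1
flags.loc[3, "reintervention"] = 1

labels = derive_labels(flags)
print(labels)
```

Note that, under these definitions, a patient with only a reintervention is positive for hospitalization need but negative for both complication-specific labels.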

2.5. Exploratory Data Analysis

Within the framework of Exploratory Data Analysis (EDA), first, the Kolmogorov–Smirnov test is applied to the numerical features to deduce which of them are normally distributed. Then, to analyze their influence over the label, the t-test is used for variables of normal distribution, while the Mann–Whitney test is used for those not normally distributed. The influence of categorical variables over the label is evaluated by means of the Chi-squared test.
To perform the EDA, the statistical software IBM SPSS Statistics 30.0 was used.
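The test-selection logic described above was run in SPSS; the following is a rough SciPy equivalent, offered only as a sketch. The function names, the 0.05 cut-off for the normality check, and the synthetic data are assumptions for illustration, and SPSS may apply corrections (for example, to the Kolmogorov–Smirnov test) that this sketch does not reproduce.

```python
import numpy as np
from scipy import stats


def compare_numeric(x_pos, x_neg, alpha=0.05):
    """Choose the t-test or the Mann-Whitney test from a KS normality check."""
    x_all = np.concatenate([x_pos, x_neg])
    # KS test against a normal distribution with the sample mean and SD
    # (assumption: no Lilliefors correction is applied here).
    ks_p = stats.kstest(x_all, "norm",
                        args=(x_all.mean(), x_all.std(ddof=1))).pvalue
    if ks_p > alpha:
        return "t-test", stats.ttest_ind(x_pos, x_neg).pvalue
    p = stats.mannwhitneyu(x_pos, x_neg, alternative="two-sided").pvalue
    return "Mann-Whitney", p


def compare_categorical(contingency_table):
    """Chi-squared test of independence between a category and the label."""
    _, p, _, _ = stats.chi2_contingency(contingency_table)
    return p


# Synthetic, clearly non-normal feature split by a binary label.
rng = np.random.default_rng(0)
skewed_pos = rng.exponential(size=100)
skewed_neg = rng.exponential(size=300)
test_name, p_numeric = compare_numeric(skewed_pos, skewed_neg)

# Synthetic 2x2 table: rows = category value, columns = label 0/1.
p_categorical = compare_categorical([[30, 10], [10, 30]])
print(test_name, p_numeric, p_categorical)
```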

2.6. Missing Data Treatment

Patients with more than 30% of their data missing (NaN) and patients without labels were excluded from the study. The remaining missing values were imputed with the median for numerical variables and with the mode for categorical variables.
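A minimal pandas sketch of this cleaning step follows. The column names and toy values are hypothetical. Two details are assumptions because the text does not specify them: the 30% is computed over feature columns only, and the imputation statistics are computed over all retained patients rather than over the training set alone.

```python
import numpy as np
import pandas as pd


def clean_missing(df: pd.DataFrame, label_col: str,
                  max_missing: float = 0.30) -> pd.DataFrame:
    """Drop unlabeled or mostly-empty rows, then impute median / mode."""
    df = df.dropna(subset=[label_col])
    features = df.drop(columns=[label_col])
    # Keep patients whose fraction of missing features is at most 30%.
    keep = features.isna().mean(axis=1) <= max_missing
    df = df.loc[keep].copy()
    for col in df.columns.drop(label_col):
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        else:
            df[col] = df[col].fillna(df[col].mode().iloc[0])
    return df


# Toy data (invented): row 3 has 3 of 4 features missing, row 4 has no label.
raw = pd.DataFrame({
    "age": [50, np.nan, 70, np.nan, 60, 40],
    "bmi": [25.0, 30.0, np.nan, np.nan, 27.0, 31.0],
    "side": ["left", "right", "left", None, "left", None],
    "tubeless": ["yes", "no", "yes", "no", "yes", "no"],
    "label": [0, 1, 0, 1, np.nan, 0],
})
cleaned = clean_missing(raw, label_col="label")
print(cleaned)
```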

2.7. Treatment of Numerical Variables

For the numerical variables of the blood test, thresholding was applied. More specifically, procalcitonin was recodified with one if its value was higher than 0.5 and with zero when its value was lower than 0.5. %neutrophils was changed to one when the original value was higher than 80 and changed to zero for the rest of values. Regarding leukocytes, this variable was set to one if its original value was higher than 15 or lower than 4.5. In other cases, it was set to zero.
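The thresholding rules above, together with the %neutrophils formula from Section 2.3, can be sketched as follows. Column names and toy values are hypothetical. The text does not say how a procalcitonin value of exactly 0.5 is coded; this sketch assumes it is coded as zero.

```python
import pandas as pd


def binarize_blood_tests(df: pd.DataFrame) -> pd.DataFrame:
    """Add %neutrophils and the three binary blood-test indicators."""
    out = df.copy()
    out["neutrophils_pct"] = 100 * out["neutrophils"] / out["leukocytes"]
    # Assumption: a value of exactly 0.5 is coded as 0.
    out["procalcitonin_high"] = (out["procalcitonin"] > 0.5).astype(int)
    out["neutrophils_pct_high"] = (out["neutrophils_pct"] > 80).astype(int)
    out["leukocytes_abnormal"] = (
        (out["leukocytes"] > 15) | (out["leukocytes"] < 4.5)
    ).astype(int)
    return out


# Toy values (invented), in the same units as the thresholds in the text.
blood = pd.DataFrame({
    "leukocytes": [8.0, 16.0, 4.0],
    "neutrophils": [4.0, 14.4, 2.0],
    "procalcitonin": [0.1, 0.9, 0.5],
})
coded = binarize_blood_tests(blood)
print(coded)
```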

2.8. Treatment of Categorical Variables

With respect to categorical variables, one-hot encoding was applied to nominal variables.
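As a brief illustration, one-hot encoding of a nominal variable could look like the sketch below. The column name and category labels are hypothetical stand-ins loosely modeled on the dilation method variable, and `pandas.get_dummies` is one possible tool; the text does not state which function was used.

```python
import pandas as pd

# Toy data (invented): one nominal variable and one already-binary variable.
toy = pd.DataFrame({
    "dilation_method": ["Amplatz", "Balloon", "Both", "Metallic", "Balloon"],
    "tubeless": [1, 0, 0, 1, 1],
})

# Each category becomes its own 0/1 column; binary columns are left untouched.
encoded = pd.get_dummies(toy, columns=["dilation_method"], dtype=int)
dummy_cols = [c for c in encoded.columns if c.startswith("dilation_method_")]
print(encoded)
```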

2.9. Data Splitting

The database was divided into three parts: training, validation, and test set, in the proportion of 70%/15%/15%, respectively. The database presents some imbalances. For instance, patients that require hospitalization after the surgery represent one quarter, while patients that face hemorrhagic complications are 13.5%, and patients with infectious complications are 14.2%. Thus, stratification by label was especially important.
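A 70%/15%/15% stratified split can be obtained with two successive calls to scikit-learn's `train_test_split`. The sketch below uses synthetic data with a 25% positive rate; the function name, the random seed, and the data are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_70_15_15(X, y, seed=42):
    """Stratified train/validation/test split in 70/15/15 proportions."""
    # First split: 70% training, 30% held out.
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    # Second split: halve the held-out part into validation and test.
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


# Synthetic data: 1000 patients, 25% positive label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.array([1] * 250 + [0] * 750)

(X_train, y_train), (X_val, y_val), (X_test, y_test) = split_70_15_15(X, y)
print(len(y_train), len(y_val), len(y_test))
print(y_train.mean(), y_val.mean(), y_test.mean())
```

Stratifying on the label keeps the positive rate nearly identical in the three partitions, which matters when the minority class is as small as 13.5% to 25%.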

2.10. Model Development and Evaluation

To find the most suitable predictive model, Logistic Regression and three different models based on trees were studied: Decision Tree, Random Forest, and Extreme Gradient Boosting, as these models are the ones with greatest potential for explainability, which is very important in the clinical context. For all models, the penalization for incorrect prediction of minority class via the “class_weight” parameter was applied to deal with the imbalance of the label. Furthermore, in the case of Logistic Regression, it was very important to perform prior normalization of features. Following this, the models were evaluated by means of Accuracy, Sensitivity, Specificity, and Area Under the Curve (AUC) metrics, with most importance given to the latter metric.
Python programming language (version 3.11.9) and Visual Studio Code software (version 1.103.1) were used to build and evaluate the models.
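The comparison described in this section can be sketched with scikit-learn as shown below. This is not the authors' code: the data are synthetic, all hyperparameters are hypothetical, and Extreme Gradient Boosting is omitted to keep the example free of extra dependencies. Class imbalance is handled with `class_weight="balanced"`, and features are standardized only for Logistic Regression.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data (about 25% positives) standing in for the cohort.
X, y = make_classification(n_samples=900, n_features=20,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)

# Hypothetical hyperparameters; the text does not report the ones used.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(
        class_weight="balanced", max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(
        class_weight="balanced", n_estimators=200, max_depth=5, random_state=0),
}


def evaluate(model, X, y, threshold=0.5):
    """AUC, accuracy, sensitivity, and specificity at a given threshold."""
    proba = model.predict_proba(X)[:, 1]
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, pred, labels=[0, 1]).ravel()
    return {
        "AUC": roc_auc_score(y, proba),
        "Accuracy": accuracy_score(y, pred),
        "Sensitivity": tp / (tp + fn),
        "Specificity": tn / (tn + fp),
    }


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = evaluate(model, X_val, y_val)
    print(name, results[name])
```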

2.11. Threshold Optimization

Once the most suitable model was chosen according to AUC metric, the optimal threshold needed to be identified. In this case, it was decided to prioritize correct detection of patients with a low possibility of hospitalization need or infectious complications, so they could be excluded from those needing healthcare resources. Thus, we needed to lower the threshold to achieve higher sensitivity. To evaluate the impact of the update, the Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) metrics were calculated.
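The effect of moving the threshold can be illustrated with a small NumPy sketch. The selection rule shown here (take the highest threshold that still reaches a target sensitivity) is an assumption; the text does not state how the final thresholds were chosen. The toy labels and probabilities are invented.

```python
import numpy as np


def threshold_metrics(y_true, proba, threshold):
    """Sensitivity, specificity, PPV, and NPV at a decision threshold."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(proba) >= threshold).astype(int)
    tp = int(np.sum((pred == 1) & (y_true == 1)))
    fp = int(np.sum((pred == 1) & (y_true == 0)))
    fn = int(np.sum((pred == 0) & (y_true == 1)))
    tn = int(np.sum((pred == 0) & (y_true == 0)))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }


def highest_threshold_for_sensitivity(y_true, proba, target):
    """Largest observed probability usable as threshold with sensitivity >= target."""
    for t in sorted(set(proba), reverse=True):
        if threshold_metrics(y_true, proba, t)["sensitivity"] >= target:
            return t
    return None


# Toy validation set (invented): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
proba = [0.9, 0.6, 0.4, 0.2, 0.7, 0.35, 0.3, 0.1, 0.05, 0.15]

default = threshold_metrics(y_true, proba, 0.5)
t_all = highest_threshold_for_sensitivity(y_true, proba, target=1.0)
lowered = threshold_metrics(y_true, proba, t_all)
print(default)
print(t_all, lowered)
```

In this toy case, lowering the threshold from 0.5 to 0.2 raises sensitivity from 0.5 to 1.0 at the cost of specificity, which is the same trade-off the study accepts in order to rule out low-risk patients.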

2.12. Explainability

In order to deduce the most important set of features for the prediction and how they influence the prediction of a class, the SHAP package, which is based on cooperative game theory, was used. The resulting plot highlights the most relevant features, where each dot corresponds to a patient. The x-axis position indicates the magnitude and direction of the feature’s impact on the predicted outcome, while the color gradient reflects the feature value, with blue representing low values and red representing high values.
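To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by brute force for a toy risk score with three binary features. It is not the SHAP package and not the study's model: the feature names, coefficients, and the all-zeros baseline are invented for illustration, and SHAP itself relies on more efficient algorithms and on background data rather than a single baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function value(frozenset)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of coalitions of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk score (coefficients invented for illustration).
NAMES = ["nephrostomy", "balloon_dilation", "negative_culture"]
PATIENT = {"nephrostomy": 1, "balloon_dilation": 1, "negative_culture": 1}


def risk(x):
    return (0.10 + 0.30 * x["nephrostomy"] + 0.20 * x["balloon_dilation"]
            - 0.15 * x["negative_culture"]
            + 0.10 * x["nephrostomy"] * x["balloon_dilation"])


def value(coalition):
    """Risk when features in the coalition take the patient's values, others 0."""
    x = {name: (PATIENT[name] if k in coalition else 0)
         for k, name in enumerate(NAMES)}
    return risk(x)


phi = shapley_values(value, len(NAMES))
for name, contribution in zip(NAMES, phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's risk and the baseline risk, and the interaction term is shared equally between the two features involved. A positive value pushes the prediction toward the positive class, which is how the dots to the right of zero in a SHAP summary plot are read.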

3. Results

3.1. Exploratory Data Analysis Results

The gender distribution of patients is balanced: 51.9% females and 48.1% males. The overall mean age is 54.5 ± 15.4 years, and the average Body Mass Index is 27.6 ± 5.8 kg/m², which corresponds to slightly overweight individuals. Further results of the exploratory data analysis for numerical and categorical variables can be found in Table 1 and Table 2, respectively. The p-values relating each feature to the prediction labels were assessed at a 95% confidence level (significance level of 0.05).

3.2. Hospitalization Need Predictive Model

Table 3 presents a comparison of the performance of the models with different architectures. The Random Forest model achieved the highest AUC on the test set (0.736) and a comparable AUC on the validation set (0.726), and is therefore considered the most suitable approach. Extreme Gradient Boosting reached a slightly higher validation AUC (0.734), but its AUC dropped to 0.657 on the test set. The ROC curves of the Random Forest model, together with their corresponding AUC values for the three dataset partitions, are shown in Figure 2.
To increase sensitivity, the threshold of the Random Forest model was lowered from 0.5 to 0.29; the resulting confusion matrices and updated metrics can be observed in Figure 3 and Table 4, respectively. With this adjustment, a sensitivity of at least 90% was achieved. On average, 10 patients not requiring hospitalization could be correctly identified, potentially reducing the use of hospital resources. However, even with the lower threshold, some patients were still misclassified as not having complications: one in the validation set and two in the test set.

3.3. Hemorrhagic Complications Predictive Model

Table 5 presents a comparison of the performance of models for detecting patients with hemorrhagic complications. Since none of these models achieved an AUC of at least 0.70 in both the validation and test sets, it was concluded that, with the currently available data and the models selected for this study, the objective of developing a model for predicting hemorrhagic complications cannot be accomplished.

3.4. Infectious Complications Predictive Model

A comparison of the performance of different models for predicting infectious complications is presented in Table 6. It can be noted that, as in case of the hospitalization need model, the most suitable architecture is Random Forest, with 0.799 and 0.735 AUC values for validation and test sets, respectively. The ROC curves of the Random Forest model, together with their corresponding AUC values for the three dataset partitions, are shown in Figure 4.
Once the best model was identified, the optimal decision threshold was adjusted. In this case, it was lowered from the default value of 0.5 to 0.15 in order to prioritize the correct identification of patients without infectious complications. The results can be observed in Figure 5 and Table 7. Similarly to the hospitalization need prediction model, lowering the threshold significantly reduced the number of patients incorrectly predicted as not having complications when, in fact, they did, which is highly critical in a clinical context. However, some false negatives still persist.

3.5. Explainability of the Models

To interpret the internal workings of the predictive models for the detection of hospitalization need and infectious complications, the SHAP (SHapley Additive exPlanations) library in Python was used. The results are shown in Figure 6.
In the model for predicting the need for hospitalization (Figure 6a), the following characteristics were identified as indicators of hospitalization: nephrostomy drainage; %neutrophils ≥ 80; Guy’s stone score grade 4; medium or difficult localization of the stone; intervention on the right side; leukocytes level ≥ 15 or ≤ 4.5; balloon dilation; a staghorn-shaped stone; and the use of a holmium laser as the fragmentation source. Protective factors include tubeless intervention, easy localization of the stone, and negative culture and microorganism results. Additionally, two numerical variables showed a directly proportional relationship with the label: the largest diameter of the stone and BMI.
In case of prediction of infectious complications (Figure 6b), indicators of presence of complications of infectious origin are as follows: procalcitonin > 0.5, nephrostomy drainage, %neutrophils ≥ 80, Guy’s stone score grade 4, ASA of grade 5, positive culture result, leukocytes ≥ 15 or ≤4.5, and the balloon dilation method. The protective factors were the same as in the hospitalization model, with the addition of ureteral catheterization. In this case, a direct proportional relationship with the outcome was observed only for the variable representing the largest diameter of the stone.

4. Discussion

Among the different model architectures tested, the Random Forest algorithm consistently demonstrated superior performance, achieving the highest AUC values for both the prediction of hospitalization need (0.726 and 0.736 in the validation and test sets, respectively) and infectious complications (0.799 and 0.735). These findings suggest that Random Forest is an effective approach for modeling clinical outcomes for PCNL intervention. In contrast, the models for detection of hemorrhagic complications did not reach acceptable predictive performance. This could be due to insufficient data availability or due to the need for a more sophisticated model. Thus, data enrichment is necessary through recruitment of additional centers, both in terms of patient numbers and features specifically related to hemorrhagic complications. Furthermore, ensemble modeling methods could be explored to better capture complex patterns. Notably, Yang et al. [9], who compared different models for the prediction of transfusion need, one of the most critical hemorrhagic complications, also obtained the highest AUC with Random Forest, thus supporting its relevance in this context.
Moreover, Geraghty et al. [8] carried out the largest study in the field of PCNL with a total of 12,810 patients. In their work, the model developed for predicting postoperative infection achieved AUC values ranging from 0.78 to 0.86, while the model for predicting postoperative complications—comparable to our hospitalization need model—achieved AUC values between 0.81 and 0.89. This indicates that recruiting a higher number of patients into the study may contribute to improved model performance.
Additionally, the decision threshold was adjusted to improve sensitivity, as false negatives in this context could have serious clinical consequences. This led to the development of an elimination-based model to discard patients who are unlikely to have a complication.
Among the strengths of our study are the evaluation of several machine learning models, the use of separate validation and test datasets, and application of interpretable AI methods such as SHAP to understand the contribution of clinical features.
However, there are several limitations. The dataset was gathered from a single center, which may limit the generalizability of the findings. Future research should focus on multi-center validation to ensure that the predictive models perform consistently across different populations and clinical settings. Additionally, the dataset included a relatively small number of samples for rare outcomes, and class imbalance was also a challenge.
Therefore, future research should focus on expanding the dataset across multiple institutions, incorporating a wider range of predictive variables. Moreover, if a sufficiently large sample size can be obtained, the development of complication-specific predictive models should be considered, as each type of complication may have distinct pathophysiological and procedural risk factors.

5. Conclusions

This study highlights the potential of machine learning, particularly Random Forest, in predicting key complications following PCNL. The models demonstrated good performance in identifying patients at risk of hospitalization or infection. These findings suggest that AI-based tools could support clinical decision-making and resource optimization in the PCNL field.

Author Contributions

Conceptualization, L.S., P.B.M. and M.R.F.; methodology, L.S., M.R.F. and J.D.-C.; software, L.S.; validation, L.S.; formal analysis, L.S.; investigation, P.B.M. and A.B.; resources, P.B.M. and D.V.-C.; data curation, L.S.; writing—original draft preparation, L.S.; writing—review and editing, L.S. and D.V.-C.; visualization, L.S.; supervision, D.V.-C. and A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of La Fe University and Polytechnic Hospital (protocol code: NLP-Ing, registration number: 2025-0540-1) on 27 May 2025.

Data Availability Statement

Access to the database may be requested from the Department of Lithotripsy of La Fe University and Polytechnic Hospital, subject to institutional approval and compliance with applicable ethical guidelines. The Python codes employed are available upon request to the authors.

Acknowledgments

The authors would like to thank John Wright for his assistance in improving the language and readability of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PCNL: Percutaneous Nephrolithotomy
ESWL: Extracorporeal Shock Wave Lithotripsy
RIRS: Retrograde Intrarenal Surgery
AUA: American Urological Association
AI: Artificial Intelligence
ML: Machine Learning
BMI: Body Mass Index
ASA: American Society of Anesthesiologists
HU: Hounsfield Units
EDA: Exploratory Data Analysis
AUC: Area Under the Curve
PPV: Positive Predictive Value
NPV: Negative Predictive Value

References

  1. Liangos, O.; Jaber, B.L. Kidney Stones. In Humana Press eBooks; Humana Press: Totowa, NJ, USA, 2008; pp. 513–527.
  2. Sánchez-Martín, F.; Rodríguez, F.M.; Fernández, S.E.; Tomás, J.S.; Barón, F.R.; Martínez-Rodríguez, R.; Mavrich, H.V. Incidencia y prevalencia de la urolitiasis en España: Revisión de los datos originales disponibles hasta la actualidad. Actas Urológicas Españolas 2007, 31, 511–520.
  3. Del Pilar Alcoba García, M.; Serrano, G.B.; Jiménez, J.T.; López, R.G.; Ruíz, L.C.; Enguita, C.G. Is extracorporeal shock wave lithotripsy a treatment option for renal colic? Arch. Españoles De Urol. 2023, 76, 175.
  4. Zong, D.; Shao, P. Flexible Ureteroscopic Holmium Laser Lithotripsy vs Percutaneous Nephrolithotomy for Renal Stones. Int. J. Pharmacol. 2024, 20, 593–601.
  5. Ghani, K.R.; Andonian, S.; Bultitude, M.; Desai, M.; Giusti, G.; Okhunov, Z.; Preminger, G.M.; De La Rosette, J. Percutaneous nephrolithotomy: Update, trends, and future directions. Eur. Urol. 2016, 70, 382–396.
  6. Baltar, C.F.; Corral, M.E.M.; Fentes, D.P. Predicting and Avoiding complications in percutaneous nephrolithotomy in the Era of Personalized Medicine: A scoping review. J. Pers. Med. 2024, 14, 962.
  7. Ghaderzadeh, M.; Shalchian, A.; Irajian, G.; Sadeghsalehi, H.; Bialvaei, A.Z.; Sabet, B. Artificial Intelligence in Drug Discovery and Development Against Antimicrobial Resistance: A Narrative review. Iran. J. Med. Microbiol. 2024, 18, 135–147.
  8. Geraghty, R.M.; Thakur, A.; Howles, S.; Finch, W.; Fowler, S.; Rogers, A.; Sriprasad, S.; Smith, D.; Dickinson, A.; Gall, Z.; et al. Use of Temporally Validated Machine Learning Models To Predict Outcomes of Percutaneous Nephrolithotomy Using Data from the British Association of Urological Surgeons Percutaneous Nephrolithotomy Audit. Eur. Urol. Focus 2024, 10, 290–297.
  9. Yang, Y.; Cao, Z.; Wang, W.; Yang, C.; Wang, K.; Qiu, X. Application of Machine-learning models in Predicting Transfusion Among Complex Renal Stones Patients Receiving Percutaneous Nephrolithotomy: A Retrospective Study. Res. Sq. 2022.
Figure 1. Data collection flow.
Figure 2. ROC curves for Random Forest model predicting hospitalization.
Figure 3. Hospitalization need confusion matrices after applying new threshold over: (a) validation set; (b) test set.
Figure 4. ROC curves for Random Forest model predicting infectious complications.
Figure 5. Infectious complications confusion matrices after applying new threshold over: (a) validation set; (b) test set.
Figure 6. Most important features ranked by SHAP values for (a) hospitalization need predictive model; (b) infectious complications predictive model. Each dot represents a patient. The position on the x-axis (SHAP value) indicates the impact on the predicted outcome (negative values = decreased risk of complication, positive values = increased risk). The color scale reflects the feature value, with blue representing low values and red representing high values. For binary features, only red and blue points are shown (red = 1, blue = 0). Features are ranked by their overall importance in the model.
Table 1. Exploratory data analysis for numerical variables.
| Variable | Mean ± SD | p-Value (Hospitalization Need) | p-Value (Hemorrhagic Complications) | p-Value (Infectious Complications) |
|---|---|---|---|---|
| BMI | 27.6 ± 5.8 | 0.218 | 0.374 | 0.119 |
| Largest diameter | 25 ± 13.3 | <0.001 | 0.131 | <0.001 |
| Smallest diameter | 15.2 ± 9.3 | 0.136 | 0.603 | 0.027 |
| HU | 1010.7 ± 347 | 0.448 | 0.299 | 0.009 |
| Distance | 28.4 ± 36.9 | 0.073 | 0.329 | 0.398 |
| Sheath caliber | 25.3 ± 3.8 | 0.03 | 0.065 | 0.080 |
| Duration | 107.9 ± 38.9 | 0.055 | 0.001 | 0.721 |
| Age | 54.5 ± 15.4 | 0.824 | 0.882 | 0.816 |
Table 2. Exploratory data analysis for categorical variables.
| Variable | Frequency per Possible Value | p-Value (Hospitalization Need) | p-Value (Hemorrhagic Complications) | p-Value (Infectious Complications) |
|---|---|---|---|---|
| Gender | Male: 48.1%; Female: 51.9% | 0.040 | 0.127 | <0.001 |
| Diabetes | No: 84.3%; Type 1: 3.9%; Type 2: 11.8% | 0.191 | 0.075 | 0.955 |
| ASA | ASA I: 22.8%; ASA II: 60.7%; ASA III: 14.5%; ASA IV: 2% | 0.041 | 0.172 | <0.001 |
| Culture result | Negative: 65.3%; Positive: 33.7%; Contaminated: 1.1% | <0.001 | 0.478 | <0.001 |
| Microorganism | Negative: 66.1%; Urease negative: 20.4%; Urease positive: 13.4% | <0.001 | 0.427 | <0.001 |
| Treatment type | Primary: 77.2%; Post-ESWL: 8.4%; Post-URS: 5.8%; Post-NLP: 8.7% | 0.32 | 0.334 | 0.371 |
| Side | Right: 47.9%; Left: 51.4%; Bilateral: 0.4%; Transplant: 0.2% | 0.221 | 0.547 | 0.169 |
| Guy score | I: 24.5%; II: 31.9%; III: 25.9%; IV: 17.7% | <0.001 | 0.086 | <0.001 |
| Position | Prone: 1.2%; Supine: 98.8% | 0.864 | 0.653 | 0.706 |
| Access | Fluoroscopy: 2.7%; Ultrasound: 1%; Both: 96.3% | 0.033 | 0.858 | 0.009 |
| Catheterization | No: 11.3%; Yes: 88.7% | 0.855 | 0.257 | 0.269 |
| Contrast | No: 12.4%; Yes: 87.6% | 0.53 | 0.388 | 0.945 |
| Dilatation method | Amplatz: 7.7%; Dilation balloon: 44.6%; Both: 44.4%; Metallic: 3.3% | 0.279 | 0.120 | 0.766 |
| Multi-trajectory | No: 93.9%; Yes: 6.1% | 0.867 | 0.339 | 0.583 |
| Localization ease | Easy: 43.4%; Medium: 41.9%; Difficult: 14.7% | <0.001 | 0.007 | 0.244 |
| Tubeless | No: 60.9%; Yes: 39.1% | <0.001 | <0.001 | 0.001 |
| Procalcitonin | Normal range: 88.8%; Out of normal range: 11.2% | <0.001 | <0.001 | <0.001 |
| Leukocytes | Normal range: 85.3%; Out of normal range: 14.7% | <0.001 | 0.081 | <0.001 |
| %Neutrophils | Normal range: 51.2%; Out of normal range: 48.8% | <0.001 | 0.018 | <0.001 |
| Location | Pelvis: 26.2%; Superior calyx: 3%; Middle calyx: 0.6%; Inferior calyx: 9.4%; Pseudocoraliform: 11.5%; Coraliform: 26.3%; Renal and ureteral: 2.9%; Calyceal diverticulum: 0.9%; Pelvis + upper calyceal group: 12%; Multiple calyces: 3.6%; Proximal ureter: 0.9%; Distal ureter: 0.7%; Pelvis + lower calyceal group: 2.1% | 0.063 | 0.549 | <0.001 |
| Fragmentation source | Lithoclast: 59.5%; Ultrasound: 0.3%; Lithoclast + ultrasound: 0.6%; Holmium laser: 16.8%; Basket: 2.8%; Forceps: 4.9%; Lithoclast + laser: 14.5%; Irrigation: 0.5% | <0.001 | 0.016 | 0.003 |
| Drains | No: 0.9%; Nephrostomy: 17.8%; Double-J stent: 36.9%; Double-J stent + nephrostomy: 42.6%; 24 h Ureteral catheter: 1%; 24 h Ureteral catheter + nephrostomy: 0.9% | <0.001 | <0.001 | 0.002 |
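Table 3. Summary of evaluation metrics of the models for hospitalization need prediction.
| Model Type | Subset | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Logistic Regression | Train | 0.786 | 0.723 | 0.694 | 0.733 |
| Logistic Regression | Validation | 0.684 | 0.656 | 0.5 | 0.721 |
| Logistic Regression | Test | 0.682 | 0.718 | 0.619 | 0.75 |
| Decision Tree | Train | 0.718 | 0.797 | 0.306 | 0.956 |
| Decision Tree | Validation | 0.675 | 0.75 | 0.286 | 0.941 |
| Decision Tree | Test | 0.593 | 0.729 | 0.19 | 0.906 |
| Random Forest | Train | 0.781 | 0.710 | 0.647 | 0.731 |
| Random Forest | Validation | 0.726 | 0.708 | 0.607 | 0.75 |
| Random Forest | Test | 0.736 | 0.741 | 0.619 | 0.781 |
| Extreme Gradient Boosting | Train | 0.771 | 0.788 | 0.218 | 0.973 |
| Extreme Gradient Boosting | Validation | 0.734 | 0.708 | 0.071 | 0.971 |
| Extreme Gradient Boosting | Test | 0.657 | 0.753 | 0.095 | 0.969 |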
Table 4. Updated performance of Random Forest model predicting hospitalization after moving the threshold.
| Metric | Validation Set | Test Set |
|---|---|---|
| Sensitivity | 0.964 | 0.905 |
| Specificity | 0.162 | 0.141 |
| PPV | 0.321 | 0.257 |
| NPV | 0.917 | 0.818 |
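Table 5. Summary of evaluation metrics of the models for hemorrhagic complications prediction.
| Model Type | Subset | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Logistic Regression | Train | 0.797 | 0.862 | 0.179 | 0.956 |
| Logistic Regression | Validation | 0.705 | 0.857 | 0.1 | 0.959 |
| Logistic Regression | Test | 0.644 | 0.859 | 0.091 | 0.973 |
| Decision Tree | Train | 0.812 | 0.877 | 0.262 | 0.961 |
| Decision Tree | Validation | 0.726 | 0.833 | 0.2 | 0.919 |
| Decision Tree | Test | 0.542 | 0.859 | 0.091 | 0.973 |
| Random Forest | Train | 0.866 | 0.888 | 0.107 | 0.995 |
| Random Forest | Validation | 0.716 | 0.893 | 0.1 | 1.00 |
| Random Forest | Test | 0.633 | 0.882 | 0.091 | 1.00 |
| Extreme Gradient Boosting | Train | 0.972 | 0.912 | 0.94 | 0.909 |
| Extreme Gradient Boosting | Validation | 0.708 | 0.774 | 0.3 | 0.838 |
| Extreme Gradient Boosting | Test | 0.671 | 0.824 | 0.273 | 0.905 |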
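Table 6. Summary of evaluation metrics of the models for infectious complications prediction.
| Model Type | Subset | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Logistic Regression | Train | 0.778 | 0.856 | 0.00 | 1.00 |
| Logistic Regression | Validation | 0.684 | 0.854 | 0.00 | 1.00 |
| Logistic Regression | Test | 0.763 | 0.855 | 0.00 | 1.00 |
| Decision Tree | Train | 0.899 | 0.855 | 0.646 | 0.89 |
| Decision Tree | Validation | 0.727 | 0.833 | 0.5 | 0.89 |
| Decision Tree | Test | 0.656 | 0.795 | 0.25 | 0.887 |
| Random Forest | Train | 0.91 | 0.887 | 0.646 | 0.927 |
| Random Forest | Validation | 0.799 | 0.885 | 0.286 | 0.988 |
| Random Forest | Test | 0.735 | 0.843 | 0.333 | 0.93 |
| Extreme Gradient Boosting | Train | 0.976 | 0.917 | 0.939 | 0.913 |
| Extreme Gradient Boosting | Validation | 0.716 | 0.885 | 0.571 | 0.939 |
| Extreme Gradient Boosting | Test | 0.658 | 0.795 | 0.417 | 0.859 |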
Table 7. Updated performance of Random Forest model predicting infectious complications after moving the threshold.
| Metric | Validation Set | Test Set |
|---|---|---|
| Sensitivity | 0.927 | 1.00 |
| Specificity | 0.122 | 0.07 |
| PPV | 0.153 | 0.154 |
| NPV | 0.909 | 1.00 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
