Predicting Safe Liver Resection Volume for Major Hepatectomy Using Artificial Intelligence

(1) Background: Advancements in the field of liver surgery have led to a critical need for precise estimations of preoperative liver function to prevent post-hepatectomy liver failure (PHLF), a significant cause of morbidity and mortality. This study introduces a novel application of artificial intelligence (AI) in determining safe resection volumes according to a patient’s liver function in major hepatectomies. (2) Methods: We incorporated a deep learning approach, incorporating a unique liver-specific loss function, to analyze patient characteristics, laboratory data, and liver volumetry from computed tomography scans of 52 patients. Our approach was evaluated against existing machine and deep learning techniques. (3) Results: Our approach achieved 68.8% accuracy in predicting safe resection volumes, demonstrating superior performance over traditional models. Furthermore, it significantly reduced the mean absolute error in under-predicted volumes to 23.72, indicating a more precise estimation of safe resection limits. These findings highlight the potential of integrating AI into surgical planning for liver resections. (4) Conclusion: By providing more accurate predictions of safe resection volumes, our method aims to minimize the risk of PHLF, thereby improving clinical outcomes for patients undergoing hepatectomy.


Introduction
In recent years, artificial intelligence (AI) has been increasingly applied in liver disease for diagnostic and predictive purposes.Byrne et al. [1] discussed the role of AI in diagnosing liver diseases, including fatty liver disease and hepatocellular carcinoma.The use of deep learning methods for surgical planning in liver resection has also been investigated [2].Machine learning algorithms, including decision trees, have been employed to predict postoperative liver failure [3], and Gruttadauria et al. [4] used decision trees in the allocation of liver transplants.
In liver surgery including hepatocellular carcinoma, Klatskin tumor, or hepatic metastasis, large-volume hepatectomy may be a preferred intervention for curative treatment.However, it is still a difficult problem to evaluate and predict post-hepatectomy liver failure (PHLF) because of the different liver functions in each patient.The success of this procedure is heavily dependent on the surgeon's ability to determine the appropriate liver resection volume.Failure in the accurate estimation of this safe resection volume would lead to PHLF, which accounts for a substantial proportion of mortality after the hepatectomy procedure [5,6].Achieving accurate predictions can significantly impact clinical decision-making, leading to improved patient outcomes and mortality rates.
The integration of AI in medicine is revolutionizing patient care across various domains.In diagnostics, K. Lång et al.'s study on AI-supported screen reading in mammography highlights the technology's efficacy in enhancing cancer detection [7].F. Sanfilippo et al. explored AI's application in patient monitoring, specifically in assessing inferior vena cava distensibility in mechanically ventilated patients, showcasing AI's potential use in critical care [8].Furthermore, Fadl H Veerankutty et al. discussed the emerging role of AI in hepatology, liver surgery, and transplantation, indicating its expanding influence in surgical procedures and research [9].These studies collectively underscore AI's growing significance in enhancing various aspects of patient care, from early detection to surgical interventions.
Various methods have been explored to predict PHLF, yet the results have led to a lack of consensus in the field [2,3,10,11].Recent advancements in predicting PHLF have revealed a variety of models emphasizing different aspects of liver function and resection.For instance, the model by Naruhiko Honmyo et al. [12] demonstrates the significance of liver function and the extent of liver resection in PHLF prediction.They suggested calculating formulae, such as VIPP (volume-associated ICG-PLT-PT) scores, for severe PHLF.Morandi et al. [13] and Tomassini et al. [14] discussed dynamic models of liver regeneration post-resection, highlighting the variability in outcomes based on individual differences in regenerative capacity and stressing the importance of factors such as metabolic load and cell death post partial hepatectomy.These studies collectively suggest the complexity of PHLF prediction, and underscore the need for interdisciplinary approaches that integrate various biological and computational insights.Therefore, the development of robust and reliable models is crucial for optimizing major hepatectomy procedures and for minimizing the risk of postoperative complications, particularly PHLF.Accurate liver volume estimations are indispensable in surgical planning and determining the applicability of the lesion applied in safeguarding the functional capacity of the remaining liver, and enhancing patient safety [15].Addressing this multifaceted challenge necessitates interdisciplinary collaboration between medical professionals and AI researchers [16].By harnessing AI and incorporating comprehensive clinical data, including laboratory biomarkers and volumetric measurements, it is possible to guide surgeons in making informed decisions that improve patient outcomes [2,10,11].However, the AI studies on liver surgery so far have mainly focused on the prediction of the possibility of PHLF.Therefore, the estimation of a safe liver resection volume is still a significant challenge in liver surgery.In this context, the present study aims to address the pressing concern of PHLF by proposing a novel approach utilizing AI techniques to predict a safe liver resection volume in patients undergoing right hemi-hepatectomy.

Methods
This study was performed in accordance with the ethical guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of Kosin University Gospel Hospital (No. KUGH 2023-09-010).Between January 2017 and August 2022, patients aged >19 years old and who had received right hemi-hepatectomy or right extended hemi-hepatectomy were retrospectively reviewed in a single tertiary medical center.We collected demographics and preoperative laboratory data, and computed tomographic (CT) volumetry calculating the right and left liver volumes.We excluded those who had received repeat liver resection or who did not have pre-operative CT or MRI images.In total, 135 liver surgeries were performed in our hospital from January 2017 to August 2022.Among them, a right hemi-hepatectomy was performed in 55 patients, but 3 patients were excluded due to the completion of the right hemi-hepatectomy after the first liver resection.Finally, 52 patients were analyzed in this study.
According to previous literature, the model for end-stage liver disease (MELD) [19] score was calculated as The albumin-bilirubin (ALBI) score [20] was calculated as ALBI score = (log 10 bilirubin × 0.66) + (albumin × −0.085) The aminotransferase-to-Platelet Ratio Index (APRI) score is a non-invasive clinical tool used to assess the degree of liver fibrosis, particularly in the context of chronic liver diseases such as hepatitis C. The APRI score is calculated using the formula: In this formula, AST refers to the level of aspartate aminotransferase in the patient's blood sample, and reference AST is the upper limit of the normal range for aspartate aminotransferase.The platelet count is measured in billions per liter (10 9 /L).The APRI score has been widely validated as a predictor of liver fibrosis, offering a cost-effective and readily accessible alternative to liver biopsy [21].Its simplicity and non-invasiveness make it a valuable tool in routine clinical practice for assessing liver health, especially in patients with chronic hepatitis C infection [22].Additionally, APRI scores have been used in various clinical settings to evaluate liver damage and predict outcomes in liver disease [23].
The reference value for AST was 40 IU, which is the upper normal limit in our laboratory [24].

Definition of PHLF
The 50-50 criteria for PHLF were used in this study [25].In PHLF, the "50-50 criteria" are defined as the concomitant presence of PT < 50% and SB > 50 µmol/L on post-operative day 5.We excluded the illness of increased total bilirubin due to biliary obstruction.We included mortalities caused by PHLF during the immediate postoperative period.In the predictive modeling of PHLF, it is critical to consider a range of factors beyond liver resection volume.This holistic approach is supported by recent literature, which emphasizes the multifactorial nature of PHLF risk.The etiology of liver disease, for example, is a significant factor.Diseases such as cirrhosis, hepatitis, and hepatocellular carcinoma each uniquely influence liver function and regeneration, impacting PHLF risk.A study by Sheta et al. [26] demonstrated that the presence of cirrhosis, irrespective of its cause, is associated with increased PHLF risk, highlighting the importance of considering liver disease etiology in PHLF prediction models.Moreover, the MELD score, a wellestablished prognostic tool for chronic liver disease, has also been identified as a crucial predictor of PHLF.As Chen et al. [27] found, incorporating MELD scores into preoperative evaluations can significantly enhance the prediction of surgical outcomes, including PHLF.Additionally, the potential role of biomarkers in predicting PHLF is gaining attention.Biomarkers such as serum bilirubin, albumin, INR, and platelet count, integral to the MELD score, have long been used in liver function assessment.Recent studies, including those by Prodeau et al. [28], have explored the use of emerging biomarkers such as indocyanine green clearance, hyaluronic acid, and PT activity in predicting PHLF.Our study acknowledges these multifactorial influences and proposes an extension of our predictive model to include these factors.By integrating etiological data, MELD scores, and relevant biomarkers into our model, we aim to provide a more comprehensive and clinically relevant tool for assessing PHLF risk.This approach aligns with current research trends and emphasizes the need for multifaceted risk stratification in liver resection procedures, thereby enhancing the precision and personalization of surgical planning.

CT Volumetry Measurement
The most recent CT or MRI images derived prior to the operation were used for the CT volumetry.The margin of the left and right hepatic lobes was traced manually along Cantlie's line, which divides the functional left and right hemi-livers [17], with consideration of the resection margins lateral to the middle hepatic vein and anatomical variations as depicted in Figure 1.The total volume of each hemi-liver was calculated by the addition of the area of hepatic parenchyma from each slice multiplied by the slice thickness.When we obtained the volumetric data of the left and right hemi-livers, we included the middle hepatic vein to the remnant left liver for right hemi-hepatectomy, and excluded the middle hepatic vein for extended right hemi-hepatectomy (Figure 2).

Statistical Analysis
Continuous variables are here presented as median and range.These variables were compared using the Mann-Whitney U test.Categorical variables are presented as numbers.The chi-squared test or Fisher's exact test were used to analyze categorical variables.Statistical significance was set at p < 0.05.All statistical analyses were performed using SPSS 25.0 for Windows (IBM, Armonk, NY, USA).In our study, we employed an AI method tailored for predictive analysis in the medical domain.The method, along with the tailored loss function, is architecturally grounded on the concept of residual networks [29], which are renowned for their ability to facilitate deeper learning without the hindrance of vanishing or exploding gradients.The input layer of our model is designed to accommodate 23 distinct laboratory data parameters derived from patients.The parameters are elaborated in Table 1.Parameters are considered with the exception of diagnosis.This input information is propagated through a cascade of processing blocks and operations to yield a singular floating-point value, representing the liver resection area.Specifically, the data flow comprises a sequence of transformations: starting with a ReLU [30] activation function, it advances through the first residual block (Res1) 3, then undergoes expansion (Expand1), followed by another ReLU activation.Subsequent blocks and operations include Res2, a shrink operation (Shrink3), ReLU, Res1, Shrink4, a base residual block (Res0) and Shrink5, culminating in the final output layer as shown in Figure 3.This sophisticated architecture harnesses the power of deep learning to offer predictions on liver resection areas based on the provided laboratory data.Our method begins with an initial layer that processes the input data through a linear transformation followed by ReLU activation, resizing the feature dimensions to a predefined hidden size.Subsequently, the data are passed through multiple instances of the Residual Block with various feature dimensions.To bolster the network's representational capacity, expansion layers have been integrated, which upscale the feature dimensions through linear transformations accompanied by ReLU activations.Once the data have been enriched in this expanded state, a sequence of shrinkage layers systematically reduces the feature dimensions.The information, having traversed these expansive and reductive stages, is then transformed by the final layers, consisting of a series of linear transformations and ReLU activations, to produce the end output.One of the reasons for the application of this architecture is its depth and intricacy, enabling the model to adeptly capture complex patterns and interactions among the 23 medical features (Figure 4).Furthermore, the residual connections ensure robustness against the vanishing gradient problem, making the network effective even with increased depth.The expansion layers in the architecture play a pivotal role in amplifying the feature space, granting the network a richer medium to discern and process intricate patterns.This depth, combined with the flexibility offered by various scales of residual blocks and multiple expansion and shrinkage layers, equips the network to adjust seamlessly to diverse complexities in the data.In essence, our approach embodies a blend of depth and architectural ingenuity, positioning it as an optimal choice for predicting resection volumes from a multifaceted set of medical features.

Loss Design
In our study, we have introduced a specialized loss function, "weighted liver loss", to address the unique challenges inherent in predicting liver resection volumes.This function is not merely a theoretical construct; it is deeply rooted in the practical realities and clinical exigencies of liver surgery.Here, we elaborate on the rationale behind this choice and its specific applicability to our problem.In the context of liver resections, the stakes of prediction accuracy are high.A slight underestimation of the resection volume can lead to the insufficient removal of pathological tissue, potentially leaving behind diseased areas.On the other hand, an overestimation, while not ideal, provides a safer margin as it errs on the side of removing more tissue, thus ensuring the complete excision of the diseased part.This clinical backdrop demands a predictive model that is attuned to the asymmetric consequences of prediction errors.Our "weighted liver loss" function is designed with this asymmetry in mind.It incorporates distinct penalties for under-predictions and overpredictions, reflecting the differential impact these errors have in a surgical setting.The higher penalty for under-predictions is a deliberate choice intended to align the model's predictions with clinical priorities-ensuring the complete removal of pathological tissue while safeguarding against the removal of excessive healthy liver tissue.
Given the predicted values y pred and the true values y true , the custom loss function "weighted liver loss" is defined as: Here: • N is the total number of samples; • SEi (squared error) is calculated as • wi is the weight, which is determined based on whether the prediction is an underprediction or over-prediction, as The "weighted liver loss" function is particularly designed to penalize under-predictions more heavily than over-predictions, with an additional logarithmic component to penalize large over-predictions.The parameters under_penalty and over_penalty allow for the customization of the degree of penalty for under-predictions and over-predictions, respectively.
In liver resection procedures, the accurate estimation of resection volume is critical for both the effectiveness of the surgery and the safety of the patient [31].In this work, we introduce a specialized loss function, namely, "weighted liver loss", which has been developed to suit this clinical scenario.This loss function sets itself apart from traditional loss functions such as Mean Squared Error (MSE), Mean Absolute Error (MAE), and Huber Loss commonly used in machine learning applications [32,33].
The essence of the "weighted liver loss" lies in its nuanced approach to prediction errors.Traditional loss functions typically treat every deviation-whether an overestimation or an underestimation-from the true value uniformly.However, from a clinical perspective, these deviations are not equivalent.Overestimating liver volume provides surgeons with a conservative margin, inherently leading to a safer resection procedure.In contrast, underestimations can compromise safety.Recognizing this disparity, the "weighted liver loss" incorporates a weighting scheme, denoted as w i , that differentially penalizes under-predictions and over-predictions.
Furthermore, this function amalgamates both weighted and unweighted squared errors.While the weighted component ensures that the model is oriented towards making clinically safer errors, the unweighted part ensures that the magnitude of errors remains a pertinent metric in the model's evaluation.Such a design is not only reflective of clinical realities, but also ensures that the model remains grounded in its predictions.
Another salient feature of the "weighted liver loss" is its inherent flexibility.With parameters such as under-penalty and over-penalty, the function can be fine-tuned, ensuring its adaptability to evolving clinical insights or varied surgical scenarios.
In summary, the "weighted liver loss" offers a tailored approach to modeling liver volume predictions, aligning closely with the clinical objectives of liver resection procedures.Prioritizing overestimations underscores the emphasis on patient safety, making it a more suitable choice in this specific medical context than traditional loss functions.

K-Fold Evaluation
In the development of predictive models, particularly in the context of medical applications where datasets are often limited in size, it is imperative to validate model predictions effectively while maximizing the use of available data.To achieve this, we have employed the K-fold cross-validation technique, recognized for its efficacy in enhancing the reliability of validation results and optimizing data usage.This method allows for each data point in our dataset to contribute to both the training and validation processes, thereby minimizing the risk of model bias and overfitting.Our choice of K-fold validation is driven by the need for a rigorous evaluation of the model's performance.We evaluated each model with a K-fold validation technique to enhance the reliability of the validation result, and to efficiently use limited data by allowing each data point to be used for both training and validation.K-fold validation is a resampling procedure commonly used for evaluating machine learning models, particularly useful when dealing with relatively small datasets.In this technique, the dataset of size N is divided into k equal blocks, each containing N/K data points.Figure 5 explains the K-fold validation used for this research.During each iteration, one of these blocks serves as the validation set, while the remaining k − 1 blocks form the training set.Performance metrics, such as P, U, and O, are then calculated for each iteration as discussed in Section 3. Performance metrics are further explained in Figure 5.After completing k iterations, the average values of P, U, and O are computed to provide a comprehensive evaluation of the model.In this work, N is set as 52 and K is set as 5.

Characteristics of Patients
The demographic details of the patient sample are as follows.The sample consists of 42 males (6 in the PHLF group and 36 in the non-PHLF group) and 10 females (all in the non-PHLF group), with a median age of 57 years (range: 20-78 years).The BMI for the patients had a median of 24.4 kg/m 2 (range: 20.6-31.5 kg/m 2 ).In total, 18 patients had hypertension (4 in the PHLF group and 14 in the non-PHLF group); 14 patients had diabetes (4 in the PHLF group and 10 in the non-PHLF group); 12 patients had HBV (4 in the PHLF group and 8 in the non-PHLF group); 3 patients had HCV (all in the non-PHLF group); 6 patients had a history of alcoholism (1 in the PHLF group and 5 in the non-PHLF group).The American Society of Anesthesiologists (ASA) Physical Status Classification was also considered, with 8 patients falling under Class 1, 30 under Class 2 (4 in the PHLF group and 26 in the non-PHLF group), and 14 under Class 3 (2 in the PHLF group and 12 in the non-PHLF group).This classification is pivotal in determining patients' overall health and assessing the anesthesia risk before surgery.Several essential pre-operative lab values such as WBC, neutrophil counts, PLT counts, PT INR, PT percentage, TB levels, AST, ALT, Cr, HS-CRP, APRI, ALBI, and MELD were included.Volumetric data about the liver, such as the right liver volume (Rt.volume), left liver volume (Lt.volume), future liver remnant percentage (FLR %), and remnant liver volume to patient weight ratio (RLWR %), were also noted.We analyzed the risk factors of the PHLF, shown in Table 2. HBV and APRI showed significance in multivariate analysis.
Each model's merit was adjudicated based on its success rate in resection volume predictions and the mean absolute errors for both overestimations and underestimations.These metrics furnish a holistic insight into the models' clinical applicability and their potential to ensure patient safety during liver resections.
In the conducted study, results were compartmentalized into three distinct categories, each aiming to evaluate the reliability and accuracy of the AI model's predictions concerning resection volume.Firstly, the Percentage of Successful Resection Volume (P) was calculated by taking the number of test data points for over-predicted resection volume (O) and dividing it by the total dataset count.The purpose of this metric, P, is to ascertain the level of trust a physician can place in the AI model when determining the resection volume.An under-prediction by the model, wherein the estimated value falls below the actual resection volume, poses significant risks, potentially leading to liver failure in patients due to insufficient liver volume.On the other hand, the Mean Absolute Error of Over-predicted Resection Volume (O) was employed to investigate instances where the model's prediction might excessively exceed the real value.Such overestimations can be detrimental; if the model indicates a large resection area, it could render surgery infeasible because of an inadequate liver volume.Lastly, even in cases where the model did not exhibit overprediction, P was evaluated to ensure its consistency and proximity to the actual resection volume, solidifying its role as a trustworthy metric.
In our analysis, we conducted a comprehensive performance evaluation of various machine learning and deep learning models, comparing them against our proposed model using our patient data set.Key performance metrics included the Percentage of Successful Resection Volume (P), Mean Absolute Error of Under-predicted Resection Volume (U), and Mean Absolute Error of Over-predicted Resection Volume (O).To assess the statistical significance of the observed differences in performance metrics among the models, we performed an ANOVA tests for each metric, followed by post-hoc analyses using Tukey's HSD test to identify specific model pairs with significant differences.The ANOVA results reveal significant variations in performance across different models (P: F(13, 286) = 4.62, p < 0.001; U: F(13, 286) = 5.17, p < 0.001; O: F(13, 286) = 6.03, p < 0.001), suggesting that the choice of model substantially impacts the prediction accuracy of liver resection volume.Our model, particularly with our specialized loss function, demonstrates a significantly higher Percentage of Successful Resection Volume (p = 68.88%)compared to traditional models such as linear regression (p = 48.22%)and Support Vector Regression (SVR) (p = 45.77%).The difference is statistically significant, as indicated by the post-hoc comparisons (p < 0.05).In terms of Mean Absolute Error for under-predictions (U) and over-predictions (O), our model also outperforms most traditional approaches.Notably, our model with our loss function achieved the lowest U value (23.72) and a significantly lower O value (50.70) compared to models such as Polynomial Regression (Degree 3) and Random Forest, further underscoring its efficacy in balanced prediction (U: p < 0.01; O: p < 0.05 compared to Random Forest).These findings suggest that our model, especially when equipped with our loss function, offers a more accurate and reliable method for predicting liver resection volumes, addressing a critical need in preoperative planning for liver surgery.

Discussions 4.1. Comparison of the Proposed Method on Liver Resection Volume Prediction
In the context of liver resection surgery, accurate volume prediction is pivotal to ensure both surgical efficacy and patient safety.Our study encompassed a comprehensive range of machine learning and deep learning models to achieve this objective (Figures 6 and 7).Linear and Polynomial Regressions, although simpler methods, showed suboptimal performance compared to more advanced techniques.This underscores the inherent complexities in predicting liver resection volumes, which may not be adequately captured by linear models or even higher-degree polynomials [34,35].The Support Vector Regression and Decision Tree algorithms demonstrated modest performance.The limitations of these models may be attributed to their inability to capture the complex relationships among predictors or to overfitting the training data, respectively [36,37].Random Forest, an ensemble model, improved the prediction accuracy to some extent, likely benefiting from its robustness against overfitting [38].Interestingly, the Fully Connected Feedforward Neural Network delivered the most accurate predictions among the standard models.This result supports the growing body of evidence that deep learning models are capable of capturing intricate patterns in medical data [39].Notably, our proposed approach, optimized with a specialized loss function tailored for this clinical context, outperformed all other models in the majority of the metrics.With a 68.88%rate of successful resection volume prediction and significantly reduced mean absolute errors in both under-and over-predicted resection volumes, our approach sets a new benchmark in this specific domain.This illustrates that specialized algorithms can provide improved performance over generalized methods in medical applications.

Harnessing AI for Liver Resection Planning
To date, several studies have shown that preoperative liver failure and potential liver disease are potentially significant risk factors for PHLF [42].Total bilirubin, PLT count, PT level, and ICG-R15 are commonly used for the evaluation of liver function, and formulas made from these factors have proven useful.However, they are also limited due to simple calculation formulas and traditional analyses [43].Several recent studies have suggested predictive factors or predictive models of PHLF using AI.Although Wang et al. reported a machine-learning model for PHLF, most patients (89.9%) have mild hepatic resection and are prone to overfitting [10].Xu et al. developed a deep-learning model for predicting PHLF.However, the model's AUC was relatively small, at 0.740 on the right hepatic resection [2].Mai et al. found that the artificial neural network model showed good predictive abilities in both the development set (AUC: 0.880) and the validation set (AUC: 0.876), and achieved the corrective ability to fit well compared to the previous scoring system [11].However, they created models with five risk factors including PLT count, PT, TBil, AST, and future liver remnants, without including the demographics, viral status, or comorbidities of patients.After the AI models predict "yes" or "no" for PHLF, surgeons have only two options: "surgery" or "not".Therefore, there is a limit to receiving practical assistance in actual surgical decisions.However, our AI model suggests a safe hepatic resection volume for each patient.Surgeons can change their surgical plan to fit a safe liver volume.This is thought to be able to provide very intuitive assistance to the operator.The accurate prediction of liver resection volume enhances surgical planning and mitigates the risk of postoperative complications [44].The integration of AI in this sphere exemplifies the transformative potential of machine learning technologies in augmenting medical decision-making [45].As medical AI continues its upward trajectory, this research adds to the growing body of evidence showcasing its capabilities and offers fresh perspectives for data-driven patient care [46].The advent of artificial intelligence may address these challenges, by leveraging machine learning algorithms that can analyze complex data sets and make highly accurate predictions.Integrating AI into the preoperative planning process for major hepatectomy could therefore provide clinicians with more reliable estimates, contributing to safer surgical outcomes and lower rates of postoperative complications.

Limitations
While our approach provides valuable insights and advances in predicting liver resection volume using AI, it is not devoid of limitations.First and foremost, the methodology employed was a validation study grounded in retrospective single-center data.While retrospective analyses are instrumental in clinical research, they are inherently associated with biases due to non-randomization.The limitations of these models may be attributed to their inability to capture complex relationships among predictors or to overfitting the training data, respectively.Random Forest, an ensemble model, improved the prediction accuracy to some extent, likely benefiting from its robustness against overfitting [29].Second, the population was relatively small.Third, a pertinent issue is the potential discrepancy between the actual liver resection volume and the pre-operative CT volumetry.While CT volumetry offers an invaluable estimation of liver volume, discrepancies between these measurements and the actual resected volume could exist.Such differences, if significant, might affect the prediction accuracy of the model, implying a need for meticulous calibration and regular model updates based on newer, real-world data.Third, while our study presents a novel approach for predicting liver resection volume and demonstrates promising results, it is important to acknowledge certain limitations.Primarily, our comparisons, though extensive, did not include some of the latest state-of-the-art models that have emerged in the recent literature, such as the advanced deep learning techniques reported by Lakshmipriya et al. [47] and the specialized regression models discussed by Zeng et al. [48].Future work will aim to benchmark our model against these cutting-edge methods, further validating and enhancing the robustness of our approach in the context of liver surgery planning.Fourth, while this dataset provided valuable insights, particularly in differentiating between PHLF and non-PHLF cases based on various clinical parameters, it is crucial to acknowledge its limitations in terms of size and diversity.The dataset's demographic and clinical characteristics reflect the patient population of a specific medical center, which may not fully represent the global population undergoing liver resection.This geographic and demographic limitation may affect the model's applicability to other patient populations with different ethnicities, healthcare systems, or disease etiologies.Lastly, our dataset had a limited number of PHLF cases, restricting us from creating a separate AI model for each PHLF grade.The granularity that such a model could offer would have been invaluable in tailoring surgical decisions to individual risk profiles.Future research with a more diverse and sizable dataset containing a detailed grading of PHLF cases can build upon our foundational work to address this gap.

Conclusions
As far as we know, there have been no studies that predict a safe liver volume for preventing PHLF with AI.Our study highlights the potential of current machine learning and deep learning techniques in predicting safe liver resection volumes for major hepatectomy.More importantly, it advocates for the development of specialized models optimized for the intricacies of specific clinical tasks to improve surgical planning and patient outcomes.In the future, research on larger population from multiple institutions is likely to be needed to create a more accurate and predictive AI model that presents a safe liver resection volume.In addition, to confirm this, a prospective study is needed to ensure safe liver resection volume prediction through an AI model before hepatectomy.It is expected that the role of medical AI will expand in the direction of helping treatment decisions by predicting these risks and presenting intuitive information.

Figure 1 .
Figure 1.Virtual resection line is located on the right side of Middle hepatic vein (A).Virtual resection line on gall bladder level (B).

Figure 2 .
Figure 2. Virtual resection line is located on the left side of Middle hepatic vein (A).Virtual resection line on portal bifurcation level (B).

Figure 4 .
Figure 4.The proposed network architecture receives 23 laboratory data parameters from patients as input and produces a single output representing the liver resection area.

Figure 5 .
Figure 5. Explanation of K-fold validation used in our evaluation.The dataset of size N is split into k blocks with each block containing N/K data points.In each iteration, performance is measured in 3 different metrics.P, U, and O are explained in the figure.After all iterations, the average values of all P, U, and O are obtained.The rationale for applying the K-fold is to validate models with a relatively low number of data points.

Figure 6 .
Figure 6.Percentage of Successful Resection Volume (P) performance comparison of diverse machine learning and deep learning models against our patient data (ANOVA p < 0.001).

Figure 7 .
Figure 7. Mean Absolute Error of Under-predicted (U) and Over-predicted (O) performance comparison of diverse machine learning and deep learning models against our patient data.Note that the lower the error, the better the performance (ANOVA p < 0.001).

Table 1 .
Characteristics of post-hepatectomy liver failure (PHLF) and non-PHLF patients among 52 cases of right hemi-hepatectomy.

Table 2 .
Factors associated with post-hepatectomy liver failure.Model for End-Stage Liver Disease score; APRI, AST to platelet ratio index.