Article

Ultrasound-Assisted Extraction of Antioxidant Compounds from Pomegranate Peels and Simultaneous Machine Learning Optimization Study

by Martha Mantiniotou 1, Vassilis Athanasiadis 1, Konstantinos G. Liakos 2, Eleni Bozinou 1 and Stavros I. Lalas 1,*
1 Department of Food Science and Nutrition, University of Thessaly, Terma N. Temponera Street, 43100 Karditsa, Greece
2 Department of Electrical and Computer Engineering, University of Thessaly, Sekeri Street, 38334 Volos, Greece
* Author to whom correspondence should be addressed.
Processes 2025, 13(11), 3700; https://doi.org/10.3390/pr13113700
Submission received: 30 September 2025 / Revised: 8 November 2025 / Accepted: 14 November 2025 / Published: 16 November 2025

Abstract

The pomegranate, a widely consumed fruit, produces large quantities of waste, mainly from its peel. Pomegranate peels (PPs) contain high amounts of antioxidant compounds, such as polyphenols, flavonoids, and anthocyanins, which can be isolated and used for the benefit of humans and the environment. In the present work, the recovery of these compounds by ultrasound-assisted extraction (UAE) was studied and the extraction parameters were optimized. The optimal results were a total polyphenol content of 195.55 mg gallic acid equivalents/g, total flavonoid content of 74.78 mg rutin equivalents/g, total anthocyanin content of 992.87 μg cyanidin 3-O-glucoside equivalents/g, and ascorbic acid content of 15.68 mg/g, while the antioxidant activity determined through ferric-reducing antioxidant power and DPPH assays was 2366.89 and 1755.17 μmol ascorbic acid equivalents/g, respectively. In parallel, an artificial intelligence (AI)-based framework was developed to model and predict antioxidant and phytochemical responses from UAE parameters. Six machine learning models were implemented on the experimental dataset, with the Random Forest (RF) regressor consistently achieving the best predictive accuracy. Partial dependence analysis revealed ethanol concentration as the dominant factor influencing outcomes, while ultrasonic power and extraction time exerted comparatively minor effects. Although dataset size limited model generalizability, the RF model reproduced experimental outcomes within experimental variability, underscoring its suitability for predictive extraction optimization. These findings demonstrate the complementary role of machine learning in accelerating antioxidant compound recovery research and its potential to guide future industrial-scale applications of AI-assisted extraction.

1. Introduction

The pomegranate (Punica granatum L.), a small deciduous tree native to the Middle East, is now widely cultivated across many regions worldwide, including the Americas, China, India, and the Mediterranean basin [1]. Beyond its nutritional value, pomegranate exhibits diverse pharmacological properties, such as anti-amoebic, antimalarial, anticoccidial, and anthelmintic effects against certain nematodes, cestodes, and intestinal trematodes [2]. It is consumed fresh or processed into a variety of products, including juice, syrup, jams, and wine [3]. The industrial production of pomegranate juice generates substantial by-products, primarily peel (~78% of the processing residue), which corresponds to nearly half of the fruit’s total weight [4]. Despite this abundance, pomegranate peels (PPs) are often discarded without consideration of their bioactive potential [5,6]. Rich in flavonoids, tannins, and other phenolic compounds with proven antibacterial activity, PPs represent a valuable resource for further exploitation [6].
The valorization of PPs has attracted increasing attention, prompting extensive research into their potential applications. Extracts from PPs have been incorporated into edible packaging materials to extend the shelf life of fresh-cut fruits and vegetables, meat and meat products, seafood, dairy, and baked goods [7]. In addition, PPs have been investigated as low-cost adsorbents for wastewater treatment [8]. Phytochemical analyses have revealed that PPs contain at least 23 polyphenolic constituents, including 11 tannins (e.g., ellagic acid, punicalagin) and 7 phenolic acids (e.g., protocatechuic acid, coumaric acid), along with five classes of flavonoids such as catechin and epicatechin [9]. These compounds confer notable antibacterial activity against a wide spectrum of Gram-positive and Gram-negative bacteria, including Escherichia coli, Listeria monocytogenes, Pseudomonas aeruginosa, Staphylococcus aureus, Salmonella spp., Bacillus spp., Vibrio parahaemolyticus, Clostridia, and Yersinia enterocolitica [10]. Collectively, these findings underscore the importance of further investigating PPs as a sustainable source of bioactive compounds and developing efficient extraction strategies for their recovery.
In recent years, significant attention has been directed towards mitigating environmental impact [11]. Various industries are implementing innovative, sustainable extraction techniques to this end. These strategies achieve higher recoveries in shorter timeframes while minimizing the consumption of solvents, which are often harmful [12]. Ultrasound-assisted extraction (UAE) has been extensively reviewed as a versatile green technology with broad applications in food and natural product recovery [13]. Among these green techniques, ultrasonication stands out, as it generates negative pressure in the liquid medium, leading to acoustic cavitation, manifested as bubble formation when dissolved gases cannot remain in solution [14]. Comparative analyses further emphasize the advantages of green extraction over conventional techniques in terms of efficiency and sustainability [15]. Considering all this, the present study focuses on the isolation of antioxidant compounds from PPs via ultrasound-assisted extraction. Response surface methodology (RSM) was employed to optimize essential extraction parameters, while complementary statistical analyses (PCA, MCA, Pareto plots) were applied to elucidate the effects of ethanol concentration, ultrasonic power, and extraction time on antioxidant recovery. Optimal conditions were determined using a partial least squares (PLS) model. Such approaches align with the broader trend of emerging green extraction techniques for valorizing agri-food by-products [16].
In parallel, the integration of artificial intelligence (AI), and particularly machine learning (ML), has become increasingly important for enhancing modeling, prediction, and optimization across the food, pharmaceutical, and nutraceutical sectors. Beyond mere predictive accuracy, AI frameworks increasingly serve as enablers of sustainable food systems and circular bioeconomy applications [17]. Recent reviews and studies highlight the growing role of ML in food chemistry and extraction optimization, underscoring its potential to accelerate discovery and scale-up [18]. Algorithms such as Random Forest (RF), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs) have proven effective in predicting antioxidant capacity, total phenolic content (TPC), and related biochemical responses from experimental variables [19]. For example, predictive modeling has been successfully applied to optimize bioactive compound extraction from plant matrices [20]. Similar intelligent modeling frameworks have also been applied in industrial process optimization, reinforcing the relevance of AI-driven approaches for scale-up [21]. However, most previous applications of ML rely on relatively large experimental datasets, which limits their direct applicability in processes such as ultrasound-assisted extraction of PPs, where experimental work is costly and sample sizes are constrained. This gap motivates the present study, which investigates whether AI can complement statistical optimization under small-data conditions.
The present study addresses this challenge by combining statistical optimization methods with ML-based predictive modeling. Six machine learning models were evaluated to estimate the antioxidant and phytochemical outcomes of ultrasound-assisted extraction under varying process conditions. To mitigate the limitations of the small dataset, synthetic samples were generated using RF-based resampling with Gaussian noise perturbation. Model performance was systematically assessed, with a particular focus on identifying whether AI approaches can reproduce and complement experimental optimization, ultimately supporting sustainable valorization of PPs through more efficient, adaptive, and scalable extraction workflows. Importantly, this study frames machine learning not only as a predictive tool but also as a component of intelligent optimization and decision support. By aligning with industrial process control paradigms, the methodology demonstrates how AI can complement classical statistical models in small-data regimes and provide actionable insights for process intensification and scale-up.

2. Materials and Methods

2.1. Chemicals and Reagents

L-ascorbic acid (99%), 2,4,6-tris(2-pyridyl)-s-triazine (TPTZ, ≥98%), 2,2-diphenyl-1-picrylhydrazyl (DPPH), aluminum chloride (AlCl3 98%), sodium acetate (CH3COONa, trihydrate, 99%), trichloroacetic acid (TCA, pure, 99.5%) and hydrochloric acid (37%) were purchased from Sigma-Aldrich (Darmstadt, Germany). Sodium carbonate (Na2CO3 anhydrous, 99.5%) and rutin (≥94%) were supplied by Penta (Prague, Czech Republic). Iron (III) chloride hexahydrate (97%) was obtained from Merck (Darmstadt, Germany). Folin–Ciocalteu reagent, gallic acid (97%), and ethanol (99.8%) were provided by Panreac Co. (Barcelona, Spain). Deionized water used in all experiments was prepared with a mixed-bed ion exchange resin column, ensuring conductivity below 1 μS/cm under standard flow rate and operating pressure conditions.

2.2. Instrumentation

Pomegranate peels (PPs) were freeze-dried using a BK-FD10P lyophilizer (Biobase, Jinan, China). The dried material was subsequently sieved with an Analysette 3 PRO system (Fritsch GmbH, Oberstein, Germany). Ultrasound-assisted extraction was carried out in an Elmasonic P70H ultrasonic bath (Elma Schmidbauer GmbH, Singen, Germany), while an Elmasonic S100 bath from the same manufacturer was employed for heating during the assays. Photometric measurements were performed with a Shimadzu UV-1700 PharmaSpec UV/Vis spectrophotometer (Kyoto, Japan). Finally, liquid extracts were clarified by centrifugation using a NEYA 16R centrifuge (Remi Elektrotechnik Ltd., Palghar, India).

2.3. Pomegranate Peel Material Handling

Fresh pomegranate fruits were obtained from a local grocery store in Karditsa, Greece, and transported to the laboratory immediately after purchase. The fruits were thoroughly washed with tap water, dried with disposable paper towels, and carefully peeled using stainless steel knives. The collected peels were subjected to overnight lyophilization, resulting in a measured moisture content of 74%. The dried material was subsequently ground in a blender and sieved to obtain a fine powder with an average particle size of 285 μm. The resulting powder was stored at −40 °C until further use in the experiments.

2.4. Experimental Design

A Box–Behnken design within the framework of RSM was applied to fine-tune the extraction conditions for key phytochemical and antioxidant parameters—namely total polyphenol content (TPC), total flavonoid content (TFC), total anthocyanin content (TAC), ascorbic acid content (AAC), and antioxidant activity (evaluated through FRAP and DPPH assays)—from PP powder using ultrasound bath-assisted extraction (UAE). The process optimization focused on three independent factors: ethanol concentration (C, % v/v) labeled as X1, ultrasonic power (E, %) as X2, and extraction time (t, min) as X3. Each variable was assessed at three levels, as outlined in Table 1.
To assess the reproducibility of the method, 15 experimental runs—including three replicates at the central point—were conducted. Triplicate measurements were taken for each run, and the mean values were used for subsequent data analysis. A second-order polynomial model was developed via least squares regression to enhance prediction accuracy and to capture the interactions between the three independent variables (Equation (1)):
$$Y_k = \beta_0 + \sum_{i=1}^{3} \beta_i X_i + \sum_{i=1}^{3} \beta_{ii} X_i^2 + \sum_{i=1}^{2} \sum_{j=i+1}^{3} \beta_{ij} X_i X_j \tag{1}$$
where the independent variables are represented by Xi and Xj, while the predicted response is denoted as Yk. The intercept (β0) and the regression coefficients (βi, βii, βij) correspond to the linear, quadratic, and interaction effects, respectively.
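As a minimal sketch of how Equation (1) can be fitted, the snippet below builds a three-factor Box–Behnken design in coded units (−1, 0, +1) with three center replicates and estimates the ten coefficients by ordinary least squares. The response values and `true_beta` are hypothetical and serve only to make the example runnable; they are not data from this study, and the function names are illustrative.

```python
import itertools
import numpy as np

def box_behnken_3(n_center=3):
    """Coded Box-Behnken design for three factors (levels -1, 0, +1)."""
    runs = []
    # Each pair of factors visits its four (+/-1, +/-1) corners; the third stays at 0.
    for a, b in itertools.combinations(range(3), 2):
        for sa, sb in itertools.product((-1.0, 1.0), repeat=2):
            row = [0.0, 0.0, 0.0]
            row[a], row[b] = sa, sb
            runs.append(row)
    runs.extend([[0.0, 0.0, 0.0]] * n_center)  # replicated center point
    return np.array(runs)

def quadratic_terms(X):
    """Model matrix columns: 1, X1, X2, X3, X1^2, X2^2, X3^2, X1X2, X1X3, X2X3."""
    x1, x2, x3 = X.T
    return np.column_stack([np.ones(len(X)), x1, x2, x3,
                            x1**2, x2**2, x3**2,
                            x1 * x2, x1 * x3, x2 * x3])

X = box_behnken_3()  # 12 edge runs + 3 center runs = 15 runs
rng = np.random.default_rng(0)

# Hypothetical coefficients and noise level, NOT values from the study.
true_beta = np.array([100.0, -12.0, 2.0, 1.0, -8.0, -1.0, -0.5, 1.5, 0.0, 0.5])
y = quadratic_terms(X) @ true_beta + rng.normal(0.0, 0.5, size=len(X))

# Ordinary least squares estimate of beta_0, beta_i, beta_ii and beta_ij.
beta_hat, *_ = np.linalg.lstsq(quadratic_terms(X), y, rcond=None)
print(np.round(beta_hat, 2))
```

With 15 runs and 10 coefficients, only 5 degrees of freedom remain for error estimation, which is one reason the replicated center point matters.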

2.5. Bioactive Compounds Quantification

2.5.1. Determination of Total Polyphenol Content (TPC)

TPC was quantified using a previously described spectrophotometric method [22]. In brief, 100 μL of properly diluted sample was mixed with 100 μL of Folin–Ciocalteu’s reagent, and after 2 min, 800 μL of 10% w/v Na2CO3 was added. The mixture was incubated at 40 °C for 20 min, and the absorbance was then recorded at 740 nm. TPC was determined from a gallic acid calibration curve (0–100 mg/L, R2 = 0.999) and expressed as milligrams of gallic acid equivalents (GAE) per gram of dry weight (dw) according to Equation (2):
$$\mathrm{TPC}\ (\mathrm{mg\ GAE/g\ dw}) = \frac{C_{\mathrm{TP}} \times V}{w} \tag{2}$$
where CTP is the total polyphenol concentration in mg GAE/L, the volume of the extraction medium is indicated with V (expressed in L) and the dry weight of the sample as w (expressed in g).
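The conversion in Equation (2) has the same form as the later per-gram expressions for the other assays, so a single helper covers them. The sketch below is illustrative only: the function name and the numerical inputs are hypothetical, and the concentration is assumed to be already corrected for any sample dilution.

```python
def content_per_gram(conc_per_litre, volume_l, dry_weight_g):
    """Convert an extract concentration (per L) into content per g of dry weight."""
    if dry_weight_g <= 0:
        raise ValueError("dry_weight_g must be positive")
    return conc_per_litre * volume_l / dry_weight_g

# Hypothetical inputs (not taken from the study):
# 500 mg GAE/L measured in 20 mL of solvent used to extract 1 g of powder.
tpc = content_per_gram(500.0, 0.020, 1.0)
print(f"TPC = {tpc:.2f} mg GAE/g dw")
```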

2.5.2. Determination of Total Flavonoid Content (TFC)

TFC, which covers a subset of the polyphenols, was determined using a previously described method [12]. An aliquot of 100 μL of properly diluted sample was mixed with 40 μL of a mixture consisting of 5% w/v AlCl3 and 0.5 M CH3COONa, and 860 μL of 35% v/v aqueous ethanol. The resulting mixture was kept in the dark for 30 min, and the absorbance was then recorded at 415 nm. TFC was determined from a quercetin 3-O-rutinoside (rutin) calibration curve (30–300 mg/L in methanol, R2 = 0.9966) and expressed as mg rutin equivalents (RtE) per g of dry weight (dw), according to Equation (3):
$$\mathrm{TFC}\ (\mathrm{mg\ RtE/g\ dw}) = \frac{C_{\mathrm{TFn}} \times V}{w} \tag{3}$$
where CTFn is the total flavonoid concentration in mg RtE/L, V is the volume of the extraction medium (in L), and w is the dry weight of the sample (in g).

2.5.3. Determination of Total Anthocyanin Content (TAC)

Using a well-established methodology, the total pigments were determined [23]. The ethanolic HCl solution was used as a blank to determine the absorbance at 520 nm. Following these steps, the total pigment concentration (CTPm) was calculated in cyanidin 3-O-glucoside equivalents (CyE), according to the following Equation (4):
$$C_{\mathrm{TPm}}\ (\mu\mathrm{g\ CyE/L}) = \frac{A \times MW \times F_D}{\varepsilon \times l} \times 10^{6} \tag{4}$$
where A is the absorbance at 520 nm, MW is the cyanidin 3-O-glucoside molecular weight (449.2 g/mol), FD is the dilution factor, ε = 26,900 L/(mol·cm), and the path length (l) is 1 cm. Quantification was performed using the known extinction coefficient of cyanidin 3-O-glucoside; therefore, no calibration curve was required and R2 reporting is not applicable.
TAC was then determined as follows in Equation (5):
$$\mathrm{TAC}\ (\mu\mathrm{g\ CyE/g\ dw}) = \frac{C_{\mathrm{TPm}} \times V}{w} \tag{5}$$
where CTPm is the total anthocyanin concentration in μg CyE/L, V is the volume of the extraction medium (in L), and w is the dry weight of the sample (in g).
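Equations (4) and (5) can be chained in a few lines. The sketch below uses the molecular weight and extinction coefficient stated in the text; the function names and the example absorbance, dilution factor, volume, and sample weight are hypothetical.

```python
MW_CYANIDIN_GLUCOSIDE = 449.2   # g/mol, as stated in the text
EPSILON = 26_900.0              # L/(mol*cm), as stated in the text

def pigment_conc_ug_per_l(absorbance, dilution_factor, path_cm=1.0):
    """Equation (4): pigment concentration in ug CyE/L from absorbance at 520 nm."""
    mol_per_l = absorbance * dilution_factor / (EPSILON * path_cm)  # Beer-Lambert law
    return mol_per_l * MW_CYANIDIN_GLUCOSIDE * 1e6                  # g/L -> ug/L

def tac_ug_per_g(absorbance, dilution_factor, volume_l, dry_weight_g):
    """Equation (5): total anthocyanin content in ug CyE/g dw."""
    return pigment_conc_ug_per_l(absorbance, dilution_factor) * volume_l / dry_weight_g

# Hypothetical reading: A = 0.50, no dilution, 20 mL of solvent per 1 g of powder.
print(round(tac_ug_per_g(0.50, 1.0, 0.020, 1.0), 2))
```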

2.5.4. Determination of Ascorbic Acid Content (AAC)

The ascorbic acid content (AAC) was determined using a method that has been successfully used in the past [24]. A total of 100 μL of each sample was mixed with 500 μL of 10% v/v Folin–Ciocalteu’s reagent and 10% w/v TCA. The mixture was kept in the dark for 10 min and then, the absorbance was recorded at 760 nm. A calibration curve of ascorbic acid with concentrations ranging from 0 to 500 mg/L (R2 = 0.998) was utilized to assess the results, according to Equation (6):
$$\mathrm{AAC}\ (\mathrm{mg/g\ dw}) = \frac{C_{\mathrm{AA}} \times V}{w} \tag{6}$$
where CAA is the ascorbic acid concentration in mg/L, the volume of the extraction medium is indicated with V (expressed in L) and the dry weight of the sample as w (expressed in g).

2.6. Antioxidant Assays

2.6.1. Ferric-Reducing Antioxidant Power (FRAP) Assay

In order to assess FRAP, a well-established method [12] was employed. In brief, 50 μL of properly diluted sample were mixed with 50 μL of FeCl3 (4 mM in 0.05 M HCl) and incubated at 37 °C for 30 min. After that, 900 μL of TPTZ (1 mM in 0.05 M HCl) were added, and about 5 min later, the absorbance was measured at 620 nm. A calibration curve for ascorbic acid in 0.05 M HCl with values ranging from 50 to 500 μM (R2 = 0.995) was used to determine the ferric-reducing power (PR). Using Equation (7), the PR was determined as μmol of AAE per g of dry weight.
$$P_R\ (\mu\mathrm{mol\ AAE/g\ dw}) = \frac{C_{\mathrm{AA}} \times V}{w} \tag{7}$$
where CAA is the ascorbic acid concentration in μmol/L, V is the volume of the extraction medium (in L), and w is the dry weight of the sample (in g).

2.6.2. DPPH Antiradical Activity Assay

The antioxidant activity of PP extracts was further evaluated through DPPH radical inhibition, using a previously established method [19]. A quantity of 25 μL of properly diluted sample was mixed with 975 μL of 0.1 mM DPPH (in methanol). After 30 min in the dark, the absorbance was recorded at 515 nm (A515f). In addition, the absorbance of a blank, containing DPPH solution and methanol in place of the sample, was measured immediately (A515i). The inhibition % was determined using Equation (8):
$$\mathrm{Inhibition}\ (\%) = \frac{A_{515i} - A_{515f}}{A_{515i}} \times 100 \tag{8}$$
Antiradical activity (AAR) was evaluated from an ascorbic acid calibration curve (0–1000 μM, R2 = 0.993) and expressed as μmol AAE per g of dw according to Equation (9):
$$A_{AR}\ (\mu\mathrm{mol\ AAE/g\ dw}) = \frac{C_{\mathrm{AA}} \times V}{w} \tag{9}$$
where CAA is the ascorbic acid concentration in μmol/L, the volume of the extraction medium is indicated with V (expressed in L) and the dry weight of the sample as w (expressed in g).
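The inhibition calculation in Equation (8) is sketched below. It assumes that A515i is the absorbance of the DPPH blank and A515f the absorbance of the sample mixture after 30 min; the function name and the readings are hypothetical.

```python
def dpph_inhibition_percent(a_initial, a_final):
    """Equation (8): percentage of DPPH radical scavenged by the sample."""
    if a_initial <= 0:
        raise ValueError("a_initial must be positive")
    return (a_initial - a_final) / a_initial * 100.0

# Hypothetical readings: blank absorbance 0.80, sample absorbance 0.20 after 30 min.
print(f"{dpph_inhibition_percent(0.80, 0.20):.1f} %")
```

The resulting inhibition is then read against the ascorbic acid calibration curve to obtain CAA, which enters Equation (9).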

2.7. Statistical Analysis

The statistical analysis, including RSM and distribution analysis, was conducted using JMP® Pro 16.0.0 software (SAS, Cary, NC, USA). Each batch of PP extracts underwent at least two rounds of extraction, and the quantitative analysis was carried out in triplicate. Normality of the data was checked using the Kolmogorov–Smirnov test. Statistically significant differences were identified by one-way analysis of variance (ANOVA) followed by a post hoc Tukey HSD (Honestly Significant Difference) test applied with the Tukey–Kramer formula. Results are reported as means and standard deviations. The same software was used for the Pareto plot, PCA, MCA, and PLS analyses.

2.8. Initial Dataset Exploration and Visualization

We analyzed n = 15 experimental conditions of PP extraction. The three process variables (C, E, t) served as predictors, and six antioxidant responses were modeled as simultaneous targets: TPC, TFC, TAC, AAC, FRAP, and DPPH. Duplicate rows were removed; the remaining dataset contained no missing values after median imputation where necessary. The histograms of the predictors indicate broadly even coverage of the design space with modest central concentration (most clearly for E around 75–90 and for t near 15) and thinner support at the extremes. No conspicuous outliers are evident. The response distributions exhibit heterogeneous scale and dispersion: FRAP and DPPH show the largest spread with mild right-skewness, AAC and TFC are comparatively compact, and TPC and TAC lie between these extremes. These properties motivate standardization of predictors and responses prior to model fitting and support reporting performance per target rather than only as a macro-average, while the limited density at the edges cautions against extrapolation beyond the observed ranges. The relationships between the process variables and the responses are depicted in Figure 1.
Extending the distributional view from the histograms (Figure 2), the combined boxplots provide a compact summary of central tendency and dispersion across predictors and responses in their original units. The predictors display moderate variability without conspicuous outliers: C and E span most of their design ranges, whereas t remains concentrated near mid-values. Among the responses, FRAP and DPPH present the widest interquartile ranges and the longest upper whiskers, indicating high dispersion and mild right-skewness; TAC shows intermediate spread; AAC and TFC are comparatively tight. The heterogeneity of scale across endpoints, together with asymmetric whiskers for FRAP and DPPH, suggests potential heteroscedasticity and reinforces the choice to standardize targets prior to model fitting and to report both RMSE and MAE for each target.
Building on the boxplot summary, the heatmap of the data matrix (Figure 3) in raw units highlights the stark differences in scale across variables. The responses FRAP, DPPH, and TAC dominate the color range due to their large magnitudes, whereas AAC and TFC appear comparatively uniform, and the predictors occupy a narrow band, making cross-variable visual comparison difficult. The pattern nonetheless suggests substantial between-sample variability in FRAP and DPPH, consistent with the wider interquartile ranges noted previously, while no single sample exhibits an extreme profile across all variables. This view reinforces the need to harmonize scales prior to modeling and to interpret raw magnitudes with caution.
After standardization to z-scores, the normalized heatmap (Figure 4) reveals within-sample profiles and co-variation patterns that are obscured on the raw scale. Several samples exhibit concurrent high standardized values for FRAP and DPPH, with TAC often elevated in the same profiles, while AAC remains comparatively stable across samples. The predictors display complementary contrasts across C, E, and t, indicating that the design covers distinct combinations rather than single-factor gradients. This standardized view supports the multi-output modeling strategy by making relative deviations comparable across variables and by suggesting shared structure among a subset of responses.
Consistent with the standardized heatmap, the correlation matrix quantifies linear relationships among predictors and responses and reveals a clear response block. FRAP, DPPH, TFC, and TAC are strongly and positively associated with one another (r = 0.82–0.93), with TPC moderately aligned to this block (r = 0.61–0.75), whereas AAC shows only moderate correlations to the other responses (r ≈ 0.48–0.63). Among predictor–response relationships, C is negatively associated with all targets, most strongly with AAC (r = −0.88) and TPC (r = −0.69), while E and t exhibit weak to very weak linear associations with the targets (|r| ≲ 0.20). These patterns suggest shared process drivers across several antioxidant endpoints, partial decoupling of AAC, and limited purely linear signal from E and t, supporting the use of flexible, interaction-aware models and per-target evaluation. Given the small sample size, these coefficients should be interpreted as exploratory. Figure 5 illustrates the Pearson correlation matrix of the responses under investigation.
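The link between the z-score view and the Pearson matrix can be made explicit: the correlation coefficient is the scaled cross-product of standardized columns. The sketch below illustrates this on a hypothetical 15 × 3 matrix generated at random; it does not use the study's measurements.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 15 x 3 matrix standing in for three measured columns.
data = rng.normal(size=(15, 3))
data[:, 2] = 0.9 * data[:, 1] + 0.1 * data[:, 2]  # make two columns co-vary

# Column-wise z-scores (sample standard deviation, ddof=1).
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Pearson correlation matrix computed directly ...
r = np.corrcoef(data, rowvar=False)
# ... and recovered from the z-scores.
r_from_z = z.T @ z / (len(z) - 1)

print(np.round(r, 2))
```

With only 15 rows, each coefficient carries a wide confidence interval, which is why the text treats the matrix as exploratory.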

2.9. Regression Modeling Framework

A total of 15 experimental conditions of UAE were analyzed. The three process variables, C, E, and t, were used as predictors, while six antioxidant responses, namely TPC, FRAP, DPPH, TFC, TAC, and AAC, were modeled as simultaneous targets. Duplicate rows were removed and no missing values remained after median imputation where necessary. To prevent information leakage, the dataset was split once into training (80%) and testing (20%) with a fixed random seed (42). All preprocessing transformations, including scaling, were fit only on the training set and subsequently applied to both partitions. Predictors and targets were standardized with a training-only StandardScaler, and predictions were inverse transformed before the calculation of errors in original units. Each model was implemented in a multi-output regression framework. Hyperparameters were selected by grid search using GroupKFold cross-validation, with folds ranging from two to five depending on group cardinality. Groups were defined by training indices and, in augmentation experiments, by the anchor observation to which synthetic points were attached. The selection metric was negative mean squared error computed on standardized targets. After tuning, models were refit on the full training data and evaluated on the held-out real test set. Because of the small sample size, nested leave-one-out cross-validation per target was used as the primary estimate of generalization. For each left-out observation, inner K-fold cross-validation selected hyperparameters before producing a single out-of-fold prediction.
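The leakage-free part of this workflow (single split, training-only scaling, grouped grid search, errors in original units) can be sketched with scikit-learn as follows. The data are random stand-ins, the grid is reduced, and 100 trees are used for speed; these choices are assumptions for illustration and do not reproduce the study's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, GroupKFold, train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical stand-in data: 15 runs, 3 predictors (C, E, t), 6 responses.
X = rng.uniform(0.0, 1.0, size=(15, 3))
Y = np.column_stack([10.0 - 8.0 * X[:, 0] + k * X[:, 1] for k in range(6)])
Y += rng.normal(0.0, 0.1, size=Y.shape)

# Single 80/20 split with a fixed seed.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=42)

# Scalers are fit on the training partition only, then applied to both partitions.
x_scaler = StandardScaler().fit(X_tr)
y_scaler = StandardScaler().fit(Y_tr)
Xs_tr, Xs_te = x_scaler.transform(X_tr), x_scaler.transform(X_te)
Ys_tr = y_scaler.transform(Y_tr)

# Without augmentation, every training row forms its own group.
groups = np.arange(len(X_tr))
n_splits = int(min(5, max(2, len(np.unique(groups)))))

search = GridSearchCV(
    RandomForestRegressor(n_estimators=100, random_state=42),
    param_grid={"max_depth": [None, 6], "min_samples_leaf": [1, 2]},
    scoring="neg_mean_squared_error",   # computed on standardized targets
    cv=GroupKFold(n_splits=n_splits),
    refit=True,                          # refit the best model on all training data
)
search.fit(Xs_tr, Ys_tr, groups=groups)

# Predictions are mapped back to original units before computing errors.
Y_pred = y_scaler.inverse_transform(search.predict(Xs_te))
rmse = np.sqrt(np.mean((Y_te - Y_pred) ** 2, axis=0))
print(search.best_params_, np.round(rmse, 3))
```

The nested leave-one-out estimate described above would wrap this tuning step in an outer loop over single held-out observations, one target at a time.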

2.10. Candidate Regressors and Hyperparameter Grids

A diverse set of regressors was benchmarked to capture linear structure, interactions, and nonlinear effects using standardized predictors and targets. The portfolio comprised a degree-2 polynomial regression with Ridge regularization (RSM_Ridge), Partial Least Squares (PLS), K-nearest neighbors (KNN), Support Vector Regression with a radial-basis kernel (SVR-RBF), a compact multilayer perceptron (MLP), Random Forest (RF), Extra Trees (ET), and XGBoost (XGB) within a multi-output wrapper. For methods that are not natively multi-output like SVR, MLP, KNN and XGB, we used a multi-output strategy. PLS, RF, and ET natively handle multi-response regression. Hyperparameters were tuned by grid search under GroupKFold cross-validation on the standardized space, and final errors were reported in original units after inverse transformation. Grids were deliberately conservative to limit variance inflation in a small-sample regime.
RSM_Ridge served as a response-surface baseline that includes all linear, squared, and pairwise interaction terms among the three predictors while controlling coefficient magnitude via an L2 penalty. The single tuned parameter was the regularization strength α, evaluated at 0.1, 1.0, 10.0, and 100.0. This range spans weak to strong shrinkage, allowing the model to trade bias for variance without resorting to higher-degree polynomials, which are prone to overfitting at n ≈ 15. PLS was used to model low-rank linear relations between predictors and multiple responses; its primary complexity lever is the number of latent components. We explored one, two, and three components, which is appropriate given three predictors and helps avoid over-parameterization while still allowing shared structure across responses to be captured.
KNN provided a nonparametric local-averaging baseline. Its behavior is governed by the neighborhood size k and the weighting rule used to combine neighbors. We considered k ∈ {3, 5, 7} to balance smoothness against responsiveness to local variation and compared uniform with distance weighting, the latter assigning greater influence to closer points in Euclidean distance after standardization. We did not expand the grid to alternative metrics to avoid unstable fits in a very small sample and because z-scoring renders Euclidean distance a sensible default.
SVR with an RBF kernel was included to capture smooth nonlinearities with a small set of well-understood hyperparameters. We tuned the soft-margin constant C ∈ {1, 10}, which controls the penalty assigned to training errors, and the ε-insensitive tube ε ∈ {0.1, 0.2}, which governs the tolerance band around the regression function. The kernel scale γ was kept at the library default (“scale”), which adapts to input variance and has been found to be more robust than free γ tuning in small-n settings; exploratory widening of the γ grid tended to overfit and was therefore not adopted.
The MLP offered a flexible parametric alternative with capacity controlled by the hidden-layer configuration and an L2 penalty on the weights. We examined hidden-layer architectures of (64), (64, 64), and (128) units and penalties α ∈ {1 × 10^−4, 1 × 10^−3}. To avoid invalid internal validation splits on very small training sets, early stopping was disabled, and the maximum number of iterations was set to 1000 to allow convergence without excessive training. The “ReLU” activation and “Adam” optimizer were left at their defaults to limit the search space and reduce variance.
Tree ensembles were prioritized because they naturally encode interactions and nonlinearity while remaining robust with limited data. For RF and ET, we fixed the number of trees at 500–600 to stabilize predictions and tuned the two principal complexity levers: maximum depth and minimum leaf size. Specifically, we evaluated max_depth ∈ {None, 6, 10}, with “None” denoting unrestricted growth, and min_samples_leaf ∈ {1, 2, 4}. Smaller leaves and deeper trees increase variance, whereas shallower trees and larger leaves increase bias. This grid traverses bias- and variance-dominated regimes without introducing additional hyperparameters, which can be fragile with n ≈ 15. RF used bootstrap aggregation with randomized feature selection at splits. ET further randomized split thresholds, offering a higher-bias, lower-variance counterpart.
XGB was evaluated within a multi-output wrapper using shallow trees and modest learning rates chosen for their stability in small samples. We tuned n_estimators ∈ {300, 600}, max_depth ∈ {3, 6}, learning_rate ∈ {0.03, 0.10}, subsample ∈ {0.7, 1.0}, colsample_bytree ∈ {0.7, 1.0}, min_child_weight ∈ {1, 5}, reg_lambda ∈ {1.0, 5.0}, and reg_alpha ∈ {0.0, 0.1}. Early stopping was not used because it would require carving a validation split from an already small training set. Instead, we relied on cross-validated grid search. Table 2 summarizes the tuned parameters and the tested values.

2.11. Training Augmentation by SMOTE Interpolation

To assess whether synthetic variability could help a model trained on a very small dataset, a SMOTE-like interpolation was explored solely within the training set and only in standardized space. For a randomly selected training observation i, one of its k nearest neighbors j in the predictor space was identified, and a synthetic point was generated by linear interpolation (Equation (10)):
$$x_{\mathrm{new}} = x_i + \lambda\,(x_j - x_i), \quad \lambda \in [0, 1], \tag{10}$$
with the corresponding target vector interpolated and perturbed by Gaussian noise proportional to the standardized target dispersion. Synthetic predictors were clamped to the observed training bounds to avoid extrapolation. Each synthetic sample inherited the index of its anchor observation i, and during GroupKFold cross-validation all members of a synthetic family were assigned to the same fold to prevent leakage. When supported by the estimator, synthetic samples were down-weighted to reduce their influence relative to real observations. In practice, augmentation did not improve generalization and was therefore not included in the final analysis; it is documented here solely as an ablation.
The parameters of the SMOTE-like augmentation procedure are summarized in Table 3. The total number of synthetic samples (n_syn) was set to 200 or 1000 per training split, providing either modest or substantial expansion of the training set. The neighborhood size k, which controls the locality of interpolation in predictor space, varied between 5 and 7, reflecting the need to balance local fidelity with sufficient variability in a dataset of only 15 observations. Perturbation of the interpolated targets was introduced through Gaussian noise with a standard deviation equal to 2% or 5% of the corresponding target’s empirical standard deviation, thereby mimicking experimental uncertainty without overwhelming the signal. Synthetic predictors were clamped to the observed range of the real training data to prevent extrapolation beyond the empirical design space. To avoid leakage, all synthetic points generated from a given anchor observation were grouped together and forced into the same fold during GroupKFold cross-validation. When the estimator supported observation weights, synthetic samples were down-weighted to 20% of the influence of real observations, thereby preventing synthetic points from dominating the fit. All interpolation and perturbation were performed in standardized space for both predictors and responses to ensure comparability across variables with different scales.
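A minimal NumPy sketch of the augmentation described above is given below. It is one plausible reading of the procedure, not the authors' code: the function name, the uniform draw of the interpolation weight, and the random stand-in data are assumptions.

```python
import numpy as np

def smote_like(Xs, Ys, n_syn=200, k=5, noise_frac=0.02, seed=0):
    """Return synthetic predictors, targets and anchor indices (standardized space)."""
    rng = np.random.default_rng(seed)
    n = len(Xs)
    k = min(k, n - 1)

    # Pairwise Euclidean distances; a point is never its own neighbour.
    dist = np.linalg.norm(Xs[:, None, :] - Xs[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    anchors = rng.integers(0, n, size=n_syn)                       # observation i
    partners = neighbours[anchors, rng.integers(0, k, size=n_syn)]  # neighbour j
    lam = rng.uniform(0.0, 1.0, size=(n_syn, 1))                    # interpolation weight

    X_new = Xs[anchors] + lam * (Xs[partners] - Xs[anchors])
    Y_new = Ys[anchors] + lam * (Ys[partners] - Ys[anchors])
    # Gaussian noise scaled to a fraction of each target's standard deviation.
    Y_new += rng.normal(0.0, noise_frac * Ys.std(axis=0, ddof=1), size=Y_new.shape)
    # Clamp predictors to the observed training range (no extrapolation).
    X_new = np.clip(X_new, Xs.min(axis=0), Xs.max(axis=0))
    return X_new, Y_new, anchors

# Hypothetical standardized training data: 12 rows, 3 predictors, 6 targets.
rng = np.random.default_rng(0)
Xs, Ys = rng.normal(size=(12, 3)), rng.normal(size=(12, 6))
X_syn, Y_syn, anchors = smote_like(Xs, Ys)

# Real rows are their own group; synthetic rows share the group of their anchor.
groups = np.concatenate([np.arange(len(Xs)), anchors])
# Synthetic rows carry 20% of the weight of a real observation.
weights = np.concatenate([np.ones(len(Xs)), np.full(len(X_syn), 0.2)])
print(X_syn.shape, Y_syn.shape, groups.shape)
```

Passing `groups` to GroupKFold keeps each synthetic family in the same fold as its anchor, so a validation fold never contains a near-copy of a training row.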

2.12. Metrics

The performance of the regression models was assessed using three standard evaluation metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination ($R^2$). These indices quantify the discrepancy between the predicted values ($\hat{y}_i$) and the corresponding experimental observations ($y_i$) across $n$ samples.
MAE reflects the average magnitude of prediction errors, irrespective of their direction, and is computed as the mean of the absolute differences between actual and predicted values (Equation (11)):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (11)$$
The RMSE is defined as the square root of the Mean Squared Error (MSE). Unlike MAE, RMSE penalizes larger deviations more heavily due to the squaring of residuals. Importantly, RMSE expresses the prediction error in the same units as the response variable, which makes it more directly interpretable in practical terms (Equation (12)):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \quad (12)$$
The $R^2$ expresses the proportion of variance in the observed data that can be explained by the independent variables of the model. An $R^2$ value of 1 corresponds to a perfect fit, whereas a value of 0 indicates that the model fails to account for any of the variability in the response (Equation (13)). It is calculated as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \quad (13)$$
where $y_i$ are the experimental values, $\hat{y}_i$ the predicted values, and $\bar{y}$ the mean of the observed data.
Taken together, MAE, RMSE, and $R^2$ provide complementary perspectives on model performance, enabling a balanced comparison of predictive accuracy, error variability, and variance explained across the machine learning regressors examined.
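The three metrics translate directly into code. The sketch below follows Equations (11)–(13) using only the Python standard library; the function names and the four-point toy vectors are illustrative assumptions and are not data from this study.

```python
import math


def mae(y, y_hat):
    """Mean Absolute Error, Equation (11)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root Mean Squared Error, Equation (12)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y, y_hat):
    """Coefficient of determination, Equation (13)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_obs = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print(mae(y_obs, y_pred), rmse(y_obs, y_pred), r2(y_obs, y_pred))
```

Because the residual sum of squares in Equation (13) can exceed the total sum of squares, $R^2$ becomes negative whenever a model predicts worse than the mean of the observations.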

2.13. Rationale for the Final Regressor and Interpretability

RF was selected as the working regressor because it achieved the strongest average performance on the held-out test set, remained competitive under nested leave-one-out validation, and required minimal distributional assumptions while capturing interactions and nonlinearities among C, E, and t. Forests were trained with 500–600 trees and tuned over depth and leaf size; in nested LOOCV, the mode of selected configurations favored moderate depth (max_depth between 6 and 10 with min_samples_leaf = 1), with occasional preference for unrestricted depth in TAC. Interpretation and diagnostic analyses included native feature importance, permutation importance on the real test set, partial dependence and individual conditional expectation profiles for each predictor–response pair, and parity and residual plots. These materials substantiate the plausibility of the learned relationships and identify regions where additional experiments would most reduce uncertainty.
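The nested leave-one-out logic, including the reporting of the most frequently selected configuration, can be illustrated with a small sketch. To stay dependency-free, a toy nearest-neighbor mean predictor on one variable stands in for the Random Forest, and its single hyperparameter `k` stands in for the depth and leaf-size grid; the function names and this substitution are assumptions made for illustration only.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Toy stand-in model: mean response of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_error(data, k):
    """Inner loop: mean absolute leave-one-out error for one configuration."""
    errors = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        errors.append(abs(knn_predict(train, x, k) - y))
    return sum(errors) / len(errors)


def nested_loo(data, grid):
    """Outer loop: hold out one point, tune on the rest, score on the held-out point."""
    chosen, outer_errors = [], []
    for i, (x, y) in enumerate(data):
        outer_train = data[:i] + data[i + 1:]
        best = min(grid, key=lambda k: loo_error(outer_train, k))
        chosen.append(best)
        outer_errors.append(abs(knn_predict(outer_train, x, best) - y))
    mode_config = Counter(chosen).most_common(1)[0][0]
    return mode_config, sum(outer_errors) / len(outer_errors)
```

The held-out observation never takes part in the tuning step of its own outer fold, which is what keeps the outer error estimate honest on very small datasets.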

3. Results and Discussion

3.1. Optimization of UAE Parameters

Optimizing the extraction parameters was essential to improve efficacy and to promote an environmentally conscious extraction procedure [25,26]. Solvent composition was deemed crucial, as solvent characteristics substantially affect chemical extraction [27]. Moreover, cavitation is affected by the solvent’s physical properties, such as viscosity, surface tension, and saturation vapor pressure [28]. Water is an ecological solvent, distinguished by its exceptional capacity to extract polar compounds, including water-soluble pigments, along with its cost-effectiveness and non-toxicity for human consumption. Ethanol, a food-grade solvent, was also employed [27], as was a 50% v/v aqueous ethanol solution. Comparing water with ethanol as the extraction solvent for these compounds supports the non-toxic, food-grade suitability of the extracts [19]. Ultrasonication power and extraction duration were also investigated. Table 4 presents the impact of the investigated variables on the measured responses, whereas Table 5 displays the ANOVA results for the RSM quadratic polynomial model.

3.2. Model Analysis

Regression models describing the extraction process are presented in Equations (14)–(19). These models predict the key response variables (TPC, TFC, TAC, AAC, FRAP, and DPPH) and retain only statistically significant terms. The coefficients indicate how solvent composition, ultrasonic power, and extraction time influence process efficiency. The inclusion of linear and quadratic terms highlights nonlinear relationships that define the conditions maximizing antioxidant recovery. Interaction terms further demonstrate that the combined effects of multiple variables shape antioxidant potential, underscoring the importance of precise parameter optimization. Prolonged extraction enhances the dissolution of antioxidant compounds into the solvent; however, the quadratic term ($X_3^2$) and the interactions ($X_1X_2$, $X_1X_3$, and $X_2X_3$) suggest that extraction time has an optimal range: too short a duration limits compound release, while excessive exposure may lead to degradation or reduced efficiency. These interactions indicate that extraction time acts together with the other factors rather than in isolation.
$$\mathrm{TPC} = 81.84 + 2.28X_1 + 3.32X_2 - 2.44X_3 - 0.021X_1^2 - 0.034X_2^2 - 0.200X_3^2 - 0.014X_1X_2 - 0.003X_1X_3 + 0.126X_2X_3 \quad (14)$$
$$\mathrm{TFC} = 185.49 + 0.81X_1 - 3.28X_2 + 0.78X_3 - 0.012X_1^2 + 0.022X_2^2 - 0.014X_3^2 - 0.001X_1X_2 + 0.009X_1X_3 - 0.009X_2X_3 \quad (15)$$
$$\mathrm{TAC} = 536.28 + 5.77X_1 + 2.91X_2 + 42.88X_3 - 0.121X_1^2 - 0.028X_2^2 - 1.233X_3^2 + 0.033X_1X_2 + 0.044X_1X_3 - 0.114X_2X_3 \quad (16)$$
$$\mathrm{AAC} = 1.60 - 0.08X_1 + 0.33X_2 - 0.04X_3 - 0.0003X_1^2 - 0.002X_2^2 - 0.013X_3^2 + 0.0002X_1X_2 + 0.0008X_1X_3 + 0.004X_2X_3 \quad (17)$$
$$\mathrm{FRAP} = 4430.24 + 26.32X_1 - 82.99X_2 + 66.26X_3 - 0.303X_1^2 + 0.521X_2^2 - 0.579X_3^2 + 0.007X_1X_2 - 0.241X_1X_3 - 0.285X_2X_3 \quad (18)$$
$$\mathrm{DPPH} = 620.50 + 26.82X_1 + 2.89X_2 + 48.28X_3 - 0.334X_1^2 - 0.033X_2^2 - 0.884X_3^2 + 0.078X_1X_2 - 0.353X_1X_3 + 0.060X_2X_3 \quad (19)$$
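A second-order polynomial of this form is straightforward to evaluate programmatically. The sketch below encodes the TPC coefficients exactly as printed above; the function name `rsm_quadratic` and the coefficient ordering are illustrative, and it is an assumption here that $X_1$, $X_2$, and $X_3$ are entered in their natural units as ethanol concentration, ultrasonic power, and extraction time. Because the published coefficients are rounded, values computed this way need not reproduce the tabulated predictions exactly.

```python
def rsm_quadratic(coefs, x1, x2, x3):
    """Evaluate a full quadratic response surface in three factors.

    coefs order: intercept, X1, X2, X3, X1^2, X2^2, X3^2, X1X2, X1X3, X2X3.
    """
    b0, b1, b2, b3, b11, b22, b33, b12, b13, b23 = coefs
    return (b0 + b1 * x1 + b2 * x2 + b3 * x3
            + b11 * x1 ** 2 + b22 * x2 ** 2 + b33 * x3 ** 2
            + b12 * x1 * x2 + b13 * x1 * x3 + b23 * x2 * x3)


# Coefficients of the TPC model as printed in the text (rounded values).
TPC_COEFS = (81.84, 2.28, 3.32, -2.44, -0.021, -0.034, -0.200,
             -0.014, -0.003, 0.126)

# Example call; the factor coding (natural units) is an assumption.
tpc_estimate = rsm_quadratic(TPC_COEFS, 33.0, 60.0, 15.0)
```

The same function can be reused for the other five responses by swapping in the corresponding coefficient tuple.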
Figure 6 illustrates the influence of each parameter, and of their combinations, on the examined responses. Overall, these visualizations highlight the nonlinear behavior of antioxidant compound recovery and underscore the importance of optimizing the extraction parameters. Figure 6A–C shows that TPC is highest at moderate ethanol concentrations and ultrasonic power. A high ethanol concentration (100%) significantly inhibited TPC yield, underscoring the need for solvent polarity to match the chemical profile of the target compounds for effective extraction. Moreover, extended extraction duration resulted in increased TPC yields, presumably owing to enhanced cell disintegration and diffusion kinetics. A similar pattern was observed for DPPH and TAC (Figure 6G–I and Figure 6M–O, respectively). Figure 6D–F displays the influence of the extraction parameters on FRAP. FRAP also requires moderate polarity, but it appears to be favored by lower power and shorter extraction duration. AAC (Figure 6P–R) follows the same pattern, the only difference being that longer durations favor its recovery. TFC (Figure 6J–L) requires moderate polarity, power, and duration. These differences may arise from the distinct structural characteristics of the antioxidant molecules and their differing sensitivity to treatment conditions.
Table 6 presents the predicted optimal values of the UAE parameters, as well as the expected TPC, FRAP, DPPH, TFC, TAC and AAC values, including the models’ desirability, which all indicate a satisfactory fit. It should be noted that different assays have different optimal parameter settings, which is itself an important finding. For instance, the optimal solvent for AAC is water, whereas for all other responses it is an aqueous ethanol solution. For this reason, further statistical analysis of the data is needed to determine whether additional correlations exist between the responses. Such analysis makes it possible to identify a single set of parameters that gives the highest possible performance for all responses simultaneously.

3.3. Impact of Extraction Parameters to Assays Through Pareto Plot Analysis

In the Pareto plot (Figure 7), the orthogonal estimate represents a statistical approach that evaluates the relative impact of individual factors while minimizing interdependence among them. This technique improves interpretability by generating statistically independent estimates, thereby facilitating the identification of the most influential variables. Orthogonal estimates are widely applied in regression analysis and experimental design to enhance parameter precision, reduce estimation bias, and mitigate distortions arising from factor interactions. The analysis clearly demonstrates that ethanol concentration exerts a significant negative effect on all response variables.

3.4. PCA and MCA

A Principal Component Analysis (PCA) was performed to enhance data interpretation and reveal latent patterns among the investigated variables (Figure 8). The model exhibited high robustness, capturing 92.8% of the variance, thereby confirming its suitability for dimensionality reduction and process optimization. The objective of the correlation analyses was to ascertain whether any relationship existed between TPC, FRAP, DPPH, TFC, TAC and AAC and the variables examined. The influence of the independent variables was considered significant. Once again, the negative effect of ethanol concentration on all responses is confirmed, especially on TPC and AAC. This is mainly due to the chemical structure of these substances. Furthermore, TPC and AAC correlate positively with each other but not with the other responses, and both correlate negatively with extraction duration. FRAP, DPPH, TFC and TAC correlate positively with one another and with extraction duration, whereas ultrasonic power has a negative effect on all of them. This negative correlation is probably because high ultrasonic power can break down these compounds, so moderate or low power may be needed to avoid such degradation.
Multiple Correspondence Analysis (MCA) provided additional insights into the correlation patterns among the investigated variables, highlighting both positive and negative associations. Such analysis supports a more comprehensive interpretation of the dataset and contributes to process optimization. The results are reported in Table 7. FRAP and DPPH, the two main antioxidant assays used in this study, are in full agreement with each other and show a very strong correlation. A strong positive correlation is also observed between FRAP and TFC and between TFC and TAC. AAC is the only response that correlates poorly with all the other responses, especially with FRAP and DPPH.

3.5. Partial Least Squares (PLS) Analysis

To define the conditions that are optimal for all responses simultaneously, a PLS model was applied; the results are shown in Figure 9. The optimal conditions were found to be 33% v/v aqueous ethanol and 60% ultrasonic power for a duration of 15 min, and the model displays a desirability of 0.8576, indicating a good fit. Table 8 lists the values of the responses predicted by the PLS model together with the actual experimental values under the optimal conditions.
The experimental data closely matched the predictions of the PLS model, as demonstrated by a strong correlation coefficient of 0.988 and a high coefficient of determination ($R^2$) of 0.975. Additionally, the very low p-value (< 0.0001) confirms that the agreement between observed and predicted values is statistically significant, indicating the model’s excellent predictive reliability.
The optimal extraction conditions yielded a TPC of 195.55 mg GAE/g dw, closely matching the predicted value. In another study [29], a TPC of 125.36 mg GAE/g was obtained through Soxhlet extraction using acetone as solvent, while Kanlayavattanakul et al. [30] reported 118.73 mg GAE/g for PPs. In both cases, the TPC achieved in the present study is substantially higher, underscoring the superior efficiency of ultrasound bath-assisted extraction (UAE). This finding is consistent with earlier optimization studies on pomegranate peel using UAE, which also reported high phenolic and anthocyanin recoveries [31]. Likewise, TFC appears to be favored by UAE, as the present value surpasses the 51.52 mg RtE/g dw reported by Elfalleh et al. [32]. Alternative green techniques, such as microwave-assisted extraction, have also been applied to pomegranate peel, yielding comparable phenolic profiles [33]. For AAC, Li et al. [34] determined a value of 0.99 mg/g, which is markedly lower than that obtained in the present study. Taken together, these comparisons highlight UAE as one of the most effective and sustainable green extraction techniques for valorizing pomegranate peel.

3.6. Performance of Machine Learning Regressors

The comparative evaluation of the regression models (Figure 10) revealed clear differences in their ability to predict antioxidant responses from the process variables. The ensemble tree methods, RF and ET, consistently outperformed all other approaches, yielding the highest determination coefficients and the lowest error values. RF achieved an $R^2$ of 0.707, with corresponding mean MAE and RMSE values of 86.58 and 116.51, respectively. ET followed closely with a macro-averaged $R^2$ of 0.701, MAE of 87.36, and RMSE of 113.41. At the level of individual responses, RF attained $R^2$ values ranging from 0.493 (TPC) to 0.837 (TFC), while ET ranged from 0.393 (TPC) to 0.839 (TFC). Both models exhibited particularly strong performance for FRAP, DPPH, TFC, TAC, and AAC, highlighting their ability to capture complex nonlinear interactions among extraction parameters.
By contrast, linear and kernel-based methods performed markedly worse. The ridge-regularized response surface model reached a macro-averaged $R^2$ of 0.216, while PLS fell below zero with −0.054. SVR and KNN produced only marginally positive determination coefficients ($R^2$ of 0.073 and 0.051, respectively) and exhibited high error levels. The MLP surpassed the linear baselines but remained limited at an $R^2$ of 0.315, reflecting the challenges of training neural networks on very small datasets.
Overall, the analysis demonstrates that ensemble tree regressors are the most appropriate choice for this experimental context. RF, in particular, not only provided the highest mean accuracy and robust performance across most responses but also offered interpretability through variable importance indices and partial dependence profiles. For these reasons, RF was selected as the final working model, delivering a strong and realistic predictive framework for the recovery of antioxidant compounds and confirming the potential of AI to complement and enhance classical statistical optimization. From an industrial perspective, this comparative analysis highlights the superiority of ensemble methods as robust surrogates for process simulators. Their ability to generalize from limited experimental runs suggests that AI-driven regressors can serve as digital twins for extraction processes, enabling real-time monitoring and adaptive control in manufacturing environments.

3.7. Feature Importance Analysis Across RF-Based Model

The feature importance analysis of the RF regressor consistently identified C as the most influential predictor among the three process variables. According to the Gini importance (Figure 11), C accounted for the largest proportion of impurity reduction (≈0.79), while E and t contributed more modestly (≈0.10–0.12). The permutation-based evaluation on the held-out test set (Figure 12), which quantifies the increase in prediction error when variable values are randomized, supported this trend. Randomizing C led to a marked deterioration in predictive performance (ΔMSE ≈ 1.75), whereas E showed only a minor effect (ΔMSE ≈ 0.04) and t had no detectable contribution within the experimental range.
The general agreement between Gini and permutation importance strengthens the conclusion that C is the primary driver of the extraction responses modeled here. At the same time, the comparatively small dataset and possible correlations among predictors warrant caution: importance estimates may be sensitive to the experimental design and should be interpreted as indicative rather than definitive. Future work with expanded data could further validate these patterns and explore potential interaction effects that may not be fully resolved in the present study.
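The permutation-based measure used above can be written down compactly. The following sketch is a generic, dependency-free illustration rather than the routine used in this study: `predict` is any fitted model exposed as a function of one feature row, and the number of repeats and the seed are arbitrary assumptions.

```python
import random


def mse(y, y_hat):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled in turn.

    X is a list of feature rows (lists); predict maps one row to a prediction.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += mse(y, [predict(row) for row in X_perm]) - baseline
        importances.append(total / n_repeats)
    return importances
```

A feature that the model ignores yields an importance of exactly zero, which is the behavior reported for t within the experimental range.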

3.8. Actual vs. Predicted Performance RF-Based Model

Figure 13 presents parity plots comparing the values predicted by the RF regression model with the corresponding experimentally reported values for six antioxidant response variables, TPC, FRAP, DPPH, TFC, TAC, and AAC. In each panel, the predicted values are plotted against the reported real values, with the red dashed line denoting the line of identity (y = x). The degree of alignment of the data points with this line provides a direct visual assessment of model accuracy.
The $R^2$ is displayed in each subplot as a quantitative measure of predictive performance. Across the evaluated targets, the RF regressor demonstrated consistently high predictive agreement with reported values, with $R^2$ exceeding 0.90 for TPC, FRAP, TFC, and AAC, and slightly lower but still substantial explanatory power for TAC ($R^2$ = 0.83) and DPPH ($R^2$ = 0.70). The proximity of most points to the diagonal identity line further underscores the robustness of the model in capturing the intrinsic relationships between the predictor variables and the antioxidant activity outcomes under investigation.
Collectively, this visualization evidences that the RF approach achieves strong predictive fidelity for most antioxidant-related traits, thereby supporting its suitability as a regression framework for multivariate biochemical modeling in food and nutritional science.

3.9. Partial Dependence Analysis of RF-Based Model

Partial dependence plots (PDP) with individual conditional expectation (ICE) curves were generated to elucidate the marginal effects of ethanol concentration, ultrasonic power, and extraction time on the predictive responses of the RF model across all biochemical targets (Figure 14).
For TPC (Figure 14A), C was the dominant factor, showing an initial increase followed by a decline at higher levels, while E induced a gradual negative shift and t displayed a slight positive effect. For FRAP (Figure 14B), C again emerged as the primary driver, producing sharp positive effects at low levels followed by a decline at higher concentrations, whereas E and t had comparatively minor influences. For DPPH radical scavenging activity (Figure 14C), a similar pattern was observed: C strongly shaped the response with nonlinear behavior, while E and t produced flatter or weakly modulating effects. For TFC (Figure 14D), C exhibited a clear positive dependence up to moderate values, after which the trend declined, indicating an optimal range; E and t had only limited influence. For TAC (Figure 14E), C followed a comparable nonlinear trajectory, with increases at lower levels and decreases at higher ones, while E and t again played secondary roles. Finally, for AAC (Figure 14F), C showed a strong negative association, with higher concentrations reducing the predicted response, whereas E and t had comparatively minor and near-linear effects.
Collectively, these PDP/ICE results (Figure 14) confirm that C is the most influential extraction parameter across all antioxidant traits, while E and t play secondary or negligible roles. This reinforces the primacy of ethanol concentration in governing the recovery and predictive estimation of antioxidant activities within the studied system. Such interpretability analyses are critical for industrial adoption, as they provide transparent relationships between process parameters and predicted outcomes. This transparency facilitates operator trust, regulatory compliance, and integration of AI models into cyber-physical production systems for intelligent process optimization.
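For readers unfamiliar with how PDP and ICE curves are obtained, the computation can be sketched as follows. This is a generic illustration under stated assumptions: `predict` is any fitted model exposed as a function of one feature row, and the function names are hypothetical rather than taken from the software used in this study.

```python
def ice_curves(predict, X, feature, grid):
    """One curve per observation: vary one feature over grid, hold the rest fixed."""
    return [[predict(row[:feature] + [g] + row[feature + 1:]) for g in grid]
            for row in X]


def partial_dependence(predict, X, feature, grid):
    """Partial dependence is the pointwise average of the ICE curves."""
    curves = ice_curves(predict, X, feature, grid)
    return [sum(curve[k] for curve in curves) / len(curves)
            for k in range(len(grid))]
```

A response that rises and then falls along the grid, as described for C in most panels, appears as an interior maximum of the averaged curve, while diverging ICE curves would point to interactions with the other predictors.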

3.10. Model Prediction Accuracy at Optimal Conditions

Figure 15 compares the experimentally reported values with the RF model predictions for six antioxidant endpoints: TPC, FRAP, DPPH, AAC, TFC, and TAC. Bars represent the mean values under optimal extraction conditions, and error bars denote the standard deviations.
Overall, the RF regressor demonstrated a high degree of concordance with the reported experimental values across all targets. For TPC, the predicted mean (147.6 ± 45.8) closely matched the reported value (145.9 ± 61.0). Similarly, FRAP predictions (1614.8 ± 446.3) aligned well with the observed data (1599.8 ± 436.5), and DPPH was also well approximated, with the RF estimate (1211.3 ± 483.4) closely following the experimental measurement (1157.6 ± 457.9). Predictions for AAC (10.4 ± 3.3) were in close agreement with the reported mean (10.6 ± 2.9), while TFC (46.9 ± 19.0) remained consistent with its observed value (46.4 ± 16.3). Finally, TAC was also accurately predicted (722.6 ± 223.2 versus 723.9 ± 193.7).
Taken together, these results confirm that the RF model is capable of reliably reproducing the experimental responses under optimal extraction conditions, with all six antioxidant indicators predicted within experimental variability.
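The statement that all predictions fall within experimental variability can be checked directly from the values quoted above. In the sketch below, "within variability" is interpreted as the predicted mean lying within one reported standard deviation of the reported mean; this criterion is an assumption made for illustration, since the text does not define it formally.

```python
# (reported mean, reported SD, RF-predicted mean), as quoted in the text.
RESULTS = {
    "TPC": (145.9, 61.0, 147.6),
    "FRAP": (1599.8, 436.5, 1614.8),
    "DPPH": (1157.6, 457.9, 1211.3),
    "AAC": (10.6, 2.9, 10.4),
    "TFC": (46.4, 16.3, 46.9),
    "TAC": (723.9, 193.7, 722.6),
}


def within_one_sd(observed, sd, predicted):
    """True if the prediction lies within one SD of the observed mean."""
    return abs(predicted - observed) <= sd


def relative_deviation(observed, predicted):
    """Absolute deviation as a percentage of the observed mean."""
    return abs(predicted - observed) / observed * 100.0


summary = {name: (within_one_sd(obs, sd, pred), relative_deviation(obs, pred))
           for name, (obs, sd, pred) in RESULTS.items()}

for name, (ok, dev) in summary.items():
    print(f"{name}: within one SD = {ok}, deviation = {dev:.2f}%")
```

Under this criterion all six endpoints pass, and the largest relative deviation belongs to DPPH.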

4. Conclusions

In the present study, the initial objective was to optimize the UAE parameters for the extraction of various antioxidant compounds from PPs. After extensive statistical analysis, the optimal conditions were defined as 33% v/v aqueous ethanol at 60% ultrasonic power for 15 min, and they appear to perform very well on PPs. These results can serve as a basis for further study and, ultimately, for the utilization of PPs by the pharmaceutical, cosmetic, food supplement, food and feed industries.
The RF regressor demonstrated strong potential for modeling and predicting the antioxidant and phytochemical responses of PP extracts under UAE conditions. Across all six target variables (TPC, FRAP, DPPH, AAC, TFC, and TAC), the RF model was able to reproduce reported experimental outcomes within the range of experimental variability, highlighting its predictive reliability under optimal conditions. Partial dependence analyses further revealed that the ethanol concentration of the solvent was the dominant factor influencing the responses, while ultrasonic power and extraction time played comparatively minor roles.
Future research should prioritize the expansion of experimental datasets to enhance model training and validation, the application of advanced generative methods for realistic data augmentation, and the execution of real-world testing of model predictions. Furthermore, assessing model transferability across various plant matrices and extraction systems, along with validating the process at an industrial scale, will be crucial for the wider practical application of AI-assisted extraction optimization. In addition, positioning the RF-based framework as a decision-support system underscores its potential role in industrial digitalization strategies. By bridging green extraction technologies with AI-driven optimization, the present study contributes to the broader agenda of intelligent, sustainable, and scalable process engineering.

Author Contributions

Conceptualization, V.A. and S.I.L.; methodology, V.A.; software, V.A.; validation, V.A.; formal analysis, M.M. and V.A.; investigation, M.M. and E.B.; resources, S.I.L.; data curation, M.M. and K.G.L.; writing—original draft preparation, M.M. and K.G.L.; writing—review and editing, V.A., M.M., K.G.L., E.B. and S.I.L.; visualization, M.M. and K.G.L.; supervision, V.A. and S.I.L.; project administration, S.I.L.; funding acquisition, S.I.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ain, H.B.U.; Tufail, T.; Bashir, S.; Ijaz, N.; Hussain, M.; Ikram, A.; Farooq, M.A.; Saewan, S.A. Nutritional Importance and Industrial Uses of Pomegranate Peel: A Critical Review. Food Sci. Nutr. 2023, 11, 2589–2598.
  2. El-Kady, A.M.; Abdel-Rahman, I.A.M.; Fouad, S.S.; Allemailem, K.S.; Istivan, T.; Ahmed, S.F.M.; Hasan, A.S.; Osman, H.A.; Elshabrawy, H.A. Pomegranate Peel Extract Is a Potential Alternative Therapeutic for Giardiasis. Antibiotics 2021, 10, 705.
  3. Drinić, Z.; Mudrić, J.; Zdunić, G.; Bigović, D.; Menković, N.; Šavikin, K. Effect of Pomegranate Peel Extract on the Oxidative Stability of Pomegranate Seed Oil. Food Chem. 2020, 333, 127501.
  4. Kaderides, K.; Kyriakoudi, A.; Mourtzinos, I.; Goula, A.M. Potential of Pomegranate Peel Extract as a Natural Additive in Foods. Trends Food Sci. Technol. 2021, 115, 380–390.
  5. Moghadam, M.; Salami, M.; Mohammadian, M.; Khodadadi, M.; Emam-Djomeh, Z. Development of Antioxidant Edible Films Based on Mung Bean Protein Enriched with Pomegranate Peel. Food Hydrocoll. 2020, 104, 105735.
  6. Azmat, F.; Safdar, M.; Ahmad, H.; Khan, M.R.J.; Abid, J.; Naseer, M.S.; Aggarwal, S.; Imran, A.; Khalid, U.; Zahra, S.M.; et al. Phytochemical Profile, Nutritional Composition of Pomegranate Peel and Peel Extract as a Potential Source of Nutraceutical: A Comprehensive Review. Food Sci. Nutr. 2024, 12, 661–674.
  7. Kumar, N.; Daniloski, D.; Pratibha, N.; D’Cunha, N.M.; Naumovski, N.; Petkoska, A.T. Pomegranate Peel Extract–A Natural Bioactive Addition to Novel Active Edible Packaging. Food Res. Int. 2022, 156, 111378.
  8. Ben-Ali, S. Application of Raw and Modified Pomegranate Peel for Wastewater Treatment: A Literature Overview and Analysis. Int. J. Chem. Eng. 2021, 2021, 8840907.
  9. Xiang, Q.; Li, M.; Wen, J.; Ren, F.; Yang, Z.; Jiang, X.; Chen, Y. The Bioactivity and Applications of Pomegranate Peel Extract: A Review. J. Food Biochem. 2022, 46, e14105.
  10. Abu-Niaaj, L.F.; Al-Daghistani, H.I.; Katampe, I.; Abu-Irmaileh, B.; Bustanji, Y.K. Pomegranate Peel: Bioactivities as Antimicrobial and Cytotoxic Agents. Food Sci. Nutr. 2024, 12, 2818–2832.
  11. Carpentieri, S.; Soltanipour, F.; Ferrari, G.; Pataro, G.; Donsì, F. Emerging Green Techniques for the Extraction of Antioxidants from Agri-Food By-Products as Promising Ingredients for the Food Industry. Antioxidants 2021, 10, 1417.
  12. Athanasiadis, V.; Mantiniotou, M.; Kalompatsios, D.; Makrygiannis, I.; Alibade, A.; Lalas, S.I. Evaluation of Antioxidant Properties of Residual Hemp Leaves Following Optimized Pressurized Liquid Extraction. AgriEngineering 2025, 7, 1.
  13. Pereira, T.C.; Souza, V.P.; Padilha, A.P.F.; Duarte, F.A.; Flores, E.M. Trends and Perspectives on the Ultrasound-Assisted Extraction of Bioactive Compounds Using Natural Deep Eutectic Solvents. Curr. Opin. Chem. Eng. 2025, 47, 101088.
  14. Rao, M.V.; Sengar, A.S.; Rawson, A. Ultrasonication-A Green Technology Extraction Technique for Spices: A Review. Trends Food Sci. Technol. 2021, 116, 975–991.
  15. Guddi, K.; Sarkar, A. Optimization of Green Extraction Technologies for Recovering Bioactive Compounds from Ixora coccinea Waste Flower Biomass: A Comparative Response Surface Methodology and Artificial Neural Network Modeling. Sustain. Chem. Pharm. 2024, 42, 101830.
  16. Şahin, S.; Kurtulbaş, E. Green Extraction and Valorization of By-Products from Food Processing. Foods 2024, 13, 1589.
  17. Harikrishnan, S.; Kaushik, D.; Rasane, P.; Kumar, A.; Kaur, N.; Reddy, C.K.; Proestos, C.; Oz, F.; Kumar, M. Artificial Intelligence in Sustainable Food Design: Technological, Ethical Consideration, and Future. Trends Food Sci. Technol. 2025, 163, 105152.
  18. Datta, B.; Buehler, M.J.; Chow, Y.; Gligoric, K.; Jurafsky, D.; Kaplan, D.L.; Ledesma-Amaro, R.; Missier, G.D.; Neidhardt, L.; Pichara, K.; et al. AI for Sustainable Future Foods. arXiv 2025, arXiv:2509.21556.
  19. Mantiniotou, M.; Athanasiadis, V.; Liakos, K.G.; Bozinou, E.; Lalas, S.I. Artificial Intelligence and Extraction of Bioactive Compounds: The Case of Rosemary and Pressurized Liquid Extraction. Processes 2025, 13, 1879.
  20. Zhang, Y.; Bao, X.; Zhu, Y.; Dai, Z.; Shen, Q.; Xue, Y. Advances in Machine Learning Screening of Food Bioactive Compounds. Trends Food Sci. Technol. 2024, 150, 104578.
  21. Hu, J.; Li, H.; Liu, J.; Du, S. Review of Intelligent Modeling for Sintering Process Under Variable Operating Conditions. Processes 2025, 13, 180.
  22. Athanasiadis, V.; Chatzimitakos, T.; Mantiniotou, M.; Kalompatsios, D.; Bozinou, E.; Lalas, S.I. Investigation of the Polyphenol Recovery of Overripe Banana Peel Extract Utilizing Cloud Point Extraction. Eng 2023, 4, 3026–3038.
  23. Lee, J.; Durst, R.W.; Wrolstad, R.E. Collaborators: Determination of Total Monomeric Anthocyanin Pigment Content of Fruit Juices, Beverages, Natural Colorants, and Wines by the pH Differential Method: Collaborative Study. J. AOAC Int. 2005, 88, 1269–1278.
  24. Athanasiadis, V.; Chatzimitakos, T.; Mantiniotou, M.; Bozinou, E.; Lalas, S.I. Exploring the Antioxidant Properties of Citrus Limon (Lemon) Peel Ultrasound Extract after the Cloud Point Extraction Method. Biomass 2024, 4, 202–216.
  25. Chemat, F.; Rombaut, N.; Sicaire, A.-G.; Meullemiestre, A.; Fabiano-Tixier, A.-S.; Abert-Vian, M. Ultrasound Assisted Extraction of Food and Natural Products. Mechanisms, Techniques, Combinations, Protocols and Applications. A Review. Ultrason. Sonochem. 2017, 34, 540–560.
  26. Frosi, I.; Montagna, I.; Colombo, R.; Milanese, C.; Papetti, A. Recovery of Chlorogenic Acids from Agri-Food Wastes: Updates on Green Extraction Techniques. Molecules 2021, 26, 4515.
  27. Kalompatsios, D.; Athanasiadis, V.; Mantiniotou, M.; Lalas, S.I. Optimization of Ultrasonication Probe-Assisted Extraction Parameters for Bioactive Compounds from Opuntia macrorhiza Using Taguchi Design and Assessment of Antioxidant Properties. Appl. Sci. 2024, 14, 10460.
  28. Picot-Allain, C.; Mahomoodally, M.F.; Ak, G.; Zengin, G. Conventional versus Green Extraction Techniques—a Comparative Perspective. Curr. Opin. Food Sci. 2021, 40, 144–156.
  29. Abdu, O.H.; Saeed, A.A.M.; Fdhel, T.A. Polyphenols/Flavonoids Analysis and Antimicrobial Activity in Pomegranate Peel Extracts. Electron. J. Univ. Aden Basic Appl. Sci. 2020, 1, 14–19.
  30. Kanlayavattanakul, M.; Chongnativisit, W.; Chaikul, P.; Lourith, N. Phenolic-Rich Pomegranate Peel Extract: In Vitro, Cellular, and In Vivo Activities for Skin Hyperpigmentation Treatment. Planta Med. 2020, 86, 749–759.
  31. Rababah, T.M.; Banat, F.; Rababah, A.; Ereifej, K.; Yang, W. Optimization of Extraction Conditions of Total Phenolics, Antioxidant Activities, and Anthocyanin of Oregano, Thyme, Terebinth, and Pomegranate. J. Food Sci. 2010, 75, C626–C632.
  32. Elfalleh, W.; Hannachi, H.; Tlili, N.; Yahia, Y.; Nasri, N.; Ferchichi, A. Total Phenolic Contents and Antioxidant Activities of Pomegranate Peel, Seed, Leaf and Flower. J. Med. Plants Res. 2012, 6, 4724–4730.
  33. Kaderides, K.; Papaoikonomou, L.; Serafim, M.; Goula, A.M. Microwave-Assisted Extraction of Phenolics from Pomegranate Peels: Optimization, Kinetics, and Comparison with Ultrasounds Extraction. Chem. Eng. Process.-Process Intensif. 2019, 137, 1–11.
  34. Li, Y.; Guo, C.; Yang, J.; Wei, J.; Xu, J.; Cheng, S. Evaluation of Antioxidant Properties of Pomegranate Peel Extract in Comparison with Pomegranate Pulp Extract. Food Chem. 2006, 96, 254–260.
Figure 1. Distributions of process variables (C, E, t) and antioxidant responses (TPC, FRAP, DPPH, TFC, TAC, AAC). Each panel shows a histogram with kernel-density overlay in original units. Red lines represent smoothed kernel density estimates, highlighting the underlying distribution shape of each variable. Sampling across the process factors is broadly balanced with limited mass at extreme settings, whereas responses differ markedly in variance—highest for FRAP and DPPH and lowest for AAC and TFC—supporting the use of scaling and per-target evaluation in subsequent modeling.
Figure 2. Combined boxplots of predictors (C, E, t) and antioxidant responses (TPC, FRAP, DPPH, TFC, TAC, AAC) in original units. Boxes denote interquartile ranges with medians; whiskers extend to 1.5 × IQR. Predictors show moderate dispersion with no severe outliers, while responses vary markedly in scale, with FRAP and DPPH the most variable and AAC and TFC the most compact, motivating target standardization and per-target evaluation.
Figure 3. Data matrix heatmap in original units for predictors (C, E, t) and responses (TPC, FRAP, DPPH, TFC, TAC, AAC). Color intensity reflects raw value magnitude. Large dynamic ranges in FRAP, DPPH, and TAC dominate the palette, whereas AAC and TFC vary over narrower ranges.
Figure 4. Data matrix heatmap after standardization (z-scores). Each cell shows the standardized deviation from the variable mean, enabling cross-variable comparison. Coherent profiles emerge across FRAP, DPPH, and, to a lesser extent, TAC, while AAC is comparatively uniform, consistent with earlier dispersion summaries.
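The standardization behind this heatmap can be sketched in a few lines. This is a minimal illustration, assuming the population standard deviation is used (the text does not say which estimator was applied); `zscores` is a hypothetical helper name, and the AAC values are the design-point means listed in Table 4.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of numbers to zero mean and unit standard deviation.

    Assumption: population standard deviation (divide by n).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant column cannot be standardized")
    return [(v - mu) / sigma for v in values]

# AAC means (mg/g dw) for the 15 design points of Table 4
aac = [13.32, 6.75, 9.91, 10.9, 5.01, 13.9, 6.11, 5.46,
       12.1, 6.49, 14.33, 11.86, 11.44, 14.26, 11.99]
aac_z = zscores(aac)
```

After this transform every column has the same scale, which is what makes cells in different columns of the heatmap comparable.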
Figure 5. Pearson correlation matrix for predictors (C, E, t) and antioxidant responses (TPC, FRAP, DPPH, TFC, TAC, AAC). Warm colors denote positive correlations and cool colors negative correlations. A strong positive block is evident among FRAP, DPPH, TFC, and TAC, with TPC moderately aligned and AAC more weakly connected; C is negatively correlated with all responses, while E and t show little linear association.
Figure 6. For TPC, plot (A) represents the covariation of X1 (ethanol concentration, C, % v/v) and X2 (ultrasonic power, E, %); plot (B) shows the covariation of X1 and X3 (extraction time, t, min); plot (C) illustrates the covariation of X2 and X3. For FRAP, plot (D) shows the covariation of X1 and X2; plot (E) presents the covariation of X1 and X3; plot (F) illustrates the covariation of X2 and X3. For DPPH, plot (G) represents the covariation of X1 and X2; plot (H) depicts the covariation of X1 and X3; plot (I) illustrates the covariation of X2 and X3. For TFC, plot (J) represents the covariation of X1 and X2; plot (K) depicts the covariation of X1 and X3; plot (L) illustrates the covariation of X2 and X3. For TAC, plot (M) represents the covariation of X1 and X2; plot (N) depicts the covariation of X1 and X3; plot (O) illustrates the covariation of X2 and X3. For AAC, plot (P) represents the covariation of X1 and X2; plot (Q) depicts the covariation of X1 and X3; plot (R) illustrates the covariation of X2 and X3. The surface color gradient ranges from blue (lower response values) to red (higher response values), indicating the magnitude of each response variable across the design space. Purple lines highlight key response features such as ridges, optima, or contour intersections, aiding visual interpretation of variable interactions.
Figure 7. Pareto plots illustrating the significance of parameter estimates for the UAE technique across TPC (A), FRAP (B), DPPH (C), TFC (D), TAC (E), and AAC (F). Positive estimates are shown in blue, while negative ones are represented in red. A pink asterisk (*) marks a statistically significant effect at p < 0.05. Vertical reference lines indicate significance thresholds: the dotted line corresponds to the standard p = 0.05 limit, while the solid line represents the Bonferroni-corrected threshold. Bars extending beyond these lines denote statistically significant contributions to the model.
Figure 8. PCA for the measured variables. Each X variable (X1, X2, X3) is presented with a blue color. Red arrows indicate the direction and magnitude of each measured variable’s contribution to the principal components, with their orientation reflecting correlation and influence. The red circle represents the correlation circle, defining the maximum possible loading. Dotted blue arrows show the projection of each X variable in the PCA space, illustrating their relative positioning and contribution to the multivariate structure.
Figure 9. (A) Optimization of UAE conditions for PP antioxidants using a PLS-based prediction profiler and desirability function with extrapolation control. Red dashed lines indicate the optimal values of each independent variable (X1, X2, X3) that maximize the composite desirability score. Squares represent the predicted response values at these optimal conditions. (B) Variable Importance Plot (VIP) highlighting the predictors driving model performance in the UAE framework, with the 0.8 threshold (red dashed line) marking significant contributors.
Figure 10. Comparative performance of regression models. Bars represent macro-averaged R², MAE, and RMSE across all antioxidant responses. Numerical values are annotated above each bar to facilitate direct comparison. Plot (A) presents model performance on the training set, and plot (B) presents model performance on the test set.
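Macro-averaging means computing each metric per response and then taking the unweighted mean over responses. The sketch below illustrates that idea only; the function names and the two-target toy data are hypothetical, and it assumes the targets are already on a comparable scale, since averaging MAE or RMSE across responses in different units would otherwise be dominated by the largest-valued response.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination for one target."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error for one target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error for one target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def macro_average(metric, truths, preds):
    """Unweighted mean of a per-target metric; inputs map target name -> list."""
    return sum(metric(truths[k], preds[k]) for k in truths) / len(truths)

# Hypothetical toy data: target "a" is predicted perfectly, target "b" is not.
truths = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]}
preds = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 33.0]}

macro_r2 = macro_average(r2_score, truths, preds)
macro_mae = macro_average(mae, truths, preds)
macro_rmse = macro_average(rmse, truths, preds)
```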
Figure 11. Random Forest feature importance based on the Gini criterion. Bars indicate the relative reduction in impurity attributed to each predictor variable during tree construction. Ethanol concentration (C) emerges as the most influential factor, whereas ultrasonic power (E) and extraction time (t) contribute more modestly.
Figure 12. Random Forest permutation importance evaluated on the held-out test set. Bars represent the mean increase in prediction error (ΔMSE) when the values of each predictor variable are randomly permuted. Randomizing ethanol concentration (C) caused the largest deterioration in predictive performance, while ultrasonic power (E) and extraction time (t) showed only minor or negligible effects.
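The permutation procedure described in this caption can be sketched without any modelling library. The example is an illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the fitted Random Forest (it depends strongly on C, weakly on t, and not at all on E), the rows of `X` are the coded settings of the 15 design points, and `permutation_importance` is a hand-written helper rather than a library call.

```python
import random

def mse(model, X, y):
    """Mean squared error of `model` on rows X with targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increases.append(mse(model, X_perm, y) - base)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical surrogate: strong effect of C, weak effect of t, none of E."""
    c, e, t = row
    return 200.0 - 50.0 * c - 50.0 * c * c + 5.0 * t

# Coded (C, E, t) settings of the 15 design points
X = [[-1, 1, 0], [1, -1, 0], [0, -1, -1], [0, 1, 1], [1, 0, -1],
     [-1, 0, 1], [1, 0, 1], [0, -1, 1], [0, 1, -1], [1, 1, 0],
     [-1, 0, -1], [0, 0, 0], [0, 0, 0], [-1, -1, 0], [0, 0, 0]]
y = [toy_model(row) for row in X]

delta_mse = permutation_importance(toy_model, X, y)  # order: C, E, t
```

With this surrogate the ranking mirrors the pattern in the figure: shuffling C degrades the fit most, shuffling t a little, and shuffling E not at all.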
Figure 13. Parity plots comparing RF regression predictions with experimentally reported values for TPC, FRAP, DPPH, TFC, TAC, and AAC. Predicted values (y-axis) are plotted against reported values (x-axis), with the red dashed line representing the line of identity (y = x). Blue circles denote individual data points used for model training and validation. The coefficient of determination (R²) is indicated in each panel, highlighting strong predictive agreement for most response variables.
Figure 14. Partial dependence plots (PDP) with individual conditional expectation (ICE) curves showing the marginal effects of ethanol concentration (C), ultrasonic power (E), and extraction time (t) on Random Forest (RF) model predictions. Response variables include: (A) total phenolic content (TPC), (B) ferric-reducing antioxidant power (FRAP), (C) DPPH radical scavenging activity (DPPH), (D) total flavonoid content (TFC), (E) total anthocyanin content (TAC), and (F) ascorbic acid content (AAC). The dashed blue line represents the average partial dependence across all samples, while light blue curves denote ICE trajectories for individual samples, capturing sample-level variability in response to each feature. Across all responses, ethanol concentration exerted the strongest influence, typically showing nonlinear patterns with initial increases followed by declines at higher levels, whereas ultrasonic power and extraction time contributed comparatively weaker and less consistent effects.
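The relation between ICE curves and the partial dependence curve can be made concrete with a short sketch. It is illustrative only: `toy_model` is a hypothetical surrogate with a concave dependence on coded C, not the fitted RF model, and `ice_curves` and `partial_dependence` are hand-written helpers.

```python
def ice_curves(model, X, feature, grid):
    """One curve per sample: predictions as `feature` sweeps `grid`, other features fixed."""
    curves = []
    for row in X:
        curve = []
        for g in grid:
            modified = list(row)
            modified[feature] = g
            curve.append(model(modified))
        curves.append(curve)
    return curves

def partial_dependence(model, X, feature, grid):
    """Pointwise average of the ICE curves."""
    curves = ice_curves(model, X, feature, grid)
    return [sum(c[i] for c in curves) / len(curves) for i in range(len(grid))]

def toy_model(row):
    """Hypothetical surrogate with a maximum at an intermediate level of C."""
    c, e, t = row
    return 200.0 - 50.0 * c - 52.0 * c * c + 15.0 * t

# A few coded (C, E, t) settings used as background samples
X = [[-1, 1, 0], [1, -1, 0], [0, -1, -1], [0, 1, 1], [0, 0, 0]]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

ice = ice_curves(toy_model, X, feature=0, grid=grid)
pdp = partial_dependence(toy_model, X, feature=0, grid=grid)
best_c = grid[pdp.index(max(pdp))]
```

Because the surrogate has no interaction terms, its ICE curves are vertical shifts of one another; spread in the shape of ICE curves, as opposed to their height, is what signals interactions in a real model.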
Figure 15. Model prediction accuracy of the Random Forest (RF) regressor at optimal extraction conditions for total phenolic content (TPC), ferric-reducing antioxidant power (FRAP), DPPH radical scavenging activity (DPPH), ascorbic acid content (AAC), total flavonoid content (TFC), and total anthocyanin content (TAC). Bars represent mean values with error bars indicating standard deviations. Across all targets, RF predictions closely matched the reported experimental values, confirming the model’s ability to reproduce biochemical responses within experimental variability.
Table 1. Independent variables levels used in experimental design.
| Independent variable | Coded unit | Level −1 | Level 0 | Level 1 |
|---|---|---|---|---|
| Ethanol concentration (C, % v/v) | X1 | 0 | 50 | 100 |
| Ultrasonic power (E, %) | X2 | 60 | 80 | 100 |
| Extraction time (t, min) | X3 | 5 | 15 | 25 |
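The 15 settings listed in Table 4 have the structure of a three-factor Box-Behnken design with three centre points: each pair of factors is varied at ±1 while the third stays at 0. The sketch below generates that structure from the levels in Table 1; the helper names are hypothetical, and the run order it produces is not the order used in the experiments.

```python
from itertools import combinations, product

# Actual levels for coded values -1, 0, +1 (Table 1)
LEVELS = {"C": (0, 50, 100), "E": (60, 80, 100), "t": (5, 15, 25)}

def box_behnken(n_factors=3, n_center=3):
    """Coded runs: every factor pair at (+-1, +-1) with the rest at 0, plus centre points."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * n_factors] * n_center)
    return runs

def decode(run):
    """Map a coded run to actual units using LEVELS."""
    return tuple(LEVELS[name][code + 1] for name, code in zip(LEVELS, run))

design = [decode(run) for run in box_behnken()]
```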
Table 2. Machine-learning regressors and hyperparameters explored (default settings were used for all parameters not listed).
| Model | Tuned parameter | Values tested |
|---|---|---|
| RSM_Ridge | ridge_alpha | 0.1, 1.0, 10, 100 |
| PLS | n_components | 1, 2, 3 |
| KNN | n_neighbors | 3, 5, 7 |
| KNN | weights | uniform, distance |
| SVR | C | 1, 10 |
| SVR | epsilon | 0.1, 0.2 |
| SVR | γ | scale |
| MLP | hidden_layer_sizes | (64), (64,64), (128) |
| MLP | alpha | 1 × 10⁻⁴, 1 × 10⁻³ |
| RF | max_depth | None, 6, 10 |
| RF | min_samples_leaf | 1, 2, 4 |
| ET | max_depth | None, 6, 10 |
| ET | min_samples_leaf | 1, 2, 4 |
| XGB | n_estimators | 300, 600 |
| XGB | max_depth | 3, 6 |
| XGB | learning_rate | 0.03, 0.10 |
| XGB | subsample | 0.7, 1.0 |
| XGB | colsample_bytree | 0.7, 1.0 |
| XGB | min_child_weight | 1, 5 |
| XGB | reg_lambda | 1.0, 5.0 |
| XGB | reg_alpha | 0.0, 0.1 |
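To give a sense of the search cost implied by Table 2, the sketch below expands each grid into its candidate configurations. It assumes an exhaustive Cartesian search over the listed values, which the table does not state explicitly; `GRIDS` and `expand` are illustrative names, and `gamma` stands for the parameter printed as γ.

```python
from itertools import product

TREE_GRID = {"max_depth": [None, 6, 10], "min_samples_leaf": [1, 2, 4]}

GRIDS = {
    "RSM_Ridge": {"ridge_alpha": [0.1, 1.0, 10, 100]},
    "PLS": {"n_components": [1, 2, 3]},
    "KNN": {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]},
    "SVR": {"C": [1, 10], "epsilon": [0.1, 0.2], "gamma": ["scale"]},
    "MLP": {"hidden_layer_sizes": [(64,), (64, 64), (128,)], "alpha": [1e-4, 1e-3]},
    "RF": dict(TREE_GRID),
    "ET": dict(TREE_GRID),
    "XGB": {
        "n_estimators": [300, 600], "max_depth": [3, 6],
        "learning_rate": [0.03, 0.10], "subsample": [0.7, 1.0],
        "colsample_bytree": [0.7, 1.0], "min_child_weight": [1, 5],
        "reg_lambda": [1.0, 5.0], "reg_alpha": [0.0, 0.1],
    },
}

def expand(grid):
    """All parameter combinations of one grid, as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]

# Number of candidate configurations per model under the Cartesian assumption
n_candidates = {model: len(expand(grid)) for model, grid in GRIDS.items()}
```

Under that assumption the XGB grid is by far the largest, since eight two-valued parameters multiply out to 256 candidates.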
Table 3. Parameters of the SMOTE-like augmentation procedure used in ablation experiments.
| Parameter | Values tested | Description |
|---|---|---|
| Number of synthetic samples | 200, 1000 | Total number of synthetic points generated per training split |
| Nearest neighbors | 5, 7 | Number of neighbors considered in predictor space for interpolation |
| Target noise | 0.02, 0.05 | Gaussian noise added to interpolated targets |
| Clamping of predictors | True | Prevented extrapolation beyond empirical design space |
| Grouping in cross-validation | True | Synthetic samples inherited anchor index; all members retained in the same fold during GroupKFold |
| Synthetic sample weight | 0.2 | Reduced influence of synthetic points relative to real samples |
| Generation space | Standardized predictors and targets | Interpolation and perturbation applied after scaling |
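One way to read the parameters in Table 3 is as the following procedure: pick an anchor sample, interpolate towards one of its nearest neighbours, perturb the targets, clamp the predictors, and tag the result with the anchor index and a reduced weight. The sketch below is an interpretation of the table, not the authors' code; the function name, the uniform choice of anchor and neighbour, and the toy data are assumptions.

```python
import math
import random

def smote_like(X, Y, n_synthetic=200, k=5, target_noise=0.02, weight=0.2, seed=0):
    """Generate synthetic (x, y, group, weight) tuples from standardized rows X and Y."""
    rng = random.Random(seed)
    n_features = len(X[0])
    lo = [min(row[j] for row in X) for j in range(n_features)]
    hi = [max(row[j] for row in X) for j in range(n_features)]
    synthetic = []
    for _ in range(n_synthetic):
        anchor = rng.randrange(len(X))
        # k nearest neighbours of the anchor in predictor space, excluding itself
        neighbours = sorted((i for i in range(len(X)) if i != anchor),
                            key=lambda i: math.dist(X[anchor], X[i]))[:k]
        other = rng.choice(neighbours)
        lam = rng.random()  # interpolation fraction in [0, 1)
        x_new = [a + lam * (b - a) for a, b in zip(X[anchor], X[other])]
        # clamp predictors to the observed design space
        x_new = [min(max(v, l), h) for v, l, h in zip(x_new, lo, hi)]
        # interpolate targets and add Gaussian noise
        y_new = [a + lam * (b - a) + rng.gauss(0.0, target_noise)
                 for a, b in zip(Y[anchor], Y[other])]
        # group = anchor index, so the point can follow its anchor into the same CV fold
        synthetic.append((x_new, y_new, anchor, weight))
    return synthetic

# Hypothetical toy data: coded predictors and a single made-up target
X = [[-1, 1, 0], [1, -1, 0], [0, -1, -1], [0, 1, 1],
     [1, 0, -1], [-1, 0, 1], [1, 0, 1], [0, 0, 0]]
Y = [[-row[0] + 0.1 * row[2]] for row in X]

augmented = smote_like(X, Y)
```

Linear interpolation between two observed points cannot leave their bounding box, so the clamp is a no-op in this form; it would only matter if noise were also added to the predictors.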
Table 4. Experimental results for the three independent variables investigated and the corresponding responses of the dependent variables under the UAE process.
| Design point | C (%) (X1) | E (%) (X2) | t (min) (X3) | TPC * | TFC * | TAC * | AAC * | FRAP * | DPPH * |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 100 | 15 | 176.51 ± 2.86 | 65.57 ± 0.55 | 698.61 ± 37.91 | 13.32 ± 0.21 | 1717.18 ± 6.98 | 956.75 ± 18.97 |
| 2 | 100 | 60 | 15 | 115.83 ± 1.57 | 39.57 ± 1.62 | 552.49 ± 15.12 | 6.75 ± 0.48 | 1274.7 ± 5.27 | 771.48 ± 18.26 |
| 3 | 50 | 60 | 5 | 185.15 ± 3.49 | 71.8 ± 0.1 | 870.6 ± 50.64 | 9.91 ± 0.1 | 2099.51 ± 8.35 | 1558 ± 18.36 |
| 4 | 50 | 100 | 25 | 193.3 ± 5.65 | 85.45 ± 1.01 | 759.66 ± 48.41 | 10.9 ± 0.3 | 2193.96 ± 6.05 | 1843.73 ± 18.48 |
| 5 | 100 | 80 | 5 | 72.03 ± 1.41 | 23.61 ± 0.26 | 325.59 ± 51.03 | 5.01 ± 0.51 | 784.22 ± 8.33 | 435.21 ± 1.5 |
| 6 | 0 | 80 | 25 | 182.19 ± 3.96 | 51.15 ± 0.27 | 723.76 ± 36.97 | 13.9 ± 0.52 | 1931.04 ± 6.99 | 1650.79 ± 13.87 |
| 7 | 100 | 80 | 25 | 63.61 ± 2.98 | 26.2 ± 0.11 | 417.88 ± 49.26 | 6.11 ± 0.02 | 1010.61 ± 3.88 | 547.7 ± 24.67 |
| 8 | 50 | 60 | 25 | 199.09 ± 1.1 | 84.08 ± 0.31 | 827.3 ± 93.55 | 5.46 ± 0.07 | 2307.56 ± 1.47 | 1423.61 ± 18.43 |
| 9 | 50 | 100 | 5 | 78.8 ± 0.33 | 80.69 ± 1.46 | 893.78 ± 43.88 | 12.1 ± 0.16 | 2213.82 ± 8.4 | 1930.47 ± 19.6 |
| 10 | 100 | 100 | 15 | 60.73 ± 0.5 | 33.48 ± 0.63 | 514.63 ± 77.19 | 6.49 ± 0.44 | 1005.78 ± 6.68 | 724.91 ± 7.05 |
| 11 | 0 | 80 | 5 | 184.56 ± 5.1 | 66.15 ± 0.5 | 719.42 ± 19.06 | 14.33 ± 0.55 | 1222.61 ± 9.45 | 832.06 ± 15.53 |
| 12 | 50 | 80 | 15 | 200.46 ± 0.19 | 70.25 ± 0.66 | 985.33 ± 83.2 | 11.86 ± 0.26 | 2052.75 ± 7.6 | 1791.2 ± 44.31 |
| 13 | 50 | 80 | 15 | 198.98 ± 3.2 | 80.16 ± 0.15 | 968.04 ± 69.61 | 11.44 ± 0.22 | 2067.99 ± 5.76 | 1776.7 ± 27.09 |
| 14 | 0 | 60 | 15 | 174.05 ± 1.75 | 68.6 ± 0.41 | 869.74 ± 82.67 | 14.26 ± 0.12 | 2015.99 ± 7.39 | 1313.83 ± 14.49 |
| 15 | 50 | 80 | 15 | 194.15 ± 1.01 | 69.24 ± 0.5 | 963.58 ± 5.12 | 11.99 ± 0.11 | 2039.07 ± 7.96 | 1803.34 ± 30.98 |
* Values represent the mean of triplicate determinations; TPC, total polyphenol content (in mg GAE/g dw); TFC, total flavonoid content (in mg RtE/g dw), TAC, total anthocyanin content (in μg CyE/g dw, due to their lower concentration compared to the other antioxidant compounds, which are reported in mg/g dw); AAC, ascorbic acid content (in mg/g dw); FRAP, ferric-reducing antioxidant power (in μmol AAE/g dw); DPPH, antiradical activity (in μmol AAE/g dw).
Table 5. Analysis of variance (ANOVA) for the quadratic polynomial response surface model applied to the UAE process.
| Factor | TPC | TFC | TAC | AAC | FRAP | DPPH |
|---|---|---|---|---|---|---|
| Least squares regression | | | | | | |
| Intercept | 197.9 * | 73.22 * | 972.3 * | 11.76 * | 2053 * | 1790 * |
| X1 (ethanol concentration) | −50.6 * | −16.1 * | −150 * | −3.93 * | −351 * | −284 * |
| X2 (ultrasonic power) | −20.6 | 0.143 | −31.7 | 0.804 | −70.9 | 48.62 |
| X3 (extraction time) | 14.71 | 0.579 | −10.1 | −0.62 | 140.4 * | 88.76 |
| X1X2 | −14.4 | −0.773 | 33.32 | 0.17 | 7.472 | 77.63 |
| X1X3 | −1.51 | 4.398 | 21.99 | 0.383 | −121 | −177 |
| X2X3 | 25.14 | −1.88 | −22.7 | 0.813 | −57 | 11.91 |
| X1² | −52.3 * | −30.1 * | −302 * | −0.66 | −758 * | −836 * |
| X2² | −13.8 | 8.658 | −11.1 | −0.9 | 208.4 * | −13.1 |
| X3² | −20 | −1.37 | −123 * | −1.27 | −57.9 | −88.4 |
| ANOVA | | | | | | |
| F-value (model) | 6.798 | 14.17 | 17.74 | 5.062 | 18.15 | 4.688 |
| F-value (lack of fit) | 100.2 | 1.469 | 44.69 | 63.32 | 177.1 | 769.6 |
| p-Value (model) | 0.0241 * | 0.0047 * | 0.0028 * | 0.0444 * | 0.0026 * | 0.0518 |
| p-Value (lack of fit) | 0.0099 * | 0.4296 | 0.0220 * | 0.0156 * | 0.0056 * | 0.0013 * |
| R² | 0.924 | 0.962 | 0.97 | 0.901 | 0.97 | 0.894 |
| Adjusted R² | 0.788 | 0.894 | 0.915 | 0.723 | 0.917 | 0.703 |
| RMSE | 25.67 | 6.83 | 59.93 | 1.781 | 149.4 | 286.7 |
| CV | 35.32 | 33.76 | 27.8 | 31.33 | 29.51 | 38.56 |
| DF (total) | 14 | 14 | 14 | 14 | 14 | 14 |
* The values significantly affected responses at a probability level of 95% (p < 0.05). TPC, total polyphenol content; TFC, total flavonoid content; TAC, total anthocyanin content; AAC, ascorbic acid content; FRAP, ferric-reducing antioxidant power; DPPH, antiradical activity; ns, non-significant; F-value, test for comparing model variance with residual (error) variance; p-Value, probability of seeing the observed F-value if the null hypothesis is true; RMSE, root mean square error; CV, coefficient of variation; DF, degrees of freedom.
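As a worked check on how the coefficients of Table 5 are used, the sketch below evaluates the quadratic model for TPC at a given setting. It assumes the usual coding in which each factor is centred on its middle level and divided by its half-range (consistent with Table 1); the names `to_coded` and `predict` are illustrative.

```python
# TPC coefficients from Table 5 (coded units)
TPC = {
    "intercept": 197.9,
    "linear": (-50.6, -20.6, 14.71),                         # X1, X2, X3
    "interaction": {(0, 1): -14.4, (0, 2): -1.51, (1, 2): 25.14},
    "quadratic": (-52.3, -13.8, -20.0),                      # X1^2, X2^2, X3^2
}

CENTER = (50.0, 80.0, 15.0)       # middle levels of C, E, t (Table 1)
HALF_RANGE = (50.0, 20.0, 10.0)   # distance from the middle level to level +1

def to_coded(actual):
    """Convert (C, E, t) in actual units to coded units in [-1, 1]."""
    return tuple((a - c) / h for a, c, h in zip(actual, CENTER, HALF_RANGE))

def predict(coef, x):
    """Second-order polynomial: intercept + linear + interaction + quadratic terms."""
    y = coef["intercept"]
    y += sum(b * xi for b, xi in zip(coef["linear"], x))
    y += sum(b * x[i] * x[j] for (i, j), b in coef["interaction"].items())
    y += sum(b * xi * xi for b, xi in zip(coef["quadratic"], x))
    return y

tpc_center = predict(TPC, to_coded((50, 80, 15)))    # equals the intercept
tpc_optimum = predict(TPC, to_coded((28, 72, 16)))   # about 213.87 mg GAE/g dw
```

Evaluated at C = 28%, E = 72%, t = 16 min, the rounded coefficients give about 213.87 mg GAE/g dw, in line with the predicted TPC maximum of 213.85 reported in Table 6.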
Table 6. Predicted maximum responses of the dependent variables and the corresponding optimum extraction conditions obtained through the UAE process.
| Parameters | C (%) (X1) | E (%) (X2) | t (min) (X3) | Desirability | Least squares regression |
|---|---|---|---|---|---|
| TPC (mg GAE/g dw) | 28 | 72 | 16 | 0.9201 | 213.85 ± 35.21 |
| FRAP (μmol AAE/g dw) | 36 | 60 | 24 | 0.9912 | 2536.63 ± 315.81 |
| DPPH (μmol AAE/g dw) | 40 | 100 | 22 | 0.9186 | 1885.74 ± 526.97 |
| TFC (mg RtE/g dw) | 39 | 60 | 20 | 0.8994 | 84.05 ± 11.67 |
| TAC (μg CyE/g dw) | 34 | 60 | 15 | 0.9930 | 1020.6 ± 98.31 |
| AAC (mg/g dw) | 0 | 84 | 11 | 0.9920 | 15.26 ± 2.97 |
Table 7. Multivariate correlation analysis among the measured variables.
| Responses | TPC | FRAP | DPPH | TFC | TAC | AAC |
|---|---|---|---|---|---|---|
| TPC | - | 0.7900 | 0.7392 | 0.7607 | 0.8166 | 0.6960 |
| FRAP | | - | 0.9416 | 0.9444 | 0.8918 | 0.4622 |
| DPPH | | | - | 0.8941 | 0.8941 | 0.4742 |
| TFC | | | | - | 0.9188 | 0.5837 |
| TAC | | | | | - | 0.6240 |
| AAC | | | | | | - |
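For readers who want to recompute pairwise correlations, a plain Pearson coefficient can be sketched as below. The inputs are the FRAP and DPPH design-point means from Table 4; the values in Table 7 may have been computed from a different input (for example individual replicates), so this sketch is not expected to reproduce them to four decimals. The function name `pearson` is illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Design-point means from Table 4 (μmol AAE/g dw), design points 1 to 15
frap = [1717.18, 1274.7, 2099.51, 2193.96, 784.22, 1931.04, 1010.61, 2307.56,
        2213.82, 1005.78, 1222.61, 2052.75, 2067.99, 2015.99, 2039.07]
dpph = [956.75, 771.48, 1558, 1843.73, 435.21, 1650.79, 547.7, 1423.61,
        1930.47, 724.91, 832.06, 1791.2, 1776.7, 1313.83, 1803.34]

r_frap_dpph = pearson(frap, dpph)
```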
Table 8. Maximum desirability values for all response variables as determined by the partial least squares (PLS) prediction profiler under optimal UAE conditions.
| Parameters | C (%) (X1) | E (%) (X2) | t (min) (X3) | Desirability | PLS regression | Experimental values |
|---|---|---|---|---|---|---|
| TPC (mg GAE/g dw) | 33 | 60 | 15 | 0.8576 | 210.94 | 195.55 ± 4.11 |
| FRAP (μmol AAE/g dw) | | | | | 2366.89 | 2627.78 ± 94.6 |
| DPPH (μmol AAE/g dw) | | | | | 1755.17 | 1516.56 ± 78.86 |
| TFC (mg RtE/g dw) | | | | | 83.46 | 74.78 ± 3.59 |
| TAC (μg CyE/g dw) | | | | | 1020.28 | 992.87 ± 62.55 |
| AAC (mg/g dw) | | | | | 11.38 | 15.68 ± 0.93 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Mantiniotou, M.; Athanasiadis, V.; Liakos, K.G.; Bozinou, E.; Lalas, S.I. Ultrasound-Assisted Extraction of Antioxidant Compounds from Pomegranate Peels and Simultaneous Machine Learning Optimization Study. Processes 2025, 13, 3700. https://doi.org/10.3390/pr13113700
