Article

Interpretable Machine Learning Reveals Synergy-Gain Windows and Dual-Objective Mix-Proportion Boundaries for Compressive Strength and Peak Strain in Hybrid Steel–PVA Fiber-Reinforced Concrete

College of Civil Engineering and Geomatics, Nanning Campus, Guilin University of Technology, Nanning 530001, China
*
Author to whom correspondence should be addressed.
Buildings 2026, 16(10), 1927; https://doi.org/10.3390/buildings16101927
Submission received: 10 April 2026 / Revised: 5 May 2026 / Accepted: 9 May 2026 / Published: 12 May 2026
(This article belongs to the Section Building Structures)

Abstract

Hybrid steel–PVA fiber-reinforced concrete offers promise for enhancing both load-bearing capacity and deformation capacity. However, the coupled effects of fiber parameters and volume-fraction combinations on compressive strength (σc) and peak strain (εc) are still not fully understood, and a unified, interpretable, and engineering-oriented quantitative framework is lacking. This study compiled experimental data from 26 published sources, building a multi-source database of 397 samples for σc and 203 samples for εc. Based on this database, a comprehensive analytical framework was proposed, including model prediction, SHAP-based interpretation, Monte Carlo marginalization, synergy-gain window determination, and dual-objective mix-proportion optimization. For σc prediction, LightGBM achieved the highest test-set R² (0.9783), whereas CatBoost showed more robust error control (MAE = 2.7409 MPa). CatBoost was therefore selected as the base model for the subsequent interpretation analysis. For εc prediction, Bayesian-optimized CatBoost achieved the best test performance (R² = 0.9659, MAE = 0.0218, RMSE = 0.0358), while the transfer-learning model reached a comparable accuracy level (R² = 0.9650). SHAP analysis revealed that σc is mainly governed by matrix mix-proportion factors and steel fiber volume fraction, whereas εc is more sensitive to S/B and PVA-related variables. The mean synergy-gain maps generated via Monte Carlo marginalization and two-dimensional grid evaluation further showed clear differences between the two targets. Positive synergy in σc was highly localized, with a maximum mean synergy gain of 4.7949 MPa at (Steel, PVA) = (1.875%, 2.000%). By contrast, εc exhibited a wider positive-synergy region, with a peak value of 0.0141629 at (0.38%, 1.62%). Therefore, the engineering output of this study is not a single optimal mix point. Instead, it is a set of candidate windows for different performance targets, together with boundary-risk identification and priorities for experimental validation.

1. Introduction

Steel and PVA fibers play complementary roles in toughening concrete. Steel fibers mainly improve post-cracking load resistance and energy absorption, whereas PVA fibers are more effective in controlling microcracks and enhancing ductility. When the two fibers are properly combined in terms of properties and dosage, they may work together to improve both strength and deformation capacity. However, hybrid fiber systems involve the interaction of many factors, such as matrix composition, fiber geometry, mechanical properties, and fiber content. Because of this complexity, conventional experiments alone are often not sufficient to reveal the underlying interaction patterns in a systematic way.
In recent years, experimental studies on the mechanical performance of hybrid steel–PVA fiber-reinforced concrete have steadily increased. Zhou et al. [1] systematically investigated its uniaxial compressive constitutive behavior through an orthogonal experimental design and confirmed that the combined use of steel and PVA fibers can significantly improve failure behavior and energy dissipation. Liu et al. [2] reported that the hybrid-fiber effect is jointly governed by matrix mix proportion and fiber volume fraction. Abbas et al. [3] developed a compressive stress–strain constitutive model for hybrid steel–PVA fiber-reinforced concrete and quantified the effects of fiber parameters on peak stress and peak strain. Wu et al. [4] further demonstrated, from the perspective of flexural behavior, the synergistic advantages of hybrid fibers in post-cracking toughness and deformation capacity. These studies provide an important basis for understanding the reinforcing mechanisms of hybrid fibers. However, because of the limited experimental scope and sample size, it is still difficult to fully capture the nonlinear interactions arising from multi-factor coupling.
With the wider use of data-driven methods in building materials research, machine learning has become a common tool for predicting concrete performance and exploring hidden patterns in the data. Kang et al. [5] showed that tree-based models can capture the nonlinear behavior of fiber-reinforced concrete with good accuracy. Al-Shamasneh et al. [6] reported that ensemble learning performs robustly in predicting the compressive strength of steel fiber-reinforced concrete. Sofos et al. [7] and Cui et al. [8] further demonstrated that machine learning is applicable to complex material–structure problems involving FRP-confined concrete and related members.
Although machine learning has shown strong predictive ability in concrete research, most existing studies still use it mainly as a black-box predictor. Much less attention has been given to questions that matter more in engineering practice, such as whether synergistic enhancement really exists, in which volume-fraction ranges it occurs, and whether the observed pattern is supported by sufficient data.
To address this limitation, machine learning-driven multi-objective optimization has emerged as a key approach to overcoming the dimensional bottlenecks of experimental research. Zhang et al. [9,10,11] applied Pareto fronts and metaheuristic algorithms to examine trade-offs among strength, economic performance, and other indicators in normal, silica-fume, and recycled-aggregate concrete. However, these studies have yet to address the synergistic optimization of key mechanical properties in hybrid steel–PVA fiber-reinforced concrete.
For engineering design, it is essential to clarify how fiber reinforcement works. In this study, synergy gain is defined as the super-additive effect of hybrid steel–PVA fibers relative to single-fiber reinforcement. Thus, the key question is not only whether the prediction is accurate. It also includes whether synergistic fiber enhancement exists, in which volume-fraction ranges it appears, and under what data-support conditions it can be interpreted with reasonable confidence.
Therefore, this study does not treat machine learning simply as a black-box predictor. Instead, it combines machine learning with interpretable analysis, marginal-response modeling, and synergy-gain quantification to build a knowledge-extraction framework for engineering-oriented screening. Using compressive strength (σc) and peak strain (εc) as two target indicators of load-bearing capacity and deformation capacity, respectively, this study further proposes an overlay strategy for dual-objective synergy windows. This strategy provides support for preliminary mix screening and for setting priorities in experimental validation.
The main contributions of this study are as follows: (1) an interpretable machine learning framework was established for the dual objectives of σc and εc; (2) the mean synergy-gain surface of steel and PVA fibers was defined and quantified based on Monte Carlo marginalization; and (3) the synergy boundaries of σc and εc were identified, and a dual-objective mix-proportion screening logic was constructed.
Compared with most existing studies, which focus mainly on accuracy comparison, single-indicator interpretation, or single-performance prediction, the main novelty of this study lies in the identification of dual-objective synergy windows and in the design-oriented conclusions provided for material selection and mix-proportion optimization in engineering practice.

2. Materials and Methods

2.1. Dataset Construction and Definition of the Feature System

2.1.1. Data Sources and Sample Composition

The database used in this study was compiled from uniaxial compression test data on hybrid steel–PVA fiber-reinforced concrete collected from published literature and academic theses [1,3,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35]. After data cleaning, data standardization, and specimen-size normalization, the compressive-strength dataset (σc) contained 397 samples, whereas the peak-strain dataset (εc) contained 203 samples. The two datasets cover a range of conditions, including plain matrix mixtures, single-fiber mixtures, and hybrid steel–PVA fiber mixtures. This broad parameter coverage provides a sound basis for the subsequent modeling of nonlinear multi-factor relationships.
Table 1 summarizes the data sources and sample-count distribution of the core references for σc, so as to illustrate the source composition and sample coverage of the database. The εc samples were mainly collected from 12 studies, including Refs. [1,3,12,13,14,15,16,17,18,19,20,21], and are therefore not listed separately here.

2.1.2. Feature Classification, Coding, and Target Variables

The input variables included four categories: matrix mix-proportion parameters, mineral admixture and chemical admixture indicators, steel-fiber parameters, and PVA-fiber parameters. In addition to continuous variables, several binary indicator variables were introduced to distinguish the presence or absence of fly ash, silica fume, superplasticizer, and the two fiber types, thereby improving the model’s ability to represent the mixed feature space. The target variables were compressive strength, σc (MPa), and peak strain, εc (%). The definitions of all variables are given in Table 2.

2.2. Data Preprocessing, Internal-Validation Setting, and Analysis Boundaries

2.2.1. Data Cleaning, Size Normalization, and Statistical Characteristics

After extraction, the raw data collected from the literature were sequentially subjected to unit unification, outlier checking, duplicate verification, and missing-value screening. For compressive test results obtained from specimens of different shapes and sizes, predefined size-normalization rules were applied to convert them to a common reference basis. This procedure was used to reduce the influence of specimen-size differences on model training and synergy analysis. Table 3 presents the conversion coefficients used to normalize the mechanical properties of non-standard specimens to the standard size according to Eurocode 2 (BS EN 1992) [36].
In terms of statistical characteristics, the key variables in the database span a broad range (see Table 4). Most parameters fall within the ranges commonly used in engineering practice and exhibit reasonable variations in their values. This indicates that the database captures substantial differences in material performance under different mix proportions and fiber-parameter combinations. It also provides the necessary data basis for the subsequent identification of nonlinear effects and interactions.
In this study, data complexity was controlled through preprocessing procedures, including feature selection and input-variable standardization. These procedures removed redundant features and unified variable scales, thereby reducing the Kolmogorov complexity of the dataset [38,39]. This provided an efficient and stable data basis for the subsequent model training and synergy-effect analysis.

2.2.2. Dataset Splitting and Validation Strategy

To ensure that the modeling procedure was reproducible and that the results were reliable, the compiled dataset was randomly split into training and test sets at a ratio of 8:2. A Kolmogorov–Smirnov (KS) test was then performed to check whether the two sets remained consistent in terms of the target-variable distribution. This helped confirm that the data split was reasonable. Figure 1 compares the CDFs of σc for the training and test sets and reports the corresponding KS test results.
Figure 1 shows only a small difference in the σc distribution between the training and test sets, which satisfies the requirement of distributional balance for the subsequent internal validation and interpretation analysis. The εc dataset was split using the same strategy as that used for σc. Its KS statistic and distribution-consistency indicators were at a comparable level, indicating a high degree of distributional agreement between the training and test sets for both target variables. Therefore, the distribution-validation figure for εc is not presented separately.
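The split-and-check step described above can be illustrated with a minimal sketch. It assumes a plain random 8:2 shuffle and computes the two-sample KS statistic directly from the empirical CDFs. The synthetic strength values, the seeds, and the function names `split_80_20` and `ks_statistic` are illustrative assumptions, not the study's data or code; in practice a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare the CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def split_80_20(values, seed=0):
    """Shuffle a copy of the data and cut it at 80 % (hypothetical helper)."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for the 397 sigma_c values (NOT the study's data).
rng = random.Random(42)
sigma_c = [rng.gauss(50.0, 12.0) for _ in range(397)]

train, test = split_80_20(sigma_c)
d = ks_statistic(train, test)

# Large-sample critical value at the 5 % level: 1.358 * sqrt((n + m) / (n * m)).
n, m = len(train), len(test)
d_crit = 1.358 * ((n + m) / (n * m)) ** 0.5
print(f"n_train={n}, n_test={m}, D={d:.4f}, D_crit(0.05)={d_crit:.4f}")
```

A statistic below the critical value means the test gives no evidence that the training and test targets come from different distributions; it does not prove that they are identical.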

2.2.3. Model Training and Synergy-Gain Calculation Framework

For the σc task, multiple regression models were compared, and the model with the best overall performance was selected as the base model for the subsequent interpretation analysis. For the εc task, transfer learning and hyperparameter optimization were introduced to improve modeling stability because of the limited sample size. The optimal models were then further analyzed through global SHAP importance and SHAP dependence plots.
To quantify the synergy boundary of the two fiber volume fractions, this study calculated the synergy gain Δ(s, p) on a predefined 17 × 17 discrete grid. Here, s denotes the steel-fiber volume fraction, and p denotes the PVA-fiber volume fraction. The boundary was identified using a fixed procedure. First, the Δ(s, p) = 0 contour was used as the critical boundary of synergy. All grid points with Δ(s, p) > 0 were defined as the positive-synergy window. Its coverage was calculated as the proportion of positive-synergy points among all grid points. Then, a convex hull, denoted as H, was constructed from the measured sample coordinates of (s, p). The intersection between the positive-synergy window and H was taken as the data-supported candidate mix-proportion range. If the shortest distance from a candidate point to the boundary of H was smaller than the predefined grid step of 0.125%, this point was classified as a boundary-risk point rather than a robust recommended region. By combining standardized algorithmic steps with the measured data distribution, this procedure determined the data-supported robust boundary for the steel–PVA fiber mix proportions.
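The boundary-identification procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the hull is built with Andrew's monotone-chain algorithm, the sample coordinates are a toy square rather than the measured (s, p) data, and the label names (including "unsupported" for positive-synergy points outside H) are hypothetical.

```python
import math


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull, q):
    """True if q lies inside or on the counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


def dist_to_segment(q, a, b):
    """Shortest distance from point q to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else ((q[0] - a[0]) * dx + (q[1] - a[1]) * dy) / len2
    t = max(0.0, min(1.0, t))
    return math.hypot(q[0] - (a[0] + t * dx), q[1] - (a[1] + t * dy))


def classify(q, delta, hull, step=0.125):
    """Label one grid point from its synergy gain and its position in H."""
    if delta <= 0:
        return "no-synergy"
    if not inside(hull, q):
        return "unsupported"  # positive gain, but outside the data hull
    n = len(hull)
    d = min(dist_to_segment(q, hull[i], hull[(i + 1) % n]) for i in range(n))
    return "boundary-risk" if d < step else "robust"


def coverage(delta_grid):
    """Share of grid points with positive synergy gain."""
    return sum(1 for d in delta_grid.values() if d > 0) / len(delta_grid)


# Toy sample coordinates (percent volume fractions), NOT the measured data.
samples = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
hull = convex_hull(samples)
print(classify((1.0, 1.0), 0.5, hull))    # robust
print(classify((1.875, 2.0), 4.8, hull))  # boundary-risk
```

On a 17 × 17 grid there are 289 points, so a coverage of a few percent corresponds to only a handful of grid points, which is why the distance-to-boundary check matters for small windows.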
To build synergy maps that can support engineering screening, this study used a Monte Carlo marginalization strategy. The analysis focused on the two core variables, namely the steel-fiber volume fraction s and the PVA-fiber volume fraction p. For a given (s, p), the remaining mix-proportion and material parameters were randomly sampled, and the model outputs were averaged to obtain the marginal mean response. This reduced the influence of non-core variables and yielded the marginal response of the two core variables. The synergy gain was defined as $\Delta(s,p) = f(s,p) - f(s,0) - f(0,p) + f(0,0)$, where f denotes the model prediction at the given fiber volume fractions, and its marginal mean was written as $\bar{\Delta}(s,p)$. When $\bar{\Delta}(s,p) > 0$, positive synergy is considered to exist. (The stability of the model basis used to identify the synergy-gain window was further examined using 10 repeated random splits. The coefficient of variation of R² was only 1.01%; see Section 3.2.3 and Appendix A.1.)
To quantify this synergy gain, the steel-fiber volume fraction s and the PVA-fiber volume fraction p were fixed first. For each fixed (s, p) combination, B samples were independently drawn from the empirical distributions of the remaining mix-proportion and material parameters. The trained machine learning model was then used to predict the target performance for each sampled case. The arithmetic mean of these predictions was taken as the model output for the corresponding (s, p) combination.
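The marginalization and synergy-gain calculation can be sketched as follows. This is a minimal illustration, not the study's implementation: `toy_model` is a hypothetical stand-in for the trained regressor, the background rows are synthetic, and two choices are assumptions of the sketch, namely resampling whole background rows and reusing the same B draws for all four terms of the gain.

```python
import random


def mean_synergy_gain(model, s, p, background, B=100, seed=0):
    """Marginal mean of f(s,p) - f(s,0) - f(0,p) + f(0,0) over B draws.

    `model(s, p, z)` predicts the target for fiber fractions (s, p) and a row z
    of the remaining inputs. Rows are resampled from `background`, and the same
    draws are reused for all four terms (assumptions of this sketch).
    """
    rng = random.Random(seed)
    draws = [rng.choice(background) for _ in range(B)]

    def f_bar(ss, pp):
        return sum(model(ss, pp, z) for z in draws) / len(draws)

    return f_bar(s, p) - f_bar(s, 0.0) - f_bar(0.0, p) + f_bar(0.0, 0.0)


def toy_model(s, p, z):
    """Hypothetical surrogate: additive terms plus a 2*s*p interaction."""
    return 40.0 - 30.0 * z["wb"] + 3.0 * s + 1.0 * p + 2.0 * s * p


def additive_model(s, p, z):
    """Hypothetical surrogate with no interaction, so the gain should vanish."""
    return 40.0 - 30.0 * z["wb"] + 3.0 * s + 1.0 * p


# Synthetic background rows for the non-core inputs (NOT the study's data).
rng = random.Random(1)
background = [{"wb": rng.uniform(0.25, 0.55)} for _ in range(200)]

# 17 x 17 grid from 0 % to 2 % in steps of 0.125 %.
levels = [i * 0.125 for i in range(17)]
surface = {
    (s, p): mean_synergy_gain(toy_model, s, p, background)
    for s in levels
    for p in levels
}
peak = max(surface, key=surface.get)
print(peak, round(surface[peak], 4))
```

For the toy model the additive terms cancel and only the interaction 2·s·p survives, which is the property the four-term difference is designed to isolate.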
In the dual-objective screening, the compressive strength and peak strain of a specimen are denoted as σc(s,p) and εc(s,p), respectively. The optimization problem is therefore formulated as maximizing the objective vector [σc(s,p), εc(s,p)]. Under this maximization setting, candidate i is regarded as Pareto-optimal if no other mix-proportion combination j satisfies both σc(sj,pj) ≥ σc(si,pi) and εc(sj,pj) ≥ εc(si,pi) with at least one of the two inequalities being strict. To avoid relying on a single extreme solution, this study further identifies strength-oriented, ductility-oriented, and compromise candidates along the Pareto front. Their relative stability is then examined through test-set re-evaluation.
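A minimal sketch of the Pareto screening is given below, using the usual dominance rule for maximization (no worse in both objectives and strictly better in at least one). The candidate values are made up for illustration, and the rule used to pick the compromise candidate (closest to the ideal point after min-max scaling) is an assumption of this sketch, since the text does not specify how the compromise is chosen.

```python
def dominates(a, b):
    """True if a = (sigma, eps) dominates b when both are maximized."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(candidates):
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


def select_candidates(front):
    """Pick strength-oriented, ductility-oriented, and compromise points."""
    strength = max(front, key=lambda c: c[0])
    ductility = max(front, key=lambda c: c[1])
    lo_s, hi_s = min(c[0] for c in front), max(c[0] for c in front)
    lo_e, hi_e = min(c[1] for c in front), max(c[1] for c in front)

    def gap_to_ideal(c):
        # Min-max scale each objective on the front, then measure the
        # Euclidean distance to the ideal point (1, 1). Assumed rule.
        ns = (c[0] - lo_s) / (hi_s - lo_s) if hi_s > lo_s else 1.0
        ne = (c[1] - lo_e) / (hi_e - lo_e) if hi_e > lo_e else 1.0
        return ((1.0 - ns) ** 2 + (1.0 - ne) ** 2) ** 0.5

    compromise = min(front, key=gap_to_ideal)
    return {"strength": strength, "ductility": ductility, "compromise": compromise}


# Hypothetical (sigma_c in MPa, eps_c in %) predictions for five mixes.
candidates = [(60.0, 0.20), (55.0, 0.30), (50.0, 0.35), (52.0, 0.25), (60.0, 0.18)]
front = pareto_front(candidates)
picks = select_candidates(front)
print(front)
print(picks)
```

Here (52.0, 0.25) is dominated by (55.0, 0.30), and (60.0, 0.18) by (60.0, 0.20), so three candidates remain on the front.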
Because statistical interpretation cannot be directly equated with material mechanisms, the SHAP results were further checked using three engineering-consistent criteria, with empirical support from Appendix A.5 and Appendix A.8. First, the influence directions of the key variables should be consistent with the basic mechanical behavior of fiber-reinforced concrete. Second, the SHAP dependence regions should be supported by sufficient sample density and controlled discrete-level distributions. Third, in the two σc and εc tasks, the dominant roles of steel and PVA fibers should match the engineering mechanisms of load-bearing capacity and deformation capacity, respectively.

2.2.4. Five-Stage Framework and Technical Roadmap for the Quantitative Identification of Fiber Synergy

This study adopted a data-driven five-stage analytical framework to quantitatively identify fiber synergy. The overall workflow is shown in Figure 2.
The framework begins by integrating and cleaning the literature-based data (397 samples for σc and 203 samples for εc) to build a high-quality sample database. It then combines CatBoost modeling with SHAP-based interpretation to achieve accurate and interpretable prediction. Finally, by linking Monte Carlo marginalization with synergy-window overlay, it makes it possible to identify and visualize fiber synergy in a quantitative way. In this way, the framework offers an interpretable route for optimizing the mix design of fiber-reinforced concrete.

3. Results and Analysis

3.1. Model Performance Comparison and Base-Model Selection

3.1.1. Comparison of σc Models and Selection of the Base Model

Table 5 shows that, for the σc task, tree-based models performed markedly better than the linear model overall. This suggests that the relationship between compressive strength and the multidimensional input variables in hybrid steel–PVA fiber-reinforced concrete is strongly nonlinear. In the internal-validation results, LightGBM achieved the highest test-set R² (0.9783), whereas CatBoost obtained the lowest MAE (2.7409 MPa). CatBoost also showed better robustness under limited-sample conditions and was more suitable for handling categorical features and supporting the subsequent interpretation analysis. (See Table A2 in Appendix A.1 for details.)
The goal of the subsequent analysis is not simply to identify the model with the highest test score, but to support SHAP interpretation, single-variable main-effect analysis, and two-dimensional synergy-gain calculation. Moreover, CatBoost is naturally suited to handling categorical features, which makes it more compatible with the mixed feature structure of the dataset used in this study. CatBoost was therefore chosen as the base model for the σc analysis. This choice preserves strong predictive performance while avoiding over-reliance on a single evaluation metric in model selection. To test its robustness, the dataset was subjected to 10 repeated random train–test splits. The results show that the performance variation of CatBoost remained within an acceptable range. Detailed metrics and variation analysis are given in Appendix A.1.
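The repeated-split robustness check can be sketched as below. This is only an illustration of the bookkeeping: a least-squares line stands in for CatBoost, the data are synthetic, and the coefficient of variation is computed with the sample standard deviation, which is an assumption because the text does not state which estimator was used.

```python
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the real regressor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def repeated_split_r2(xs, ys, n_repeats=10, test_frac=0.2):
    """Test-set R2 for n_repeats independent random train-test splits."""
    scores = []
    for seed in range(n_repeats):
        idx = list(range(len(xs)))
        random.Random(seed).shuffle(idx)
        cut = int(round((1.0 - test_frac) * len(idx)))
        train, test = idx[:cut], idx[cut:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        scores.append(r2_score([ys[i] for i in test], preds))
    return scores


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Synthetic data (NOT the study's database): noisy linear relation.
rng = random.Random(7)
xs = [rng.uniform(0.0, 2.0) for _ in range(397)]
ys = [40.0 + 5.0 * x + rng.gauss(0.0, 1.0) for x in xs]

scores = repeated_split_r2(xs, ys)
print(f"mean R2 = {statistics.mean(scores):.4f}, "
      f"CV = {100 * coefficient_of_variation(scores):.2f} %")
```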

3.1.2. εc Model Development and Small-Sample Modeling Strategy

Compared with σc, the εc dataset is smaller and is therefore more sensitive to parameter selection and variations in data splitting. Based on this characteristic, three modeling strategies were compared in this study: baseline CatBoost, a transfer-learning model, and Bayesian-optimized CatBoost (see Table 6). The results show that all three methods achieved good predictive performance. Among them, Bayesian-optimized CatBoost performed best overall, while the transfer-learning model reached a comparable level of accuracy. The parameter settings of the transfer-learning model (Table A3) and the hyperparameter-optimization results (Table A4) are presented in Appendix A.2 and Appendix A.3, respectively.
This result suggests that cross-task transfer can be practically useful when the sample sizes of different performance indicators are unbalanced. Even so, the later analysis of εc, including its interpretation and synergy-map construction, is still based mainly on the Bayesian-optimized CatBoost model. The transfer-learning results are treated as supportive evidence, rather than as the sole basis for the core conclusions.

3.2. Analysis of the σc Model Results and Synergy Mechanisms

3.2.1. Feature-Importance Ranking and Strength-Control Variables for the σc Model

Figure 3 presents the global feature importance ranking for the σc model. The results show that compressive strength is primarily driven by matrix mix proportions and fiber volume fractions. Among all variables, W/B and V_STF contribute the most, indicating that matrix densification and steel fiber content are the key factors governing compressive performance. The importance rankings are broadly consistent between the training and test sets, suggesting that the main findings are relatively stable across the dataset. Detailed information on feature fluctuations and validation results is provided in Appendix A.4.
Figure 4 further focuses on fiber-related variables. In addition to volume fraction, steel fiber properties including tensile strength and length, and PVA-related geometric parameters also rank relatively high in importance, although they generally appear after the dominant matrix-related factors and fiber volume fractions. This result suggests that the effect of hybrid steel–PVA fibers on compressive strength is governed first by fiber addition and its volume fraction, whereas geometric and mechanical parameters mainly exert a secondary regulating effect within specific ranges (see Appendix A.5).

3.2.2. SHAP Dependence Plots and Single-Fiber Main-Effect Curves

Figure 5 further reveals the nonlinear influence patterns of key variables on σc. Generally, increasing the steel fiber volume fraction leads to greater positive contributions to σc, though this enhancement exhibits diminishing marginal returns at higher volume fractions. Meanwhile, PVA fiber-related variables exhibit more erratic contributions to σc at low volume fractions, with their effects gradually stabilizing after the intermediate volume-fraction range. This indicates that fiber reinforcement does not operate at a constant efficiency; instead, its effectiveness is collectively governed by the uniformity of fiber dispersion, the quality of fiber–matrix interfacial bonding, and the compatibility between the fiber and matrix materials.
From the perspective of physical mechanisms, this trend is consistent with the role of steel fibers in the later stage of compression. They help restrain lateral deformation, bridge macrocracks, and limit crack propagation. The diminishing marginal gain at higher fiber contents may be attributed to reduced workability, fiber agglomeration, and more defects in the interfacial transition zone, which can offset the reinforcing effect. Therefore, the V_STF- and V_PVA-related SHAP patterns are interpreted in this study as statistical evidence consistent with the combined action of crack bridging and microcrack restraint. They should not be regarded as standalone causal proof. The empirical support for these SHAP-dependence regions is summarized in Table A7.
The single-fiber main-effect curves in Figure 6 show an overall trend consistent with the SHAP dependence plots. Steel-fiber addition alone is more likely to improve compressive strength, whereas the gain in σc from PVA-fiber addition alone is relatively limited. When the two fiber types coexist, some volume-fraction combinations show a more pronounced enhancement trend than single-fiber addition. However, this enhancement does not hold across the entire volume-fraction domain. In this study, B = 100 was adopted for the visualization and quantitative analysis of the synergy-gain surface, and its convergence analysis is provided in Appendix A.6.

3.2.3. Mean Synergy-Gain Surface, Synergy Boundary, and Data-Support-Domain Constraint for σc

To provide a direct view of the synergistic interaction between the steel-fiber and PVA-fiber volume fractions, a bivariate partial dependence plot (Figure 7) was constructed based on the standard analytical framework of partial dependence plots [40]. Building on this, the boundary and coverage domain of the synergy-gain window were delineated.
Figure 7 provides a clear view of the synergistic response of compressive strength to the interaction between the steel-fiber and PVA-fiber volume fractions. The strongest response appears when the steel-fiber volume fraction is within 1.5–2.0% and the PVA-fiber volume fraction is within 1.8–2.0%. In this region, the predicted strength gradient is continuous and relatively stable. This result supports the robustness of the identified synergy-gain window and the rationality of the feature range covered by the dataset.
Based on the synergy-gain calculation framework and the Monte Carlo marginalization strategy established in Section 2.2.3, this section provides a quantitative analysis of the synergy boundary for compressive strength. Figure 8 shows the mean synergy-gain surface of σc obtained from a 17 × 17 two-dimensional grid and Monte Carlo marginalization. The results indicate that the positive synergy between steel fibers and PVA fibers in compressive strength is clearly localized, rather than being universally present across the entire steel–PVA volume-fraction plane. The positive-synergy region is mainly distributed near combinations with high steel-fiber and high PVA-fiber contents. This suggests that, at relatively high volume fractions, the two fiber types may enhance compressive performance through the combined effects of macro-scale bridging and microcrack restraint.
Figure 9 shows that the maximum mean synergy gain for σc over the whole domain is 4.794912 MPa, corresponding to (Steel, PVA) = (1.875%, 2.000%). However, the positive-synergy region occupies only about 1.7% of the domain. This result suggests that hybridization does not necessarily lead to a strength benefit. Its super-additive effect appears only within a limited range of fiber combinations.
Combining the optimal-gain results in Table 7 with the sample distribution in Figure 9 shows that the maximum mean synergy gain of σc lies near the boundary of the sample-supported domain. (Its repeatability was assessed using 10 repeated random splits, as reported in Appendix A.1.) This points to the research value of high-steel-fiber and high-PVA-fiber combinations, but it also means that the finding should be interpreted carefully in real engineering applications, with close attention to workability, fiber dispersion, and the need for further experimental validation.
From an engineering perspective, the high-steel-fiber/high-PVA-fiber region is better treated as a potential testing zone for high-strength mixes, rather than a fixed point ready for direct practical recommendation. Priority should be given to combinations that fall within the positive-synergy region, remain reasonably far from the convex-hull boundary, and reside within the data-support domain.
To validate the reliability and robustness of this synergy-gain surface, we conducted grid resolution tests and robustness checks, which confirmed that fluctuations were kept within 2% and the core morphological features showed statistical consistency. The full validation process is detailed in Appendix A.7.

3.3. Analysis of the εc Model Results and Synergy Mechanisms

3.3.1. Global Feature Importance and Ranking of Fiber-Related Variables

As shown in Figure 10 and Figure 11, compared with the σc task, the importance ranking of the εc model exhibits a more differentiated pattern. In addition to some matrix mix-proportion parameters, PVA-related variables become markedly more important in the εc task, with S/B, V_PVA, D_PVA, and f_PVA usually ranking among the more influential features. This indicates that peak strain is more sensitive to the microcrack-control capacity of PVA fibers, rather than being governed solely by the macro-scale bridging effect of steel fibers.
The importance distributions in the training and test sets are broadly consistent, which supports the interpretation analysis of εc within the current data-support domain. However, because the εc sample size is relatively small, interpretation of the ranking should focus on the relative positions of the main controlling factors, rather than on subtle differences between adjacent variables (see Appendix A.8).

3.3.2. SHAP Dependence Plots, Discrete-Level Support, and the εc Synergy-Gain Map

This section adopts the same definition of the synergy-gain window, quantification formula, and stability-assessment framework as those used for the compressive-strength analysis in Section 3.2.3. The detailed logic and supporting data are provided in Section 3.2.3 and Appendix A.1.
The SHAP dependence plots of the key variables for εc (Figure 12) show that the volume fraction of PVA fibers and related parameters make a more pronounced positive contribution to peak strain, whereas steel fibers mainly play a supporting bridging role during the later stages of crack development. This result is consistent with the underlying material mechanism. PVA fibers are more effective in suppressing the initiation and propagation of microcracks, and are therefore more critical to deformation capacity near the peak point. By contrast, steel fibers improve bridging capacity at the macrocrack stage and thus provide a complementary contribution to ductility enhancement. In addition, PVA fibers may also restrain microcracks during curing shrinkage. When PVA fibers deform compatibly with the early-age matrix, they can share local shrinkage-induced tensile stress and bridge initial microcracks. This may reduce the risk of shrinkage-crack formation and propagation, thereby indirectly improving the peak strain of the material.
This explanation is also consistent with the scale-dependent roles of hybrid fibers. PVA fibers have smaller diameters and a higher number density. They are therefore more effective in controlling microcracks and improving deformation compatibility near the peak point. Steel fibers have higher stiffness and strength, making them more suitable for bridging and restraining larger cracks. Thus, the increased importance of PVA-related variables in the εc model is not merely a statistical pattern. It is also consistent with the physical origin of peak strain.
It should be noted that some variables in the εc dataset have relatively few discrete levels and high shares of dominant levels (see Table 8). This may cause the SHAP dependence plots to exhibit step-like or locally fluctuating patterns in certain intervals. Therefore, when interpreting the key variables, this study considers the number of discrete levels, the shares of dominant levels, and the distribution of tail samples together, so as to enhance interpretive transparency. (See Table A10 for empirical support details.)
Based on the above single-variable SHAP dependence analysis, the mean synergy-gain map constructed for εc shows that its positive-synergy region is clearly wider than that of σc. The peak is mainly located near combinations with low-to-moderate steel-fiber content and moderate-to-high PVA-fiber content. Quantitative results show that the global maximum mean synergy gain of εc is 0.0141629, located at (Steel, PVA) ≈ (0.38%, 1.62%), and that the positive-synergy region covers about 18% of the whole domain.
Combined with the window distribution, this suggests that ductility synergy is more likely to arise from the coordinated action of a moderate amount of steel fibers and a relatively high amount of PVA fibers at different stages of crack development. Compared with the σc window, the εc window is more suitable as a basis for ductility-oriented design (see Figure 13 and Table 9).

3.4. Implications of Dual-Objective Synergy: Trade-Offs Between Strength and Ductility and Mix-Proportion Boundaries

As shown in Figure 14, a comparison of the synergy windows of σc and εc on the steel–PVA volume-fraction plane indicates that their positive-synergy regions do not fully overlap. This means that engineering design does not have a single optimal point that simultaneously satisfies all objectives. A more reasonable approach is to screen candidate regions according to performance constraints and the level of data support.
From an engineering perspective, if the primary objective is to improve load-bearing capacity, the candidate mixes are more likely to fall in the high-fiber-content region. However, greater attention should also be paid to reduced workability, difficulties in fiber dispersion, and the risk of boundary extrapolation. If ductility is the main concern, screening can instead focus on combinations with low-to-moderate steel-fiber content and moderate-to-high PVA-fiber content. The value of the overlay map of dual-objective synergy windows lies not in replacing experiments, but in providing a data-supported quantitative basis for preliminary mix screening and for setting priorities in experimental validation.
Based on the above differences, preliminary engineering mix screening can be carried out in three steps. First, determine whether the primary objective is load-bearing capacity, ductility, or a balance between the two. Second, screen candidate ranges within the corresponding positive-synergy window. Third, exclude combinations located near the boundary of the data-support domain, and subject the remaining combinations to experimental verification.
To further show the trade-off between the two objectives, Figure 15 presents the training-set Pareto front obtained from the candidate mix-proportion points, together with the corresponding test-set re-evaluation results. As the solution moves from left to right along the Pareto front, the increase in σc is accompanied by a decrease in εc. This indicates a clear trade-off between strength and peak strain. The Max εc candidate, Max σc candidate, and balanced candidate represent three types of candidate schemes: ductility-oriented, strength-oriented, and dual-objective compromise schemes, respectively. The test-set re-evaluation curve is generally lower than the training-set Pareto front. This suggests that the Pareto results are more suitable for ranking candidate mix proportions and for setting experimental priorities. They should not be directly treated as final engineering recommendations.
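The non-dominated filtering behind such a Pareto front can be illustrated with a minimal sketch. It assumes that both σc and εc are to be maximized and that each candidate is a pair of predicted values; the function name `pareto_front` and the candidate values are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the non-dominated (sigma_c, eps_c) pairs when both are maximized."""
    # Visit candidates from the highest sigma_c downwards (ties: highest eps_c first).
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front = []
    best_eps = float("-inf")
    for sigma, eps in ordered:
        # A candidate survives only if its eps_c exceeds that of every
        # candidate with an equal or higher sigma_c.
        if eps > best_eps:
            front.append((sigma, eps))
            best_eps = eps
    return sorted(front)  # ascending sigma_c, hence descending eps_c

# Hypothetical candidate predictions (sigma_c, eps_c); illustrative only.
candidates = [(60.0, 0.30), (70.0, 0.25), (65.0, 0.20), (80.0, 0.18), (75.0, 0.26)]
front = pareto_front(candidates)
```

Read from left to right, the returned front reproduces the pattern described above: each step up in σc is paid for by a lower εc.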

4. Discussion

4.1. Model Performance and Analytical Positioning

As shown in Section 3, the random cross-validation results based on the available data indicate that both the σc and εc tasks achieved high fitting and predictive accuracy. This suggests that the data-driven models established from mix-proportion parameters, fiber geometric parameters, and fiber mechanical parameters can effectively capture the main nonlinear relationships in the compressive behavior of hybrid steel–PVA fiber-reinforced concrete. More importantly, the core value of this study lies not only in its high predictive accuracy, but also in transforming the prediction results into interpretable, screenable, and practically useful information that can directly support engineering applications, mix-proportion optimization, and experimental design.
Building on this analytical framework, the following subsections examine how steel–PVA hybrid fibers modulate the compressive behavior of concrete. To make full use of the limited data, all 397 σc samples (training and test sets combined) were used for the final visualization analysis. The stability assessment of the synergy-gain window is provided in Appendix A.1.

4.2. Interpretability of Key Variables and the Differentiated Roles of Steel and PVA Fibers

SHAP analysis shows that although σc and εc are both indicators of compressive behavior, they are governed by different dominant factors. In the σc task, matrix mix-proportion factors and steel fiber volume fraction are more important. In the εc task, S/B and several PVA-related variables carry greater influence. This difference indicates that the load-bearing capacity and deformation capacity of hybrid fiber-reinforced concrete are not controlled by the same set of variables in the same manner. The former depends more on the load-bearing skeleton of the matrix and the bridging capacity across macrocracks. The latter is more sensitive to microcrack control, fiber–matrix interfacial interaction, and the effect of matrix volumetric proportioning on deformation compatibility.
From the perspective of material mechanisms, the higher elastic modulus and tensile strength of steel fibers make them more effective in post-cracking bridging and in delaying the propagation of macrocracks. This is why they exert a more direct strengthening effect in the σc model. By contrast, PVA fibers are more advantageous in suppressing microcrack initiation, improving the continuity of crack propagation, and enhancing deformation accommodation near the peak point. This is broadly consistent with the findings of previous experimental studies [1,2,3,4]. In other words, steel fibers and PVA fibers do not merely offer redundant reinforcement. Instead, they participate in the compressive failure process at different scales. This distinction forms the basic physical basis of hybridization, rather than simple superposition. Furthermore, the role of PVA fibers should not be understood only as crack bridging during loading. It also includes early restraint of shrinkage-induced microcracks during curing. Such control of initial defects may provide an important basis for the later increase in peak strain and the delayed propagation of cracks.
Furthermore, both the SHAP dependence plots and the single-fiber main-effect curves indicate that fiber effects are strongly nonlinear. As the volume fraction increases, the strengthening effect does not continue to grow at a constant rate. Instead, it often shows diminishing marginal returns, plateauing, or even local fluctuations. This suggests that, in a multi-source literature-based dataset, the potential performance gains associated with higher fiber content may be simultaneously limited by factors such as fiber dispersion, workability, interfacial bonding, and matrix compatibility. Therefore, fiber optimization cannot be achieved simply by increasing fiber dosage. More importantly, it calls for pinpointing the parameter ranges in which the positive effects of fibers can be stably observed under the support of the available data.
Therefore, the physical validity of the interpretable results was assessed using a cautious “mechanistic consistency + data support” strategy. A SHAP trend was used as a basis for engineering interpretation only when three conditions were met: it was consistent with material mechanisms, it was supported by sufficient sample density or discrete-level coverage, and it showed a similar direction in the single-fiber main-effect curves and synergy-gain maps. Local interactions with sparse samples or those close to the data boundary were treated as optimization directions that require further engineering validation.

4.3. Engineering Implications of Synergy-Gain Windows and Dual-Objective Trade-Offs

From the synergy-gain heatmaps generated via Monte Carlo marginalization and two-dimensional grid evaluation, we observe that the synergistic enhancement between steel fibers and PVA fibers does not hold across the entire volume-fraction domain, but instead shows clear regional characteristics. This insight carries important implications for engineering practice: hybrid fiber mixtures do not inherently outperform single-fiber systems or produce simple additive effects, and only specific fiber combinations can yield true super-additive benefits. Therefore, rather than adhering to the empirical assumption that combining steel and PVA fibers will necessarily improve performance, this study advocates for a window-based and condition-dependent design strategy.
For σc, the positive-synergy region is concentrated near combinations with high steel-fiber and high PVA-fiber contents, and its area share is very small, indicating that strength synergy is strongly localized. This means that if the primary engineering objective is to achieve higher load-bearing capacity, the formulations under consideration are likely to fall in the high-volume-fraction region. However, such regions are also often closer to the data boundary and more likely to be accompanied by reduced constructability, difficulties in fiber dispersion, and increased construction risk. Therefore, the interpretation of the strength-synergy window must consider both potential benefits and application risks, instead of fixating solely on the peak value. In particular, the high-dosage synergy region close to the data boundary should be regarded as a high-potential direction for further experiments at this stage. Targeted validation tests are still needed before engineering application.
In comparison, the positive-synergy region for εc is wider, and its peak is located near combinations with low-to-moderate steel-fiber content and moderate-to-high PVA-fiber content. This indicates that ductility synergy does not rely on an extremely high steel-fiber volume fraction. Instead, it is more likely to arise from the coordinated action of a moderate amount of steel fibers and a substantial volume of PVA fibers at different stages of crack development: the former provides the necessary macro-scale bridging capacity, whereas the latter improves microcrack control and deformation compatibility. This finding suggests that, in scenarios where ductility, energy dissipation, or peak-strain enhancement is the primary objective, a moderate rather than extreme steel-fiber content is more likely to deliver stable benefits.
More importantly, the synergy windows of σc and εc do not fully overlap, which means that engineering design must inevitably address a trade-off between strength and ductility. A more reasonable strategy is to first define the minimum requirements for load-bearing capacity and deformation capacity according to the structural objective. Candidate points should then be prioritized within the dual-objective synergy region, while also remaining inside the data-support domain and relatively far from the convex-hull boundary. Combinations located near the boundary should be confirmed through additional experiments. In this way, the role of the synergy-window map is not to replace experiments, but to help guide them in a more targeted and efficient manner.
Therefore, this study recommends window-guided selection rather than point-based selection. Under the screening logic that combines the Pareto front with the synergy window, candidate combinations that lie well within the synergy region and remain distant from the convex-hull boundary should be prioritized as starting points for laboratory trial mixing in engineering applications. By contrast, a combination that is favorable only in terms of its peak value but lies close to the boundary should be treated as a validation target, not as a directly recommended engineering mix proportion, until further testing has been carried out.
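The boundary-distance part of this screening rule can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the support points, the margin threshold `min_margin`, and all function names are hypothetical, and the convex hull is built with Andrew's monotone-chain algorithm on the (Steel, PVA) volume-fraction plane.

```python
from math import hypot

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def boundary_margin(p, hull):
    """Distance from p to the hull boundary; negative if p lies outside the hull."""
    edges = list(zip(hull, hull[1:] + hull[:1]))
    inside = all(cross(a, b, p) >= 0 for a, b in edges)
    dist = min(point_segment_distance(p, a, b) for a, b in edges)
    return dist if inside else -dist

def screen(candidates, support_points, min_margin):
    """Keep candidates lying at least min_margin inside the data-support hull."""
    hull = convex_hull(support_points)
    return [c for c in candidates if boundary_margin(c, hull) >= min_margin]

# Hypothetical support points (Steel %, PVA %) and an assumed margin of 0.25.
support = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
candidates = [(1.0, 1.0), (1.875, 2.0), (0.38, 1.62)]
kept = screen(candidates, support, min_margin=0.25)
```

With these assumed inputs, the point on the hull boundary is dropped and becomes a validation target, while the two interior points are kept as trial-mix starting points. In practice the margin would need to be chosen relative to the sampling density of the database.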

4.4. Scope of Applicability, Limitations, and Future Work

Although this study established a relatively complete integrated workflow for prediction, interpretation, and synergy identification, its scope of applicability still needs to be clearly defined. First, the data were compiled from multiple published studies and academic theses. Although data standardization and specimen-size normalization were performed, differences in metadata may still exist across studies, including raw-material sources, curing conditions, loading rates, specimen preparation procedures, and testing equipment. These factors were not fully structured and incorporated into the model. Therefore, the patterns learned by the model should be understood, to some extent, as empirical regularities averaged over a multi-source database, rather than as an exact reproduction of a single experimental system.
Second, to ensure the reliability of the conclusions, all findings in this study are restricted to the current data-support range; the scope of applicability will be expanded through targeted experiments. Convex-hull coverage was used to identify high-confidence regions within the supported data domain. The conclusions on synergy effects in these regions can directly inform engineering design, and fiber-mix schemes located in them can be directly applied in the production of concrete members, which may help reduce trial-mix costs to some extent. By contrast, conclusions for samples near the convex-hull boundary should be regarded only as a basis for preliminary design and still require further experimental validation.
To further extend the current application boundary of this study, future work will proceed in three directions. First, additional experiments will be carried out in boundary regions and sparse mix-proportion intervals where data coverage is insufficient. The focus will be on verifying the mechanical stability of systems with high steel-fiber and high PVA-fiber contents, so as to provide more refined mix guidance for the production of concrete members. Second, mix-validation experiments using different batches of raw materials will be conducted to clarify the extent to which raw-material variability affects mix performance, thereby providing a quantitative basis for material substitution in engineering practice. Third, long-term performance data under different curing regimes (such as the evolution of shrinkage-induced stress and cracking) will be added to establish performance-prediction models that better reflect field conditions and further enhance the engineering applicability of the present study. In addition, the pre-peak strain energy Uc is an important indicator for evaluating the seismic energy-dissipation potential of fiber-reinforced concrete. Previous studies [41] have shown that Uc can be approximately estimated using empirical relationships involving σc and εc. Therefore, extending the prediction–interpretation–synergy-identification framework developed in this study to a three-objective analysis framework for σc, εc, and Uc would further improve the engineering applicability of the present findings.
From the perspective of engineering implementation, the most effective path for future improvement is not to develop more complex prediction models. Instead, it is to improve the transferability and practical usefulness of the conclusions in concrete-member production by supplementing validation experiments for boundary mix proportions and by introducing multidimensional constraints such as workability, durability, and cost.

5. Conclusions

Based on the multi-source experimental database, interpretable machine learning, and synergy-gain map analysis, this study systematically investigated the compressive strength and peak strain of hybrid steel–PVA fiber-reinforced concrete. The main conclusions are as follows:
(1) This study developed an interpretable machine learning framework to analyze the compressive strength (σc) and peak strain (εc) of hybrid steel–PVA fiber-reinforced concrete. The framework integrates performance prediction, mechanistic interpretation, and synergy-window identification into a unified analytical workflow, thereby providing a data-driven, interpretable technical approach for optimizing the mix design of such fiber-reinforced concretes.
(2) Using a random train–test split on our multi-source dataset, tree-based models consistently outperformed linear models. For the compressive-strength prediction task, LightGBM achieved the highest R-squared at 0.9783, while CatBoost delivered the lowest mean absolute error of 2.7409 MPa. After comprehensively evaluating error control, prediction stability, and post-hoc interpretability, we selected CatBoost as the foundational model for subsequent compressive-strength analysis.
(3) For the εc task, Bayesian-optimized CatBoost achieved the best test performance (R2 = 0.9659, MAE = 0.0218, RMSE = 0.0358). The transfer-learning model reached a comparable accuracy level (R2 = 0.9650), indicating that cross-task feature transfer can provide effective prior support for modeling performance indicators with limited sample sizes.
(4) SHAP analysis showed that σc is mainly governed by matrix mix-proportion factors and steel fiber volume fraction, whereas εc is more sensitive to S/B and PVA-related variables. This difference reflects the distinct fiber-action mechanisms underlying load-bearing capacity and deformation capacity.
(5) The mean synergy-gain maps derived from Monte Carlo marginalization show that the positive-synergy region for σc is strongly localized and mainly concentrated near combinations with high steel-fiber and high PVA-fiber contents, with a global maximum mean synergy gain of 4.794912 MPa. By contrast, the positive-synergy region for εc is wider and is mainly distributed in the range of low-to-moderate steel-fiber and moderate-to-high PVA-fiber combinations, with a peak value of 0.0141629. These results indicate that the effects of the two fiber types are not simply linearly additive, but show clear regionality and target dependence.
(6) The dual-objective synergy windows of σc and εc do not fully overlap. Therefore, engineering mix design is better guided by a hierarchical screening logic of performance target–synergy window–data-support domain. The core value of this study lies in providing an interpretable and visual quantitative tool for candidate-mix screening and experimental-priority setting within the current data-support range, rather than directly offering a single universally applicable mix proportion. In particular, high-potential synergy regions close to the data-support boundary should be further validated before engineering application.
(7) From a practical engineering perspective, this study is better regarded as a candidate-window map plus validation-priority tool, rather than as a single-point mix recommender. Priority should be given to combinations located within the positive-synergy region and relatively far from the boundary of the data-support domain, so as to improve the reliability of trial mixing and validation.

Author Contributions

Conceptualization, M.L.; Methodology, M.L.; Software, M.L.; Validation, J.C.; Formal analysis, J.C.; Investigation, S.Z.; Resources, S.Z.; Data curation, J.C.; Writing—original draft, M.L.; Writing—review and editing, S.Z.; Visualization, J.C.; Supervision, M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Guilin University of Technology, Nanning Branch, grant number [2024] No. 22.

Data Availability Statement

The data presented in this study are available in [Web of Science and CNKI] at [1].

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Model Stability and Synergy-Window Reproducibility

Appendix A.1.1. Performance Variation Under Repeated Random Splits

As shown in Table A1, the results of 10 repeated random splits indicate that the CatBoost model exhibits good stability. The σc model is more stable, whereas the εc model is more sensitive to random splitting. However, the overall conclusions remain consistent. The variation trajectories of the performance metrics under different random splits, together with their 95% confidence intervals, are presented in Figure A1.
Table A1. Performance stability of CatBoost models under repeated random train–test splits.
Property | Metric | Mean | SD | CV (%)
Compressive strength (σc) | R2 | 0.9572 | 0.0097 | 1.01
Compressive strength (σc) | MAE | 3.3136 | 0.2804 | 8.46
Compressive strength (σc) | RMSE | 4.8106 | 0.7167 | 14.90
Peak strain (εc) | R2 | 0.9124 | 0.0314 | 3.44
Peak strain (εc) | MAE | 0.0421 | 0.0092 | 21.73
Peak strain (εc) | RMSE | 0.0698 | 0.0162 | 23.17
Note: 1. Mean, standard deviation (SD), and coefficient of variation (CV) of R2, MAE, and RMSE across 10 repeated 8:2 random splits. 2. Abbreviations: R2, coefficient of determination; MAE, mean absolute error; RMSE, root mean square error; SD, standard deviation; CV, coefficient of variation.
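The statistics in Table A1 follow from the usual definition CV = SD / |Mean| × 100%. The sketch below shows the calculation for one metric across repeated splits and checks two of the reported CV values against the rounded Mean and SD in the table. The function name and the three sample R2 values are hypothetical, and the use of the sample standard deviation (n − 1) is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev

def stability_summary(values):
    """Mean, SD, and CV (%) of one metric across repeated random splits."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1); an assumption, not stated in the source
    return {"mean": m, "sd": sd, "cv_percent": 100.0 * sd / abs(m)}

# Hypothetical R2 values from three splits (not data from the study).
summary = stability_summary([0.95, 0.96, 0.97])

# Consistency check using the rounded Mean and SD reported in Table A1.
cv_r2_sigma = round(100.0 * 0.0097 / 0.9572, 2)   # sigma_c, R2 row
cv_mae_sigma = round(100.0 * 0.2804 / 3.3136, 2)  # sigma_c, MAE row
```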
Table A2. Performance stability comparison between CatBoost and LightGBM models for compressive-strength prediction.
Model | CV of R2 (%)
LightGBM | 1.03
CatBoost | 1.01
Note: Both models were trained and validated using the same dataset and the same splitting strategy.
Figure A1. Performance variation and confidence intervals of the peak-strain and compressive-strength models under 10 repeated random splits.

Appendix A.1.2. Logic for Identifying the Synergy-Gain Window

Based on the model stability shown in Figure A1, the core logic for identifying the synergy-gain window is as follows:
  • Load the trained CatBoost-based prediction models for compressive strength and peak strain.
  • Traverse all combinations of steel-fiber volume fraction s ∈ [0,2%] and PVA-fiber volume fraction p ∈ [0,1.5%]. For each combination, calculate the hybrid-fiber response f(s,p), the single-fiber responses f(s,0) and f(0,p), and the fiber-free baseline response f(0,0).
  • Substitute these responses into the synergy-effect formula, Δ(s,p) = f(s,p) − f(s,0) − f(0,p) + f(0,0), and retain the combinations with Δ(s,p) > 0.
  • Apply density-based clustering to the retained combinations to determine a boundary interval of the synergy-gain window.
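The grid traversal and synergy formula in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_predict` is a hypothetical stand-in for the trained CatBoost model, the background matrix is synthetic, the column indices and grid are illustrative, and the remaining features are marginalized by averaging predictions over B randomly drawn background rows. The density-based clustering step is omitted.

```python
import numpy as np

def mean_synergy_map(predict, background, steel_grid, pva_grid,
                     steel_col, pva_col, n_samples=100, seed=0):
    """Mean synergy gain D(s, p) = f(s, p) - f(s, 0) - f(0, p) + f(0, 0)."""
    rng = np.random.default_rng(seed)
    rows = rng.choice(len(background), size=n_samples, replace=True)
    base = np.asarray(background, dtype=float)[rows]  # B background samples

    def marginal(s, p):
        # Fix the two fiber fractions and average over the background rows.
        x = base.copy()
        x[:, steel_col] = s
        x[:, pva_col] = p
        return float(np.mean(predict(x)))

    f00 = marginal(0.0, 0.0)
    f_s0 = [marginal(s, 0.0) for s in steel_grid]
    f_0p = [marginal(0.0, p) for p in pva_grid]
    delta = np.empty((len(steel_grid), len(pva_grid)))
    for i, s in enumerate(steel_grid):
        for j, p in enumerate(pva_grid):
            delta[i, j] = marginal(s, p) - f_s0[i] - f_0p[j] + f00
    return delta

def toy_predict(x):
    """Hypothetical response with a known interaction term 3 * steel * pva."""
    return 40.0 + 0.1 * x[:, 0] + 5.0 * x[:, 1] + 2.0 * x[:, 2] + 3.0 * x[:, 1] * x[:, 2]

# Synthetic background data: columns are (other feature, steel %, PVA %).
background = np.random.default_rng(1).uniform(0.0, 2.0, size=(50, 3))
steel_grid = np.linspace(0.0, 2.0, 17)  # illustrative grid, step 0.125
pva_grid = np.linspace(0.0, 2.0, 17)
delta = mean_synergy_map(toy_predict, background, steel_grid, pva_grid,
                         steel_col=1, pva_col=2)
positive = delta > 1e-9  # small tolerance guards against round-off on the axes
```

For the toy response, the additive terms cancel and the map recovers the interaction term alone, which is the property the formula is designed to isolate. With a real model, fiber-geometry features and zero-dosage indicator variables would also have to be set consistently for the single-fiber and fiber-free cases; how the study handles this is not described in this appendix.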

Appendix A.2. Parameter Settings of the Transfer-Learning Model

Table A3. Parameter settings of the transfer-learning model for εc prediction.
Parameter Category | Parameter | Value
Pre-trained model parameters | Number of fixed layers | 726
Pre-trained model parameters | Input feature dimension | 20
Pre-trained model parameters | Leaf-feature dimension | 726
Pre-trained model parameters | Source of pre-trained weights | None
Training configuration | Regularization coefficient (alpha) | 0.001
Training configuration | Batch size | Full-batch training
Training configuration | Maximum iterations (max_iter) | 20,000
Validation strategy | Early stopping | Not applicable
Validation strategy | Validation split ratio | 0.2

Appendix A.3. Hyperparameter Settings and Optimization Results

Table A4. Search space and optimal hyperparameters of the Bayesian-optimized CatBoost model.
Hyperparameter | Search Range | Optimal Value
learning_rate | (0.01, 0.05) | 0.0452
depth | (4, 6) | 6
iterations | (3000, 4500) | 4018
l2_leaf_reg | (10, 20) | 15
min_data_in_leaf | (10, 16) | 14
random_strength | (0.2, 0.6) | 0.5616
subsample | (0.8, 1) | 0.9955
colsample_bylevel | (0.7, 0.9) | 0.898
Note: This table reports the final hyperparameter combination selected for the peak-strain task using five-fold cross-validation, while taking advantage of CatBoost’s suitability for handling categorical features. This information supports the reproducibility of model training.
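For reproducibility, the values in Table A4 can be written as a configuration fragment. The sketch below only transcribes the table and checks that each optimum lies within its search range. The helper `within_search_space` is hypothetical, and passing the dictionary to `CatBoostRegressor(**best_params)` is an assumed usage that is mentioned only in a comment, because the original training script is not part of this article.

```python
# Search ranges and selected values transcribed from Table A4.
search_space = {
    "learning_rate": (0.01, 0.05),
    "depth": (4, 6),
    "iterations": (3000, 4500),
    "l2_leaf_reg": (10, 20),
    "min_data_in_leaf": (10, 16),
    "random_strength": (0.2, 0.6),
    "subsample": (0.8, 1.0),
    "colsample_bylevel": (0.7, 0.9),
}

best_params = {
    "learning_rate": 0.0452,
    "depth": 6,
    "iterations": 4018,
    "l2_leaf_reg": 15,
    "min_data_in_leaf": 14,
    "random_strength": 0.5616,
    "subsample": 0.9955,
    "colsample_bylevel": 0.898,
}

def within_search_space(params, space):
    """True if every selected value lies inside its (low, high) search range."""
    return all(space[name][0] <= value <= space[name][1]
               for name, value in params.items())

# Assumed usage (requires the catboost package; not run here):
#   model = CatBoostRegressor(**best_params)
ok = within_search_space(best_params, search_space)
```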

Appendix A.4. Stability of SHAP Importance Rankings

To examine the sensitivity of feature-interpretation results to random data splitting, this study repeated the 8:2 data split, model training, and SHAP ranking analysis under 10 different random seeds. The results show that, in both tasks, the rankings of the main features remain generally stable, and the top-ranked features exhibit only small fluctuations. The Kendall coefficient of concordance further indicates a high degree of consistency in the feature importance rankings across the 10 repeated experiments, suggesting that the corresponding interpretation results are robust (see Table A5 and Table A6, Figure A2 and Figure A3).
Table A5. Feature importance rank consistency analysis across 10 repeated random splits (σc).
Feature | Mean Rank | SD Rank | Best Rank | Worst Rank | Mean |SHAP|
W/B | 1.00 | 0.00 | 1.0 | 1.0 | 6.2785
SP | 2.10 | 0.32 | 2.0 | 3.0 | 3.4681
V_STF | 2.90 | 0.32 | 2.0 | 3.0 | 2.9420
SF | 4.20 | 0.42 | 4.0 | 5.0 | 2.2659
FA | 5.80 | 1.14 | 4.0 | 7.0 | 1.6938
SF_zero | 6.10 | 1.45 | 5.0 | 8.0 | 1.7252
D_PVA | 7.00 | 1.25 | 5.0 | 9.0 | 1.5527
f_PVA | 7.40 | 1.26 | 6.0 | 9.0 | 1.4286
S/B | 9.20 | 1.03 | 7.0 | 10.0 | 1.2116
E_STF | 10.20 | 1.81 | 8.0 | 14.0 | 1.0532
V_PVA | 11.00 | 0.82 | 10.0 | 12.0 | 0.9666
E_PVA | 11.60 | 1.26 | 9.0 | 13.0 | 0.8899
f_STF | 12.60 | 0.70 | 11.0 | 13.0 | 0.7050
L_STF | 14.50 | 0.71 | 14.0 | 16.0 | 0.3510
D_STF | 14.70 | 0.95 | 13.0 | 16.0 | 0.3769
L_PVA | 16.50 | 1.27 | 15.0 | 19.0 | 0.2018
FA_zero | 17.00 | 1.15 | 15.0 | 19.0 | 0.1475
STF_zero | 17.70 | 1.16 | 16.0 | 20.0 | 0.1230
SP_zero | 19.10 | 0.74 | 18.0 | 20.0 | 0.0334
PVA_zero | 19.40 | 0.84 | 18.0 | 20.0 | 0.0368
Table A6. Feature importance rank consistency analysis across 10 repeated random splits (εc).
Feature | Mean Rank | SD Rank | Best Rank | Worst Rank | Mean |SHAP|
S/B | 1.00 | 0.00 | 1.0 | 1.0 | 0.0851
V_PVA | 2.10 | 0.32 | 2.0 | 3.0 | 0.0320
FA | 3.50 | 0.97 | 3.0 | 6.0 | 0.0225
V_STF | 4.00 | 1.15 | 2.0 | 6.0 | 0.0190
f_PVA | 6.00 | 2.00 | 4.0 | 10.0 | 0.0147
D_PVA | 6.50 | 1.84 | 5.0 | 10.0 | 0.0136
SP | 6.60 | 1.17 | 4.0 | 8.0 | 0.0139
SF | 7.30 | 1.83 | 5.0 | 11.0 | 0.0126
W/B | 9.70 | 2.00 | 7.0 | 13.0 | 0.0094
SF_zero | 9.90 | 1.20 | 8.0 | 12.0 | 0.0099
D_STF | 12.30 | 2.98 | 8.0 | 16.0 | 0.0069
FA_zero | 12.50 | 2.32 | 9.0 | 16.0 | 0.0071
E_STF | 12.70 | 1.57 | 10.0 | 15.0 | 0.0072
L_STF | 14.10 | 1.79 | 11.0 | 17.0 | 0.0055
E_PVA | 14.10 | 2.02 | 11.0 | 16.0 | 0.0055
f_STF | 14.20 | 1.87 | 11.0 | 17.0 | 0.0058
STF_zero | 16.80 | 1.40 | 14.0 | 19.0 | 0.0031
PVA_zero | 18.10 | 0.88 | 17.0 | 20.0 | 0.0020
L_PVA | 19.10 | 0.74 | 18.0 | 20.0 | 0.0010
SP_zero | 19.50 | 0.71 | 18.0 | 20.0 | 0.0009
Note: Kendall’s W = 0.9716; chi-square = 184.61; p < 0.001.
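The concordance statistic in the note can be computed from a matrix of per-split rankings. The sketch below uses the standard tie-free formula W = 12S / (m²(n³ − n)) and the large-sample statistic χ² = m(n − 1)W with n − 1 degrees of freedom. The function name and the example ranking matrix are hypothetical, and whether the study applied a tie correction is not stated.

```python
import numpy as np

def kendalls_w(ranks):
    """Kendall's W for an (m splits x n features) matrix of tie-free ranks 1..n."""
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    s = float(((rank_sums - rank_sums.mean()) ** 2).sum())
    w = 12.0 * s / (m ** 2 * (n ** 3 - n))
    chi_square = m * (n - 1) * w  # approximately chi-square with n - 1 d.o.f.
    return w, chi_square

# Hypothetical case: 10 splits that rank 20 features identically.
identical = np.tile(np.arange(1, 21), (10, 1))
w_identical, chi_identical = kendalls_w(identical)

# Arithmetic check of the reported statistic for m = 10 splits, n = 20 features.
chi_from_reported_w = 10 * (20 - 1) * 0.9716
```

The last line gives about 184.60, which agrees with the reported 184.61 to within the rounding of W.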
Figure A2. Feature importance ranking heatmap for the compressive-strength (σc) model.
Figure A3. Feature importance ranking heatmap for the peak-strain (εc) model.

Appendix A.5. Empirical Support for SHAP Dependence Regions (Density, Discrete Levels, and Tail Coverage) for Compressive Strength (σc) Using Train + Test Combined Data

Table A7. Empirical support for SHAP dependence regions for compressive strength (σc) using the combined train + test data.
Feature (unit) | K (Rounded Levels) | P5/P50/P95 | Tail n (<P5/>P95) | Top-3 Levels (Share%)
V_STF (%) | 24 | 0/0.8/1.7 | 0/18 | 0 (21.4%); 1 (19.9%); 0.5 (11.6%)
f_PVA (MPa) | 8 | 1300/1560/1850 | 16/0 | 1560 (38.8%); 1600 (33.2%); 1620 (9.8%)
D_PVA (mm) | 7 | 0.02/0.04/0.04 | 2/10 | 0.04 (69.0%); 0.039 (9.8%); 0.02 (9.6%)
E_PVA (GPa) | 11 | 30/40/42.8 | 13/18 | 41 (32.0%); 40 (28.5%); 42.8 (9.8%)
V_PVA (%) | 25 | 0/0.5/2 | 0/1 | 0 (22.2%); 1 (19.1%); 0.5 (9.6%)
f_STF (MPa) | 13 | 600/2000/2850 | 0/8 | 2800 (22.2%); 2000 (16.1%); 2850 (11.6%)
Note: All features have n = 397. P5, P50, and P95 denote the 5th, 50th (median), and 95th percentiles. Tail n reports the number of observations below P5 and above P95. Top-3 levels show the most frequent rounded values and their sample shares.

Appendix A.6. Sensitivity to the Number of Monte Carlo Samples

This study conducted a convergence analysis on the number of Monte Carlo samples, and the results are presented in Figure A4. The results show that when B ≥ 80, the fluctuation of the single-fiber main-effect curves remains below 2%, which satisfies the requirement for statistical stability. To strike a balance between computational efficiency and result reliability, B = 100 was ultimately selected for the subsequent analysis.
Figure A4. Convergence analysis of the number of Monte Carlo samples.

Appendix A.7. Robustness Analysis of the Synergy-Gain Surface

A further local shape-robustness analysis was conducted for the σc synergy-gain surface using B = 100, as adopted in the main text. With the trained model, the combined train+test dataset, the two-dimensional grid range (0–2% × 0–2%), and the grid resolution (Δs = Δp = 0.125%) kept unchanged, pairwise comparisons were performed among the synergy-gain surfaces obtained with B = 80, 100, and 120.
The results show that the overall shapes of the synergy-gain surfaces across varying B values are highly consistent (see Table A8). The mean Pearson correlation coefficients are all at least 0.9989, and the mean Spearman rank correlation coefficients are all at least 0.9965. This indicates that the synergy pattern identified in the main text is not sensitive to small variations in B around 100.
Combined with the convex-hull-based data-support domain shown in Figure 9, the adopted 0–2% grid is broadly consistent with the main supported region of the current σc dataset. Therefore, using B = 100 in the main text provides a good balance between computational efficiency and map robustness. Further results on the shape robustness of the synergy-gain surface under different Monte Carlo sample sizes are presented in Table A9.
Table A8. Local shape-robustness check of the σc synergy-gain surface around the adopted Monte Carlo sample size (B = 100).
Pair of B Values | Surface Correlation, Pearson r | Spearman Rank Correlation, ρ
80 vs. 100 | 0.9995 ± 0.0004 | 0.9974 ± 0.0010
100 vs. 120 | 0.9996 ± 0.0004 | 0.9980 ± 0.0008
80 vs. 120 | 0.9989 ± 0.0010 | 0.9965 ± 0.0018
Note: The comparison was performed under the same trained model, combined train+test dataset, grid range (0–2% × 0–2%), and grid resolution (17 × 17, Δs = Δp = 0.125%). Only the Monte Carlo sample size B was varied.
Table A9. Shape robustness of the σc synergy-gain surfaces around the adopted Monte Carlo sample size (B = 100).
Pair of B Values | Surface Correlation, Pearson r | Spearman Rank Correlation, ρ | Positive-Window IoU
80 vs. 100 | 0.9995 ± 0.0004 | 0.9974 ± 0.0010 | 0.3842 ± 0.0550
100 vs. 120 | 0.9996 ± 0.0004 | 0.9980 ± 0.0008 | 0.3703 ± 0.0571
80 vs. 120 | 0.9989 ± 0.0010 | 0.9965 ± 0.0018 | 0.3721 ± 0.0646
Note: The comparison was conducted using the same trained model, combined train+test dataset (N=397), grid range (0–2% × 0–2%), and grid resolution (17 × 17, Δs = Δp = 0.125%). Only the Monte Carlo sample size B was changed. In the present σc dataset, the convex-hull-based support domain coincided with the adopted full grid; therefore, the full-grid and support-domain statistics were numerically identical.
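The three comparison metrics in Tables A8 and A9 can be computed for any pair of gridded surfaces as sketched below. The function names and the small example arrays are hypothetical. The positive-window IoU is assumed here to be the intersection-over-union of the grid cells with Δ > 0 on each surface, which is the usual meaning of the term but is not defined explicitly in the source.

```python
import numpy as np

def average_ranks(values):
    """Ranks 1..n of a flattened array, with tied values sharing their mean rank."""
    v = np.asarray(values, dtype=float).ravel()
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v))
    ranks[order] = np.arange(1, len(v) + 1)
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def compare_surfaces(a, b):
    """Pearson r, Spearman rho, and positive-window IoU of two synergy maps."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pearson = float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
    spearman = float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])
    pos_a, pos_b = a > 0, b > 0
    union = int((pos_a | pos_b).sum())
    iou = float((pos_a & pos_b).sum() / union) if union else 1.0
    return pearson, spearman, iou

# Hypothetical 2 x 2 surfaces (not results from the study).
surface_a = np.array([[0.0, 1.0], [2.0, -1.0]])
surface_b = np.array([[0.0, -1.0], [2.0, 1.0]])
same_shape = compare_surfaces(surface_a, 2.0 * surface_a)
shifted_window = compare_surfaces(surface_a, surface_b)
```

The example illustrates why the correlation and IoU columns can differ: correlations measure the overall shape, whereas the IoU depends only on which cells cross the zero threshold, so small changes in near-zero cells affect it much more strongly.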

Appendix A.8. Empirical Support for SHAP Dependence Regions (Density, Discrete Levels, and Tail Coverage)

Table A10. Empirical support for SHAP dependence regions for peak strain (εc) using the combined train + test data.
| Feature (unit) | K (Rounded Levels) | P5/P50/P95 | Tail n (<P5/>P95) | Top-3 Levels (Share %) |
|---|---|---|---|---|
| V_STF (%) | 19 | 0/0.8/1.5 | 0/8 | 1 (18.2%); 0 (18.2%); 0.5 (12.8%) |
| f_PVA (MPa) | 4 | 1300/1560/1620 | 0/0 | 1560 (40.4%); 1600 (31.5%); 1300 (14.3%) |
| D_PVA (mm) | 3 | 0.03/0.04/0.04 | 10/0 | 0.04 (83.3%); 0.03 (11.8%); 0.02 (4.9%) |
| V_PVA (%) | 18 | 0/0.5/1.7 | 0/6 | 1 (20.7%); 0 (19.2%); 1.7 (11.8%) |
| D_STF (mm) | 6 | 0.2/0.2/0.75 | 0/9 | 0.2 (53.7%); 0.6 (20.2%); 0.75 (8.9%) |
| L_STF (mm) | 8 | 13/13/50 | 0/9 | 13 (53.7%); 36 (12.3%); 50 (8.9%) |
Note: All features have n = 203. P5, P50, and P95 denote the 5th, 50th (median), and 95th percentiles. Tail n reports the number of observations below P5 and above P95. Top-3 levels show the most frequent rounded values and their sample shares.
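The quantities in Table A10 can be computed along the following lines. This is a sketch under stated assumptions: values are rounded to two decimals before levels are counted, and NumPy's default linear-interpolation percentile is used; neither choice is specified in the note above. The function name and the toy values are hypothetical.

```python
from collections import Counter

import numpy as np


def support_summary(values, decimals=2, top=3):
    """Summarize the empirical support of one feature.

    ASSUMPTIONS: levels are counted after rounding to `decimals` places, and
    percentiles use NumPy's default (linear interpolation) method.
    """
    x = np.asarray(values, dtype=float)
    p5, p50, p95 = np.percentile(x, [5, 50, 95])
    levels = Counter(np.round(x, decimals).tolist())
    top_levels = [
        (level, 100.0 * count / x.size) for level, count in levels.most_common(top)
    ]
    return {
        "K": len(levels),                                   # number of distinct rounded levels
        "percentiles": (float(p5), float(p50), float(p95)),
        "tail_n": (int((x < p5).sum()), int((x > p95).sum())),
        "top_levels": top_levels,                           # (level, share in %)
    }


# Toy diameters in mm (20 records); these are NOT the study's data.
toy_diameters = [0.04] * 17 + [0.03] * 2 + [0.02]
summary = support_summary(toy_diameters)
print(summary)
```

A feature whose top level holds most of the samples, as in the toy example, has little empirical support away from that level, so the corresponding part of a SHAP dependence plot should be read with caution.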

References

  1. Zhou, Y.; Xiao, Y.; Gu, A.; Zhong, G.; Feng, S. Orthogonal experimental investigation of steel-PVA fiber-reinforced concrete and its uniaxial constitutive model. Constr. Build. Mater. 2019, 197, 615–625.
  2. Liu, F.; Ding, W.; Qiao, Y. Experimental investigation on the tensile behavior of hybrid steel-PVA fiber reinforced concrete containing fly ash and slag powder. Constr. Build. Mater. 2020, 241, 118000.
  3. Abbas, Y.M.; Hussain, L.A.; Khan, M.I. Constitutive Compressive Stress-Strain Behavior of Hybrid Steel-PVA High-Performance Fiber-Reinforced Concrete. J. Mater. Civ. Eng. 2022, 34, 04021401.
  4. Wu, J.; Zhang, W.; Han, J.; Liu, Z.; Liu, J.; Huang, Y. Experimental Study on the Flexural Performance of Steel–Polyvinyl Alcohol Hybrid Fiber-Reinforced Concrete. Materials 2024, 17, 3099.
  5. Kang, M.C.; Yoo, D.Y.; Gupta, R. Machine learning-based prediction for compressive and flexural strengths of steel fiber-reinforced concrete. Constr. Build. Mater. 2021, 266, 121117.
  6. Al-Shamasneh, A.R.; Mahmoodzadeh, A.; Karim, F.K.; Saidani, T.; Alghamdi, A.; Alnahas, J.; Sulaiman, M. Application of machine learning techniques to predict the compressive strength of steel fiber reinforced concrete. Sci. Rep. 2026, 16, 1901.
  7. Sofos, F.; Papakonstantinou, C.G.; Valasaki, M.; Karakasidis, T.E. Fiber-reinforced polymer confined concrete: Data-driven predictions of compressive strength utilizing machine learning techniques. Appl. Sci. 2022, 13, 567.
  8. Cui, R.; Yang, H.; Li, J.; Xiao, Y.; Yao, G.; Yu, Y. Machine learning-based prediction of compressive strength in circular FRP-confined concrete columns. Front. Mater. 2024, 11, 1408670.
  9. Zhang, J.; Huang, Y.; Wang, Y.; Ma, G. Multi-objective optimization of concrete mixture proportions using machine learning and metaheuristic algorithms. Constr. Build. Mater. 2020, 253, 119208.
  10. Zhang, J.; Huang, Y.; Ma, G.; Nener, B. Mixture optimization for environmental, economical and mechanical objectives in silica fume concrete: A novel framework based on machine learning and a new meta-heuristic algorithm. Resour. Conserv. Recycl. 2021, 167, 105395.
  11. Fan, M.; Li, Y.; Shen, J.; Jin, K.; Shi, J. Multi-objective optimization design of recycled aggregate concrete mixture proportions based on machine learning and NSGA-II algorithm. Adv. Eng. Softw. 2024, 192, 103631.
  12. Li, W. Study on Mechanical Properties of Steel–PVA Hybrid Fiber Reinforced Cementitious Composites. Master’s Thesis, Guangxi University, Nanning, China, 2024. (In Chinese)
  13. Wang, Z. Studies on Mechanical Performance of Polyvinyl Alcohol-Steel Hybrid Fiber Reinforced Cementitious Composites. Ph.D. Thesis, Tsinghua University, Beijing, China, 2016. (In Chinese)
  14. Wang, Z.; Zhang, J.; Wang, Q. Mechanical properties and crack width control of hybrid fiber reinforced ductile cementitious composites. J. Build. Mater. 2018, 21, 216–221+227. (In Chinese)
  15. Sun, L.; Hao, Q.; Zhao, J.; Wu, D.; Yang, F. Stress strain behavior of hybrid steel-PVA fiber reinforced cementitious composites under uniaxial compression. Constr. Build. Mater. 2018, 188, 349–360.
  16. Liu, W.; Han, J. Experimental Investigation on Compressive Toughness of the PVA-Steel Hybrid Fiber Reinforced Cementitious Composites. Front. Mater. 2019, 6, 108.
  17. Liu, W.; Xu, A.; Han, J. Experimental study on the compressive behavior of PVA–steel hybrid fiber reinforced cementitious composites. J. Heilongjiang Univ. Technol. (Compr. Ed.) 2024, 24, 121–128. (In Chinese)
  18. Hao, Q. Research on the Constitutive Model of Steel–PVA Hybrid Fiber Reinforced Cementitious Composites. Master’s Thesis, Wenzhou University, Wenzhou, China, 2017. (In Chinese)
  19. Zhong, G.; Zhou, Y.; Xiao, Y. Study on the uniaxial stress–strain curve of steel–polyvinyl alcohol hybrid fiber concrete. Eng. Mech. 2020, 37, 111–120. (In Chinese)
  20. Liu, Y.N.; Li, H.; Li, H.W. Experimental study and constitutive modeling of fine steel fiber/PVA hybrid cement-based composites under uniaxial compression. Chin. Q. Mech. 2021, 42, 317–325. (In Chinese)
  21. Kuang, W.; Tan, Z.; Li, Y.; Li, X.; Liu, F. Study on the compressive behavior of steel–PVA fiber high-strength manufactured-sand concrete. Guangzhou Archit. 2025, 53, 71–77. (In Chinese)
  22. Hu, J. Study on the Mechanical Properties of Steel–Polyvinyl Alcohol Hybrid Fiber Reinforced Cementitious Composites. Master’s Thesis, Kunming University of Science and Technology, Kunming, China, 2023. (In Chinese)
  23. Zhao, X. Study on the Mechanical Properties of PVA–Steel Fiber Reinforced Cement-Based Materials. Master’s Thesis, Harbin Institute of Technology, Harbin, China, 2020. (In Chinese)
  24. Gao, C. Experimental Study on Mix Proportion and Material Properties of PVA–Steel Hybrid Fiber Reinforced Cementitious Composites. Master’s Thesis, Lanzhou University of Technology, Lanzhou, China, 2022. (In Chinese)
  25. Sree, K.S.S.; Koniki, S. Mechanical Properties of PVA & Steel Hybrid Fiber Reinforced Concrete. E3S Web Conf. 2021, 309, 01174.
  26. Ju, Y.; Zhu, M.; Zhang, X.; Wang, D. Influence of steel fiber and polyvinyl alcohol fiber on properties of high performance concrete. Struct. Concr. 2022, 23, 1687–1703.
  27. Zhang, X.; Wang, B.; Ju, Y.; Wang, D.; Zhu, M. Experimental Study and New Model for Flexural Parameters of Steel–PVA High-Performance Fiber–Reinforced Concrete. J. Mater. Civ. Eng. 2023, 35, 04023016.
  28. Sanchayan, S.; Foster, S.J. High temperature behaviour of hybrid steel–PVA fibre reinforced reactive powder concrete. Mater. Struct. 2016, 49, 769–782.
  29. Xu, Q.; Jiang, X.; Zhang, Z.; Xu, C.; Zhang, J.; Zhou, B.; Hang, W.; Zheng, Z. Experimental study on residual mechanical properties of steel-PVA hybrid fiber high performance concrete after high temperature. Constr. Build. Mater. 2025, 458, 139735.
  30. Wang, J. Experimental Study on the Effects of PVA Fiber and Steel Fiber on the Fracture Properties of High-Performance Fiber-Reinforced Cementitious Composites. Master’s Thesis, Beijing Jiaotong University, Beijing, China, 2011.
  31. Zhang, P.; Deng, R.; Hu, J.; Wu, L.; Tao, Z. Flexural performance of steel–PVA hybrid fiber engineered cementitious composites. Bull. Chin. Ceram. Soc. 2023, 42, 3125–3134. (In Chinese)
  32. Ding, Y. The Shock Compression Dynamic Performance Experimental Study of Steel and PVA Hybrid Fiber Reinforced Cement Matrix Composites. Master’s Thesis, South China University of Technology, Guangzhou, China, 2014. (In Chinese)
  33. Chen, G.; Lv, M.; Zhu, H.; Zhang, J.; Zhang, L. Towards compressive and tensile strengths of hybrid steel and PVA fibre-reinforced cementitious composites: Experimental and analytical. Case Stud. Constr. Mater. 2025, 22, e04301.
  34. Li, S.; Ding, D.; He, S.; Lu, J.; Xiong, Z.; Wu, N. Research on fracture performance of steel–PVA hybrid fiber high-strength manufactured-sand concrete. Build. Struct. 2025, 55, 47–54. (In Chinese)
  35. Sun, J.; Zhao, Y.; Li, L.; Tian, L. Research on the influence of steel–PVA fiber volume fraction on the mechanical properties of concrete. Concrete 2025, 96–103. (In Chinese)
  36. BS EN 1992-1-1:2004; Eurocode 2: Design of Concrete Structures—Part 1-1: General Rules and Rules for Buildings. British Standards Institution (BSI): London, UK, 2004.
  37. Chen, P.; Liu, C.; Wang, Y. Size effect on peak axial strain and stress-strain behavior of concrete subjected to axial compression. Constr. Build. Mater. 2018, 188, 645–655.
  38. Bolón-Canedo, V.; Remeseiro, B. Feature selection in image analysis: A survey. Artif. Intell. Rev. 2020, 53, 2905–2931.
  39. Kabir, H.; Garg, N. Machine learning enabled orthogonal camera goniometry for accurate and robust contact angle measurements. Sci. Rep. 2023, 13, 1497.
  40. Kazemi, F.; Özyüksel Çiftçioğlu, A.; Shafighfard, T.; Asgarkhani, N.; Jankowski, R. RAGN-R: A multi-subject ensemble machine-learning method for estimating mechanical properties of advanced structural materials. Comput. Struct. 2025, 308, 107657.
  41. Xiao, S.; Yang, J.; Liu, Z.; Yang, W.; He, J. Effects of steel fiber content on compressive properties and constitutive relation of ultra-high performance shotcrete (UHPSC). Buildings 2024, 14, 1503.
Figure 1. CDF comparison of compressive strength (σc) for train and test sets.
Figure 2. Technical roadmap of the five-stage machine learning framework for analyzing fiber-reinforced concrete performance.
Figure 3. Global feature importance ranking for σc prediction (train vs. test).
Figure 4. Fiber-related feature importance ranking for σc prediction.
Figure 5. SHAP dependence plots and marginal histograms of key steel-fiber and PVA-fiber variables for the σc task.
Figure 6. Single-fiber main-effect curves for σc under Monte Carlo marginalization (B = 100).
Figure 7. Bivariate partial dependence plot showing the interaction of steel-fiber and PVA-fiber volume fractions in the prediction of compressive strength.
Figure 8. Mean synergy-gain surface Δ̄(s, p) for σc with the Δ̄ = 0 boundary and the maximum point marked.
Figure 9. Overlay of the σc synergy boundary and the convex-hull-based data-support domain.
Figure 10. Global feature importance ranking for εc prediction (train vs. test).
Figure 11. Fiber-related feature importance ranking for εc prediction.
Figure 12. SHAP dependence plots of key variables for εc.
Figure 13. Mean synergy-gain surface and data-support overlay for εc: (a) mean synergy-gain heatmap; (b) overlay of the Δ̄ = 0 boundary and the convex-hull-based data-support domain.
Figure 14. Overlay of the σc and εc synergy windows with dual-objective contours.
Figure 15. Pareto-front-based trade-off analysis between predicted compressive strength (σc) and peak strain (εc), including the training-set Pareto front, test-set re-evaluation, and representative candidate mixtures.
Table 1. Core literature sources and sample counts of the σc dataset.
| No. | Literature Sources | Number of Specimens | Proportion of Dataset |
|---|---|---|---|
| 1 | Zhou et al. (2018) [1] | 17 | 4.28% |
| 2 | Abbas et al. (2022) [3] | 19 | 4.79% |
| 3 | Li (2024) [12] | 18 | 4.53% |
| 4 | Wang (2016) [13] | 20 | 5.04% |
| 5 | Sun et al. (2018) [15] | 24 | 6.05% |
| 6 | Liu et al. (2019) [16] | 19 | 4.79% |
| 7 | Liu et al. (2024) [17] | 22 | 5.54% |
| 8 | Hao et al. (2025) [18] | 27 | 6.80% |
| 9 | Zhong et al. (2020) [19] | 17 | 4.28% |
| 10 | Zhao (2020) [23] | 16 | 4.03% |
| 11 | Gao (2022) [24] | 36 | 9.07% |
| 12 | Ju et al. (2022) [26] | 17 | 4.28% |
| 13 | Zhang et al. (2023) [27] | 16 | 4.03% |
| 14 | Wang et al. (2011) [30] | 24 | 6.05% |
| 15 | Chen et al. (2025) [33] | 14 | 3.53% |
| 16 | Sun et al. (2025) [35] | 25 | 6.30% |
Note: The full database comprises 26 literature sources; sources contributing less than 3.5% of the dataset are not listed.
Table 2. Definition of input and target variables.
| Feature Category | Abbreviation | Physical Meaning (Unit) |
|---|---|---|
| Cementitious material | FA | Fly Ash Content (%) |
| Binary indicator (0/1) | FA_zero | Fly Ash Addition Marker |
| Cementitious material | SF | Silica Fume Content (%) |
| Binary indicator (0/1) | SF_zero | Silica Fume Addition Marker |
| Mix-proportion parameter | W/B | Water to Binder Ratio (-) |
| Mix-proportion parameter | S/B | Sand to Binder Ratio (-) |
| Chemical admixture | SP | Superplasticizer Content (%) |
| Binary indicator (0/1) | SP_zero | Superplasticizer Addition Marker |
| Steel-fiber parameter | D_STF | Steel Fiber Diameter (mm) |
| Steel-fiber parameter | L_STF | Steel Fiber Length (mm) |
| Steel-fiber parameter | f_STF | Steel Fiber Tensile Strength (MPa) |
| Steel-fiber parameter | E_STF | Steel Fiber Elastic Modulus (GPa) |
| Steel-fiber parameter | V_STF | Steel Fiber Volume Fraction (%) |
| Binary indicator (0/1) | STF_zero | Steel Fiber Addition Marker |
| PVA-fiber parameter | D_PVA | PVA Fiber Diameter (mm) |
| PVA-fiber parameter | L_PVA | PVA Fiber Length (mm) |
| PVA-fiber parameter | f_PVA | PVA Fiber Tensile Strength (MPa) |
| PVA-fiber parameter | E_PVA | PVA Fiber Elastic Modulus (GPa) |
| PVA-fiber parameter | V_PVA | PVA Fiber Volume Fraction (%) |
| Binary indicator (0/1) | PVA_zero | PVA Fiber Addition Marker |
| Target variable | σc | Compressive Strength (MPa) |
| Target variable | εc | Peak Strain (%) |
Note: 1. The percentages of FA and SF are calculated based on the mass of cement; 2. The volume fractions of V_STF and V_PVA are determined by the total volume of concrete; 3. σc denotes the target compressive strength; 4. εc denotes the target peak strain.
Table 3. Conversion coefficients for specimen-size/shape normalization.
| No. | Specimen Type | Non-Standard Dimensions | Conversion Coefficient |
|---|---|---|---|
| 1 | Cube | 70.7 mm | 0.95 |
| 2 | Cube | 100 mm | 0.97 |
| 3 | Cube | 150 mm | 1.00 |
| 4 | Cylinder | 100 (d) × 200 (h) | 1.00 |
| 5 | Cylinder | 150 (d) × 300 (h) | 1.05 |
Note: Since no explicit size-effect conversion standard is available for peak strain, this study used the same conversion coefficients as those listed for compressive strength in this table to approximately normalize the peak-strain records [37].
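A lookup of the Table 3 coefficients might be applied as sketched below. The sketch assumes that the normalized value is the product of the coefficient and the measured value and that the cylinder dimensions are in millimetres; neither point is stated in the table, and the function name and dictionary keys are hypothetical.

```python
# Coefficients copied from Table 3. Keys are (specimen type, dimension label);
# cylinder dimensions are ASSUMED to be in mm (diameter x height).
CONVERSION_COEFFICIENTS = {
    ("cube", "70.7"): 0.95,
    ("cube", "100"): 0.97,
    ("cube", "150"): 1.00,
    ("cylinder", "100x200"): 1.00,
    ("cylinder", "150x300"): 1.05,
}


def normalize_record(measured, specimen_type, dimensions):
    """Scale a measured value (strength or peak strain) to the reference specimen.

    ASSUMPTION: normalized value = coefficient * measured value.
    """
    key = (specimen_type.lower(), dimensions)
    if key not in CONVERSION_COEFFICIENTS:
        raise ValueError(f"No conversion coefficient listed for {key}")
    return CONVERSION_COEFFICIENTS[key] * measured


# Toy record: a 100 mm cube with a measured strength of 50.0 MPa.
normalized_strength = normalize_record(50.0, "Cube", "100")
print(f"{normalized_strength:.2f} MPa")
```

Raising an error for unlisted specimen types keeps records with unknown geometry from silently entering the database with a coefficient of 1.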
Table 4. Statistical description of key variables in the database.
| Parameter | Sample Size | Maximum | Minimum | Mean | Median |
|---|---|---|---|---|---|
| FA (%) | 397 | 0.700 | 0.000 | 0.276 | 0.200 |
| SF (%) | 397 | 0.200 | 0.000 | 0.044 | 0.000 |
| W/B | 397 | 0.550 | 0.176 | 0.342 | 0.315 |
| S/B | 397 | 2.240 | 0.200 | 0.961 | 1.000 |
| SP (%) | 397 | 0.050 | 0.000 | 0.010 | 0.008 |
| D_STF (mm) | 397 | 0.820 | 0.075 | 0.326 | 0.200 |
| L_STF (mm) | 397 | 58.000 | 13.000 | 20.713 | 13.000 |
| f_STF (MPa) | 397 | 3100.000 | 600.000 | 2055.466 | 2000.000 |
| E_STF (GPa) | 397 | 220.000 | 180.000 | 204.345 | 200.000 |
| V_STF (%) | 397 | 2.000 | 0.000 | 0.739 | 0.800 |
| D_PVA (mm) | 397 | 0.060 | 0.015 | 0.037 | 0.040 |
| L_PVA (mm) | 397 | 12.000 | 8.000 | 11.567 | 12.000 |
| f_PVA (MPa) | 397 | 1850.000 | 800.000 | 1552.217 | 1560.000 |
| E_PVA (GPa) | 397 | 43.000 | 29.000 | 39.247 | 40.000 |
| V_PVA (%) | 397 | 2.000 | 0.000 | 0.756 | 0.500 |
| σc (MPa) | 397 | 173.084 | 14.457 | 54.473 | 49.305 |
| εc (%) | 203 | 1.234 | 0.169 | 0.441 | 0.356 |
Note: Sample size is reported as number of records. All specimens were subjected to standard curing for 28 days.
Table 5. Performance comparison of candidate models for σc prediction under the internal-validation scenario.
| Model | R² | MAE (MPa) | RMSE (MPa) |
|---|---|---|---|
| Multiple Linear Regression | 0.9121 | 5.9924 | 7.8404 |
| Random Forest | 0.9703 | 2.9755 | 4.5608 |
| Extra Trees | 0.9676 | 3.1663 | 4.7587 |
| XGBoost | 0.9748 | 2.8136 | 4.2024 |
| LightGBM | 0.9783 | 2.8633 | 3.8940 |
| CatBoost | 0.9737 | 2.7409 | 4.2857 |
Table 6. Test-set performance comparison of candidate models for εc prediction.
| Modeling Strategy (Test Set) | R² | MAE | RMSE |
|---|---|---|---|
| Baseline CatBoost | 0.9575 | 0.0265 | 0.0399 |
| Transfer-learning model | 0.9650 | 0.0291 | 0.0363 |
| Bayesian-optimized CatBoost | 0.9659 | 0.0218 | 0.0358 |
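The three metrics reported in Tables 5 and 6 can be computed as in the sketch below, which assumes the standard definitions of the coefficient of determination, mean absolute error, and root-mean-square error. The function name and the toy values are hypothetical and are not the study's predictions.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, and RMSE under their standard definitions (assumed here)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                    # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": float(np.mean(np.abs(residuals))),
        "RMSE": float(np.sqrt(np.mean(residuals ** 2))),
    }


# Toy strengths in MPa; NOT taken from the database.
metrics = regression_metrics([40.0, 50.0, 60.0, 70.0], [42.0, 49.0, 61.0, 68.0])
print(metrics)
```

RMSE weights large residuals more heavily than MAE, which is one way a model can rank first on R² and RMSE while another ranks first on MAE, as LightGBM and CatBoost do in Table 5.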
Table 7. Quantitative summary of the σc mean synergy-gain surface.
| Metric | Symbol/Setting | Value |
|---|---|---|
| Grid range (Steel × PVA) | s, p | [0.0, 2.0]% × [0.0, 2.0]% |
| Grid resolution | N × N, Δ | 17 × 17, Δs = Δp = 0.125% |
| Monte Carlo samples | B | 100 |
| Global maximum mean synergy gain | max Δ̄ | 4.794912 MPa |
| Location of max mean synergy gain | (s*, p*) | (1.875%, 2.000%) |
| Positive-synergy coverage (area share) | P(Δ̄ > 0) | 1.7% |
| Mean synergy gain within positive region | E[Δ̄ ∣ Δ̄ > 0] | 3.271014 MPa |
| Global mean synergy gain | E[Δ̄] | −2.117415 MPa |
Note: Δ(s, p) = f(s, p) − f(s, 0) − f(0, p) + f(0, 0). Δ̄(s, p) is the Monte Carlo average over B = 100 samples drawn from the combined train + test dataset (n = 397), while varying only the fiber volume fractions on the grid.
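The definition in the note above can be sketched as follows. A toy prediction function stands in for the trained model, and the background rows are drawn without replacement from a synthetic 397-row table; both are assumptions made for illustration, and the column layout, function names, and coefficients are hypothetical.

```python
import numpy as np


def mean_synergy_gain(predict, background, steel_col, pva_col, grid):
    """Mean synergy gain on a square grid of (steel, PVA) volume fractions.

    For each grid point, the fiber columns of every background row are
    overwritten and the predictions averaged. Because averaging is linear,
    combining the averaged predictions equals averaging the per-row gains.
    """
    background = np.asarray(background, dtype=float)

    def mean_prediction(s, p):
        rows = background.copy()
        rows[:, steel_col] = s
        rows[:, pva_col] = p
        return float(np.mean(predict(rows)))

    f_00 = mean_prediction(0.0, 0.0)
    f_s0 = [mean_prediction(s, 0.0) for s in grid]
    f_0p = [mean_prediction(0.0, p) for p in grid]
    surface = np.empty((len(grid), len(grid)))
    for i, s in enumerate(grid):
        for j, p in enumerate(grid):
            surface[i, j] = mean_prediction(s, p) - f_s0[i] - f_0p[j] + f_00
    return surface


def toy_predict(rows):
    """Stand-in for a trained model; columns are [W/B, V_STF, V_PVA]."""
    wb, s, p = rows[:, 0], rows[:, 1], rows[:, 2]
    return 60.0 - 50.0 * wb + 4.0 * s + p + (2.0 - 4.0 * wb) * s * p


rng = np.random.default_rng(0)
data = np.column_stack([
    rng.uniform(0.176, 0.55, 397),   # synthetic W/B values
    rng.uniform(0.0, 2.0, 397),      # synthetic steel-fiber fractions (%)
    rng.uniform(0.0, 2.0, 397),      # synthetic PVA-fiber fractions (%)
])
# ASSUMPTION: B = 100 background rows sampled without replacement.
background = data[rng.choice(len(data), size=100, replace=False)]
grid = np.linspace(0.0, 2.0, 17)     # 17 points, step 0.125%

surface = mean_synergy_gain(toy_predict, background, steel_col=1, pva_col=2, grid=grid)
i_max, j_max = np.unravel_index(np.argmax(surface), surface.shape)
positive_share = float((surface > 0).mean())
print(grid[i_max], grid[j_max], float(surface.max()), positive_share)
```

In the toy model only the interaction term survives the subtraction, which is the purpose of the four-term difference: main effects of each fiber cancel, and the remaining surface isolates the part of the prediction that arises from combining the two fibers.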
Table 8. Discrete levels and dominant-level shares of key fiber-related variables in the εc interpretation (test set, n = 41).
| Feature | Range (Test) | Levels (Test) | Top-1 Share | Top-2 Share |
|---|---|---|---|---|
| V_PVA (%) | 0.00–1.70 | 11 | 22.0% | 44.0% |
| D_PVA (mm) | 0.020–0.040 | 4 | 80.5% | 90.2% |
| V_STF (%) | 0.00–2.00 | 10 | 24.4% | 46.3% |
| f_PVA (MPa) | 1300–1620 | 4 | 53.7% | 85.4% |
| D_STF (mm) | 0.200–0.820 | 6 | 56.1% | 78.0% |
| L_STF (mm) | 13–58 | 8 | 56.1% | 70.7% |
Note: Shares are reported for the test set to match the SHAP dependence plots (computed on X_test).
Table 9. Quantitative summary of the εc mean synergy-gain surface.
| Statistic | Value |
|---|---|
| Monte Carlo samples (B) | 100 |
| Grid resolution | 17 × 17 (0–2% × 0–2%) |
| Max mean synergy gain, max Δ̄ | 0.0141629 (εc units) |
| Location of max Δ̄ | Steel = 0.38%, PVA = 1.62% |
| Area fraction with Δ̄ > 0 | 17.99% |
| Mean Δ̄ over Δ̄ > 0 region | 0.00412705 |
| Mean Δ̄ over all grid points | −0.0332134 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, M.; Chen, J.; Zhou, S. Interpretable Machine Learning Reveals Synergy-Gain Windows and Dual-Objective Mix-Proportion Boundaries for Compressive Strength and Peak Strain in Hybrid Steel–PVA Fiber-Reinforced Concrete. Buildings 2026, 16, 1927. https://doi.org/10.3390/buildings16101927
