1. Introduction
Steel and PVA fibers play complementary roles in toughening concrete. Steel fibers mainly improve post-cracking load resistance and energy absorption, whereas PVA fibers are more effective in controlling microcracks and enhancing ductility. When the two fibers are properly combined in terms of properties and dosage, they may work together to improve both strength and deformation capacity. However, hybrid fiber systems involve the interaction of many factors, such as matrix composition, fiber geometry, mechanical properties, and fiber content. Because of this complexity, conventional experiments alone are often not sufficient to reveal the underlying interaction patterns in a systematic way.
In recent years, experimental studies on the mechanical performance of hybrid steel–PVA fiber-reinforced concrete have steadily increased. Zhou et al. [1] systematically investigated its uniaxial compressive constitutive behavior through an orthogonal experimental design and confirmed that the combined use of steel and PVA fibers can significantly improve failure behavior and energy dissipation. Liu et al. [2] reported that the hybrid-fiber effect is jointly governed by matrix mix proportion and fiber volume fraction. Abbas et al. [3] developed a compressive stress–strain constitutive model for hybrid steel–PVA fiber-reinforced concrete and quantified the effects of fiber parameters on peak stress and peak strain. Wu et al. [4] further demonstrated, from the perspective of flexural behavior, the synergistic advantages of hybrid fibers in post-cracking toughness and deformation capacity. These studies provide an important basis for understanding the reinforcing mechanisms of hybrid fibers. However, because of the limited experimental scope and sample size, it is still difficult to fully capture the nonlinear interactions arising from multi-factor coupling.
With the wider use of data-driven methods in building materials research, machine learning has become a common tool for predicting concrete performance and exploring hidden patterns in the data. Kang et al. [5] showed that tree-based models can capture the nonlinear behavior of fiber-reinforced concrete with good accuracy. Al-Shamasneh et al. [6] reported that ensemble learning performs robustly in predicting the compressive strength of steel fiber-reinforced concrete. Sofos et al. [7] and Cui et al. [8] further demonstrated that machine learning is applicable to complex material–structure problems involving FRP-confined concrete and related members.
Although machine learning has shown strong predictive ability in concrete research, most existing studies still treat the model as a black-box predictor. Much less attention has been given to questions that matter more in engineering practice, such as whether synergistic enhancement really exists, in which volume-fraction ranges it occurs, and whether the observed pattern is supported by sufficient data.
In response to the dimensional bottlenecks of experimental research, machine learning-driven multi-objective optimization has emerged as a key approach. Zhang et al. [9,10,11] applied Pareto fronts and metaheuristic algorithms to examine trade-offs among strength, economic performance, and other indicators in normal, silica-fume, and recycled-aggregate concrete. However, these studies have yet to address the synergistic optimization of key mechanical properties in hybrid steel–PVA fiber-reinforced concrete.
For engineering design, it is essential to clarify how fiber reinforcement works. In this study, synergy gain is defined as the super-additive effect of hybrid steel–PVA fibers relative to single-fiber reinforcement. Thus, the key question is not only whether the prediction is accurate. It also includes whether synergistic fiber enhancement exists, in which volume-fraction ranges it appears, and under what data-support conditions it can be interpreted with reasonable confidence.
Therefore, this study does not treat machine learning simply as a black-box predictor. Instead, it combines machine learning with interpretable analysis, marginal-response modeling, and synergy-gain quantification to build a knowledge-extraction framework for engineering-oriented screening. Using compressive strength (σc) and peak strain (εc) as two target indicators of load-bearing capacity and deformation capacity, respectively, this study further proposes an overlay strategy for dual-objective synergy windows. This strategy provides support for preliminary mix screening and for setting priorities in experimental validation.
The main contributions of this study are as follows: (1) an interpretable machine learning framework was established for the dual objectives of σc and εc; (2) the mean synergy-gain surface of steel and PVA fibers was defined and quantified based on Monte Carlo marginalization; and (3) the synergy boundaries of σc and εc were identified, and a dual-objective mix-proportion screening logic was constructed.
Compared with most existing studies, which focus mainly on accuracy comparison, single-indicator interpretation, or single-performance prediction, the main novelty of this study lies in the identification of dual-objective synergy windows and in the design-oriented conclusions provided for material selection and mix-proportion optimization in engineering practice.
4. Discussion
4.1. Model Performance and Analytical Positioning
As shown in Section 3, the random cross-validation results based on the available data indicate that both the σc and εc tasks achieved high fitting and predictive accuracy. This suggests that the data-driven models established from mix-proportion parameters, fiber geometric parameters, and fiber mechanical parameters can effectively capture the main nonlinear relationships in the compressive behavior of hybrid steel–PVA fiber-reinforced concrete. More importantly, the core value of this study lies not only in its high predictive accuracy, but also in transforming the prediction results into interpretable, screenable, and practically useful information that can directly support engineering applications, mix-proportion optimization, and experimental design.
Building on the analytical framework that these models provide, the following subsections examine how steel–PVA hybrid fibers modulate the compressive performance of concrete. To improve sample utilization under small-sample conditions, all 397 samples were combined for the final visualization analysis. The stability assessment of the synergy-gain window is provided in Appendix A.1.
4.2. Interpretability of Key Variables and the Differentiated Roles of Steel and PVA Fibers
SHAP analysis shows that although σc and εc are both indicators of compressive behavior, they are governed by different dominant factors. In the σc task, matrix mix-proportion factors and steel fiber volume fraction are more important. In the εc task, S/B and several PVA-related variables carry greater influence. This difference indicates that the load-bearing capacity and deformation capacity of hybrid fiber-reinforced concrete are not controlled by the same set of variables in the same manner. The former depends more on the load-bearing skeleton of the matrix and the bridging capacity across macrocracks. The latter is more sensitive to microcrack control, fiber–matrix interfacial interaction, and the effect of matrix volumetric proportioning on deformation compatibility.
From the perspective of material mechanisms, the higher elastic modulus and tensile strength of steel fibers make them more effective in post-cracking bridging and in delaying the propagation of macrocracks. This is why they exert a more direct strengthening effect in the σc model. By contrast, PVA fibers are more advantageous in suppressing microcrack initiation, improving the continuity of crack propagation, and enhancing deformation accommodation near the peak point. This is broadly consistent with the findings of previous experimental studies [1,2,3,4]. In other words, steel fibers and PVA fibers do not merely offer redundant reinforcement. Instead, they participate in the compressive failure process at different scales. This distinction, rather than simple superposition, forms the physical basis of hybridization. In addition, the role of PVA fibers should not be understood only as crack bridging during loading. It also includes early restraint of shrinkage-induced microcracks during curing. Such control of initial defects may provide an important basis for the later increase in peak strain and the delayed propagation of cracks.
Furthermore, both the SHAP dependence plots and the single-fiber main-effect curves indicate that fiber effects are strongly nonlinear. As the volume fraction increases, the strengthening effect does not continue to grow at a constant rate. Instead, it often shows diminishing marginal returns, plateauing, or even local fluctuations. This suggests that, in a multi-source literature-based dataset, the potential performance gains associated with higher fiber content may be simultaneously limited by factors such as fiber dispersion, workability, interfacial bonding, and matrix compatibility. Therefore, fiber optimization cannot be achieved simply by increasing fiber dosage. More importantly, it calls for pinpointing the parameter ranges in which the positive effects of fibers can be stably observed under the support of the available data.
Therefore, the physical validity of the interpretable results was assessed using a cautious “mechanistic consistency + data support” strategy. A SHAP trend was used as a basis for engineering interpretation only when three conditions were met: it was consistent with material mechanisms, it was supported by sufficient sample density or discrete-level coverage, and it showed a similar direction in the single-fiber main-effect curves and synergy-gain maps. Local interactions with sparse samples or those close to the data boundary were treated as optimization directions that require further engineering validation.
4.3. Engineering Implications of Synergy-Gain Windows and Dual-Objective Trade-Offs
From the synergy-gain heatmaps generated via Monte Carlo marginalization and two-dimensional grid evaluation, we observe that the synergistic enhancement between steel fibers and PVA fibers does not hold across the entire volume-fraction domain, but instead shows clear regional characteristics. This insight carries important implications for engineering practice: hybrid fiber mixtures do not inherently outperform single-fiber systems or produce simple additive effects, and only specific fiber combinations can yield true super-additive benefits. Therefore, rather than adhering to the empirical assumption that combining steel and PVA fibers will necessarily improve performance, this study advocates for a window-based and condition-dependent design strategy.
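The grid evaluation described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes that the synergy gain takes the common super-additive form G(Vs, Vp) = m(Vs, Vp) − m(Vs, 0) − m(0, Vp) + m(0, 0), where m is the model response averaged over randomly drawn background mixes. The field names `V_steel`, `V_pva`, and `w_b`, the function names, and the toy predictor are all hypothetical.

```python
import random
from statistics import fmean


def marginal_mean(predict, background, vs, vp, n_draws=200, seed=0):
    """Monte Carlo estimate of the mean response at fixed fiber contents.

    Background rows are resampled and their fiber volume fractions are
    overwritten with (vs, vp); all other mix variables keep sampled values.
    A fixed seed reuses the same draws for every (vs, vp) pair.
    """
    rng = random.Random(seed)
    draws = [rng.choice(background) for _ in range(n_draws)]
    return fmean(predict({**row, "V_steel": vs, "V_pva": vp}) for row in draws)


def synergy_gain(predict, background, vs, vp, **kw):
    """Assumed super-additive gain: hybrid effect minus both single-fiber effects."""
    def m(a, b):
        return marginal_mean(predict, background, a, b, **kw)
    return m(vs, vp) - m(vs, 0.0) - m(0.0, vp) + m(0.0, 0.0)


def synergy_surface(predict, background, vs_grid, vp_grid, **kw):
    """Evaluate the gain on a two-dimensional grid (rows: steel, columns: PVA)."""
    return [[synergy_gain(predict, background, vs, vp, **kw) for vp in vp_grid]
            for vs in vs_grid]


def positive_share(surface, tol=1e-9):
    """Fraction of grid cells whose gain exceeds a small numerical tolerance."""
    cells = [g for row in surface for g in row]
    return sum(g > tol for g in cells) / len(cells)


# Toy stand-in for a fitted model (hypothetical): an additive part plus
# an explicit steel-PVA interaction term of 3 * V_steel * V_pva.
def toy_predict(row):
    return (60.0 - 50.0 * row["w_b"] + 5.0 * row["V_steel"]
            + 2.0 * row["V_pva"] + 3.0 * row["V_steel"] * row["V_pva"])


background = [{"w_b": 0.30}, {"w_b": 0.35}, {"w_b": 0.40}, {"w_b": 0.45}]
grid = [0.0, 0.5, 1.0]  # volume fractions in percent (assumed units)
surface = synergy_surface(toy_predict, background, grid, grid)
share = positive_share(surface)
```

Because the same background draws are reused for all four terms, the additive contributions cancel and only the interaction term remains in the toy case. With a real fitted model, the surface would additionally need to be masked to the data-support domain before any window is interpreted.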
For σc, the positive-synergy region is concentrated near combinations with high steel-fiber and high PVA-fiber contents, and its area share is very small, indicating that strength synergy is strongly localized. This means that if the primary engineering objective is to achieve higher load-bearing capacity, the formulations under consideration are likely to fall in the high-volume-fraction region. However, such regions are also often closer to the data boundary and more likely to be accompanied by reduced constructability, difficulties in fiber dispersion, and increased construction risk. Therefore, the interpretation of the strength-synergy window must consider both potential benefits and application risks, instead of fixating solely on the peak value. In particular, the high-dosage synergy region close to the data boundary should be regarded as a high-potential direction for further experiments at this stage. Targeted validation tests are still needed before engineering application.
In comparison, the positive-synergy region for εc is wider, and its peak is located near combinations with low-to-moderate steel-fiber content and moderate-to-high PVA-fiber content. This indicates that ductility synergy does not rely on an extremely high steel-fiber volume fraction. Instead, it is more likely to arise from the coordinated action of a moderate amount of steel fibers and a substantial volume of PVA fibers at different stages of crack development: the former provides the necessary macro-scale bridging capacity, whereas the latter improves microcrack control and deformation compatibility. This finding suggests that, in scenarios where ductility, energy dissipation, or peak-strain enhancement is the primary objective, a moderate rather than extreme steel-fiber content is more likely to deliver stable benefits.
More importantly, the synergy windows of σc and εc do not fully overlap, which means that engineering design must inevitably address a trade-off between strength and ductility. A more reasonable strategy is to first define the minimum requirements for load-bearing capacity and deformation capacity according to the structural objective. Candidate points should then be prioritized within the dual-objective synergy region, while also remaining inside the data-support domain and relatively far from the convex-hull boundary. Combinations located near the boundary should be confirmed through additional experiments. In this way, the role of the synergy-window map is not to replace experiments, but to help guide them in a more targeted and efficient manner.
Therefore, this study recommends window-guided selection rather than point-based selection. Under the screening logic combining the Pareto front and the synergy window, combinations that lie well within the synergy region and remain far from the convex-hull boundary should be prioritized as starting points for laboratory trial mixing in engineering applications. By contrast, a combination that shows favorable mechanical performance only near the boundary should be treated as a validation target rather than as a directly recommended engineering mix.
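The window-guided screening logic can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure used in this study: the data-support domain is approximated by the two-dimensional convex hull of the sampled (steel, PVA) volume fractions, "far from the boundary" is expressed through a hypothetical threshold `min_margin`, the Pareto-front step is omitted, and the labels and candidate values are invented for demonstration. At least three non-collinear support points are assumed.

```python
from math import hypot


def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_margin(hull, p):
    """Distance from p to the hull boundary if p is inside; negative if outside."""
    margin = float("inf")
    for i in range(len(hull)):
        a, b = hull[i], hull[(i + 1) % len(hull)]
        margin = min(margin, cross(a, b, p) / hypot(b[0] - a[0], b[1] - a[1]))
    return margin


def screen(candidates, support_points, min_margin):
    """Label each candidate (vs, vp, gain_sigma, gain_eps) for trial planning."""
    hull = convex_hull(support_points)
    labels = {}
    for name, (vs, vp, gain_sigma, gain_eps) in candidates.items():
        margin = inside_margin(hull, (vs, vp))
        if gain_sigma <= 0 or gain_eps <= 0 or margin < 0:
            labels[name] = "not prioritized"      # outside window or support
        elif margin >= min_margin:
            labels[name] = "trial-mix priority"   # well inside both
        else:
            labels[name] = "validate first"       # favorable but near boundary
    return labels


# Hypothetical support points and candidates (volume fractions in percent).
support = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
candidates = {
    "A": (1.0, 1.0, 0.8, 0.4),   # inside both windows, far from boundary
    "B": (1.9, 1.0, 1.5, 0.2),   # higher gain, but close to boundary
    "C": (0.5, 1.5, -0.2, 0.6),  # no positive strength synergy
}
labels = screen(candidates, support, min_margin=0.25)
```

In this toy case, candidate B has the largest strength gain but is labeled for validation because it lies about 0.1 percentage points from the hull boundary, which mirrors the preference for window-guided over peak-based selection.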
4.4. Scope of Applicability, Limitations, and Future Work
Although this study established a relatively complete integrated workflow for prediction, interpretation, and synergy identification, its scope of applicability still needs to be clearly defined. First, the data were compiled from multiple published studies and academic theses. Although data standardization and specimen-size normalization were performed, differences in metadata may still exist across studies, including raw-material sources, curing conditions, loading rates, specimen preparation procedures, and testing equipment. These factors were not fully structured and incorporated into the model. Therefore, the patterns learned by the model should be understood, to some extent, as empirical regularities averaged over a multi-source database, rather than as an exact reproduction of a single experimental system.
Second, to ensure the reliability of the conclusions, all findings in this study are restricted to the current data-support range, and the scope of applicability will be further expanded through targeted experiments. Within this range, convex-hull coverage was used to identify high-confidence regions of the supported data domain. The conclusions on synergy effects in these regions can directly inform engineering design, and fiber-mix schemes located in them can be directly applied in the production of concrete members, which may help reduce trial-mix costs to some extent. By contrast, conclusions for samples near the convex-hull boundary should be regarded only as a basis for preliminary design and still require further experimental validation.
To further extend the current application boundary of this study, future work will proceed in three directions. First, additional experiments will be carried out in boundary regions and sparse mix-proportion intervals where data coverage is insufficient. The focus will be on verifying the mechanical stability of systems with high steel-fiber and high PVA-fiber contents, so as to provide more refined mix guidance for the production of concrete members. Second, mix-validation experiments using different batches of raw materials will be conducted to clarify the extent to which raw-material variability affects mix performance, thereby providing a quantitative basis for material substitution in engineering practice. Third, long-term performance data under different curing regimes (such as the evolution of shrinkage-induced stress and cracking) will be added to establish performance-prediction models that better reflect field conditions and further enhance the engineering applicability of the present study. In addition, the pre-peak strain energy is an important indicator for evaluating the seismic energy-dissipation potential of fiber-reinforced concrete. Previous studies [41] have shown that it can be approximately estimated using empirical relationships related to σc and εc. Therefore, extending the prediction–interpretation–synergy-identification framework developed in this study to a three-objective analysis framework for σc, εc, and the pre-peak strain energy would further improve the engineering applicability of the present findings.
From the perspective of engineering implementation, the most effective path for future improvement is not to develop more complex prediction models. Instead, it is to improve the transferability and practical usefulness of the conclusions in concrete-member production by supplementing validation experiments for boundary mix proportions and by introducing multidimensional constraints such as workability, durability, and cost.