1. Introduction
Concrete is one of the most widely used construction materials in modern infrastructure, where its mechanical properties directly influence structural safety, durability, and service life. Among these properties, compressive strength is widely regarded as the primary indicator of material performance and quality. This property is governed by multiple factors, including the proportions of cementitious materials, water content, aggregates, and curing conditions [1]. These variables interact through complex physicochemical processes during hydration and hardening, resulting in highly nonlinear and interdependent relationships that make reliable prediction prior to experimental testing inherently challenging.
In conventional practice, compressive strength is determined through standardized laboratory experiments in which concrete specimens are tested after a predefined curing period, typically around 28 days [2,3]. While these procedures provide accurate and reproducible measurements, they introduce significant delays in design optimization and construction workflows. If the measured strength does not meet design requirements, additional trial mixtures must be prepared and retested, leading to increased material consumption, higher costs, and extended project timelines. From a data-driven perspective, this problem can be interpreted as a soft-sensing task, where a difficult-to-measure material property is inferred from observable mixture parameters.
With the increasing availability of experimental datasets and advances in computational modeling, machine learning has emerged as a powerful tool for capturing complex relationships in engineering systems. By learning from historical data, these methods can model nonlinear interactions among input variables that are difficult to represent using traditional empirical formulations [4]. A wide range of machine learning techniques, including artificial neural networks, support vector machines, decision trees, and ensemble learning methods, have been applied to compressive strength prediction with promising results [5]. More recent studies have demonstrated that ensemble and boosting-based models, particularly XGBoost variants, consistently achieve strong predictive performance, while hybrid and interpretable learning approaches have been increasingly adopted to enhance both accuracy and transparency, as shown in structural engineering and physics-informed predictive modeling [6,7].
Despite these advances, several important limitations remain. First, many existing studies rely on standalone models or static hybrid combinations, which do not fully exploit complementary learning mechanisms across different regions of the feature space. Second, the interpretability of many predictive models remains limited, restricting their practical use in engineering decision-making where understanding the influence of mixture components is essential. Third, conventional feature selection approaches often prioritize relevance while neglecting redundancy and stability, which can lead to reduced robustness and sensitivity to data variability.
Recent studies have increasingly focused on integrating interpretability and physics-based reasoning into machine learning models for civil and structural engineering applications. For example, Lazaridis et al. (2024) developed an interpretable machine learning framework for shear capacity prediction of strengthened masonry walls, demonstrating that combining ensemble learning with explainability techniques enables both high predictive accuracy and transparent identification of governing variables [6]. Similarly, Zhu et al. (2026) proposed a physics-informed neural network (PINN)-based hybrid modeling strategy for predicting creep-induced camber in prestressed concrete structures, where a physics-driven baseline is complemented by a data-driven correction model to improve prediction consistency under real-world conditions [7]. These studies highlight a clear trend toward hybrid modeling paradigms that combine data-driven learning with domain knowledge and interpretability. However, most existing approaches either focus primarily on post hoc explainability or rely on predefined hybrid structures with limited adaptability. In particular, the integration of multiple learning paradigms is often performed using static or globally optimized combinations, which restricts the ability of models to adapt to heterogeneous input characteristics. This limitation motivates the need for more flexible and structurally consistent learning frameworks that can simultaneously incorporate interpretability, physics-informed representation, and input-dependent model integration.
From a broader methodological perspective, an additional limitation arises from the way input variables are represented. In concrete mixtures, many parameters exhibit implicit structural relationships, particularly through ratio-based and scale-invariant interactions such as water–cement and water–binder ratios. These relationships reflect underlying physicochemical principles and suggest that predictive models should preserve invariance under proportional transformations of input variables. However, most existing machine learning approaches treat features independently and do not explicitly account for such structural properties, potentially leading to unstable or physically inconsistent representations.
Another challenge concerns the integration of different modeling paradigms. Attention-based deep learning models designed for tabular data can capture complex feature interactions, yet their performance is often sensitive to dataset size and structure. In contrast, ensemble-based approaches such as gradient boosting provide strong generalization but may lack expressive representation learning capabilities. These observations indicate that a more principled integration of complementary models, rather than simple stacking strategies, is required to achieve both robustness and reliability.
To address these challenges, this study introduces an Adaptive Symmetry-Aware Physics-Informed Hybrid (ASAPH) learning framework for predicting concrete compressive strength. The framework is designed around three complementary principles: preserving structural relationships among input variables, ensuring robustness through stability-aware feature selection, and enabling flexible model integration through adaptive learning. In contrast to conventional hybrid approaches that rely on static combinations, the proposed framework employs an input-dependent ensemble mechanism that dynamically adjusts the contribution of individual models based on the characteristics of each sample.
Within this framework, symmetry-consistent feature representations are incorporated to preserve invariant relationships among mixture parameters, while a stability-driven feature selection strategy is used to balance relevance and redundancy. The predictive model combines an attention-based tabular learner (TabNet) with a tree-based ensemble model (XGBoost), and integrates their outputs through an adaptive fusion mechanism. In addition, explainable artificial intelligence techniques are utilized to interpret model behavior and ensure consistency with established engineering knowledge.
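To make the notion of input-dependent fusion concrete, the following is a minimal sketch and not the implementation used in this study. It assumes a sigmoid gate over a linear function of the (standardized) input; the function names, the gate parameters `coef` and `bias`, and the linear gate form are all hypothetical, and the two base predictions merely stand in for the outputs of the TabNet and XGBoost learners.

```python
import math
from typing import Sequence


def gate_weight(x: Sequence[float], coef: Sequence[float], bias: float) -> float:
    """Hypothetical gate: sigmoid of a linear score, giving a weight in (0, 1)."""
    z = bias + sum(c * v for c, v in zip(coef, x))
    return 1.0 / (1.0 + math.exp(-z))


def adaptive_fusion(
    x: Sequence[float],
    pred_a: float,
    pred_b: float,
    coef: Sequence[float],
    bias: float,
) -> float:
    """Blend two base predictions with a weight that depends on the sample x."""
    w = gate_weight(x, coef, bias)
    return w * pred_a + (1.0 - w) * pred_b
```

In contrast to a static ensemble, where the blending weight is a single constant, the weight here changes from sample to sample, so one learner can dominate in some regions of the feature space and the other elsewhere.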
The main contributions of this study can be summarized as follows:
A symmetry-aware feature representation strategy is introduced to preserve invariant relationships among concrete mixture parameters.
A stability-driven feature selection mechanism is developed with a relevance–redundancy formulation to enhance robustness and generalization.
An adaptive input-dependent ensemble learning framework is established to dynamically integrate complementary predictive models.
A physics-informed modeling perspective is incorporated to improve interpretability and ensure consistency with domain knowledge.
Comprehensive experimental analysis demonstrates improved predictive accuracy, robustness, and explainability compared to conventional machine learning and deep tabular models.
Unlike prior studies that typically focus on performance, interpretability, or feature engineering in isolation, this work presents a unified and principled framework that integrates these aspects within a single modeling pipeline. The results suggest that embedding structural properties and adaptive learning mechanisms into data-driven models provides a more reliable and interpretable approach for complex engineering systems.
The remainder of the paper is organized as follows. Section 2 reviews related work on machine learning–based prediction of concrete compressive strength. Section 3 presents the dataset and the proposed methodology. Section 4 describes the experimental setup and evaluation strategy. Section 5 discusses the results and interpretability analysis. Finally, Section 6 concludes the study and outlines future research directions.
2. Related Work
The prediction of concrete compressive strength using computational approaches has received significant attention over the past decades. Due to the complex interactions among mixture components and curing conditions, traditional empirical models are often insufficient for accurately capturing the nonlinear relationships governing concrete strength. As a result, machine learning techniques have been widely explored as effective alternatives for modeling such complex engineering systems.
One of the earliest and most influential studies in this domain was conducted by Yeh [8], who applied artificial neural networks to predict compressive strength based on mixture proportions. The results demonstrated that neural network models could successfully capture nonlinear relationships between input variables and compressive strength, establishing a strong foundation for subsequent research in data-driven concrete modeling.
Following this pioneering work, various machine learning methods have been investigated to improve prediction accuracy. Omran et al. [9] evaluated multiple data mining techniques, including multilayer perceptron, support vector machines, Gaussian process regression, and decision tree–based models, reporting that Gaussian process regression achieved superior predictive performance. Similarly, Kiani et al. [10] explored artificial intelligence techniques for predicting the compressive strength of foam cellular concrete and demonstrated that machine learning approaches outperform conventional empirical methods.
Subsequent studies focused on enhancing predictive performance through more advanced learning algorithms. Ashrafian et al. [11] investigated several regression-based models, including multivariate adaptive regression splines, least squares support vector machines, and multilayer perceptron networks, and reported that multivariate adaptive regression splines achieved the highest prediction accuracy. Naderpour et al. [12] further confirmed the effectiveness of artificial neural networks in estimating compressive strength, achieving low prediction errors in their experimental analysis.
In recent years, ensemble learning methods and boosting-based approaches have gained increasing attention due to their strong predictive capabilities. Yaseen et al. [13] demonstrated that ensemble and hybrid learning strategies significantly improve prediction accuracy by combining the strengths of multiple models. Similarly, Ziolkowski [14] investigated the impact of optimization algorithms and computational complexity on machine learning models, showing that both model architecture and training strategy significantly influence predictive performance, with Quasi-Newton-based models outperforming other optimization methods.
More recent studies have consistently highlighted the effectiveness of tree-based ensemble models, particularly XGBoost, in concrete strength prediction tasks. Shaaban et al. [15] reported that XGBoost achieved superior performance (R² ≈ 0.94) in predicting high-strength concrete, demonstrating its capability to capture complex nonlinear relationships in structured datasets. Similarly, Miao et al. [16] showed that XGBoost outperformed other models in predicting recycled aggregate self-compacting concrete strength while also employing SHAP analysis to interpret model predictions, identifying cement content and curing age as dominant factors. Recent studies in related domains have also highlighted the importance of capturing complex material behavior through data-driven modeling approaches. For example, Xue and Wu [17] investigated the evolution of brittleness in hot dry rock under cyclic thermal treatment and demonstrated that nonlinear interactions between physical properties can be effectively characterized through data-driven indices. Similarly, Cai et al. [18] analyzed crack propagation dynamics in shale under cyclic detonation loading, showing that fracture behavior emerges from complex and heterogeneous interactions influenced by multiple physical factors. These findings further support the necessity of adopting hybrid and adaptive modeling strategies capable of capturing diverse structural patterns in complex material systems.
The integration of deep learning and optimization techniques has further advanced predictive modeling in this domain. Yu et al. [19] proposed a chemistry-informed convolutional neural network optimized using an enhanced bat algorithm, achieving high predictive accuracy (R² up to 0.978) for geopolymer concrete. These results demonstrate the potential of combining domain knowledge with deep learning architectures to improve both prediction accuracy and model reliability.
Comparative studies have also examined the performance of various machine learning models across different concrete types. Nafiuzzaman et al. [20] reported that artificial neural networks and support vector regression achieved high predictive accuracy (R² ≈ 0.97), while highlighting variability across models and suggesting the need for more robust ensemble approaches. Similarly, Anwar et al. [21] found that artificial neural networks and boosting-based models outperform traditional methods in fly ash-based geopolymer concrete prediction tasks, further supporting the effectiveness of ensemble learning strategies.
Several studies have extended machine learning approaches to specialized and sustainable concrete systems. Javid et al. [22] investigated fiber-reinforced concrete under varying temperature conditions and demonstrated that boosting-based models, particularly CatBoost, achieve high predictive performance. Fakharian et al. [23] explored green concrete incorporating recycled materials, showing that machine learning models effectively capture relationships between mixture composition and strength even in limited datasets. Likewise, Sathiparan [24] highlighted the potential of agricultural waste materials such as corncob ash in concrete, where XGBoost achieved superior performance and SHAP analysis provided insights into feature contributions.
The role of environmental and operational factors has also been investigated in recent studies. Fathy et al. [25] analyzed the effect of elevated temperatures on concrete containing waste powders, demonstrating that XGBoost achieved near-perfect predictive performance while explainability techniques revealed the influence of temperature and material composition. Tuvayanond et al. [26] evaluated machine learning models for ready-mix concrete production and showed that XGBoost provides a favorable balance between predictive accuracy and computational efficiency, making it suitable for practical industrial applications.
In addition to improving predictive performance, recent research has increasingly emphasized model interpretability. Nikoopayan et al. [27] applied explainable artificial intelligence techniques, including SHAP and partial dependence plots, to identify key influencing factors such as water–cement ratio and curing age. Similarly, Lan [28] proposed integrating optimization techniques with machine learning models to enhance both predictive accuracy and interpretability, addressing the limitations of conventional black-box models. In addition to these studies, Sah et al. [29] conducted a comparative analysis of multiple machine learning models, including artificial neural networks, multiple linear regression, support vector machines, and regression trees for predicting concrete compressive strength. Their findings indicated that artificial neural networks achieved superior predictive performance compared to other models, demonstrating their capability in capturing complex nonlinear relationships among mixture parameters. However, the reliance on single predictive models limits their ability to fully represent diverse feature interactions, highlighting the need for more robust ensemble and hybrid learning approaches.
Furthermore, Altuncı [30] presented a comprehensive scientometric analysis of machine learning-based studies on concrete compressive strength prediction, examining more than two thousand publications. The study identified major research trends, dominant modeling techniques, and key limitations in the literature. The findings emphasize the increasing adoption of data-driven approaches while also highlighting the need for more reliable, interpretable, and generalizable predictive frameworks. These observations reveal significant research gaps in integrating interpretability, robustness, and hybrid modeling strategies within a unified framework.
Overall, the existing body of literature demonstrates a clear transition from traditional single-model approaches toward ensemble-based, optimization-driven, and deep learning methods for concrete compressive strength prediction. While these studies have significantly improved predictive performance, most of them address individual aspects of the problem in isolation, such as accuracy, interpretability, or model complexity. As a result, important challenges remain in achieving a balanced integration of robustness, interpretability, and structural consistency within a unified modeling framework.
A critical limitation observed across prior studies is the reliance on static model structures and fixed integration strategies. Although hybrid and ensemble methods have been explored, they are typically based on predefined or globally optimized combinations. For instance, Ni et al. [31] proposed a stacking-based fusion framework for dynamic demand forecasting, demonstrating the effectiveness of combining multiple models through predefined integration strategies. However, such combinations do not adapt to input-dependent variations, which restricts their ability to capture heterogeneous patterns across different regions of the feature space. In addition, existing approaches generally overlook the implicit structural relationships among input variables, particularly ratio-based and scale-invariant interactions that reflect underlying physicochemical processes in concrete mixtures.
Another important gap lies in the treatment of feature selection. Most conventional approaches focus primarily on feature relevance, while neglecting redundancy and stability. This may lead to models that are sensitive to data variability and prone to overfitting, especially when dealing with relatively small and structured engineering datasets. Furthermore, although explainable artificial intelligence techniques have been increasingly adopted, they are often applied as a post hoc analysis rather than being integrated into the modeling framework itself.
In contrast to these approaches, this study introduces a unified learning framework that explicitly addresses these limitations through the integration of three complementary principles: symmetry-aware feature representation, stability-driven feature selection, and adaptive model integration. The framework departs from conventional static hybrid models by employing an input-dependent ensemble mechanism, allowing the model to dynamically adjust the contribution of different learners based on the characteristics of each sample.
Unlike conventional stacking-based hybrid models that rely on fixed or globally optimized combinations, the proposed framework introduces an input-dependent adaptive fusion mechanism, enabling instance-level model integration. This design allows the model to dynamically adjust its predictive behavior across different regions of the feature space, which constitutes a fundamental departure from static ensemble strategies.
By incorporating symmetry-consistent representations, the framework preserves invariant relationships among mixture parameters, ensuring physically meaningful feature interactions. The stability-driven feature selection mechanism provides a more robust feature space by balancing relevance and redundancy, while the adaptive learning strategy enables flexible and context-aware model integration. These components are not treated as independent modules but are systematically combined within a single coherent framework.
Therefore, the primary contribution of this work lies not in the combination of existing techniques, but in the formulation of a unified and adaptive learning framework that integrates structural consistency, stability-driven learning, and interpretability within a coherent modeling paradigm. Accordingly, the novelty of the study should be understood at the framework level, as it establishes a principled integration scheme that unifies representation, feature selection, and adaptive model interaction within a single learning architecture. This perspective provides a more general and transferable framework for modeling complex engineering systems beyond the specific application of concrete compressive strength prediction.
3. Materials and Methods
3.1. Dataset
The performance of data-driven models in engineering applications critically depends on the representativeness and structural richness of the underlying dataset. In the context of concrete compressive strength prediction, experimental datasets describing mixture compositions and their corresponding mechanical properties provide the basis for modeling complex material behavior.
In this study, the widely used concrete compressive strength dataset obtained from the UCI Machine Learning Repository was employed [8,32]. Originally introduced by Yeh [8], this dataset has been extensively adopted as a benchmark for evaluating predictive models in concrete technology. Its continued use in the literature enables consistent comparison across different modeling approaches while providing a realistic representation of practical mixture design scenarios.
The dataset consists of 1030 samples, each corresponding to a distinct concrete mixture. For each sample, eight input variables are provided: cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and curing age. These variables collectively characterize both the composition and temporal evolution of concrete. The target variable represents the compressive strength obtained from standardized laboratory tests.
From a modeling perspective, compressive strength can be interpreted as a latent response variable that emerges from the interaction of mixture parameters and curing conditions. Since it cannot be measured in real time without destructive testing, its estimation naturally aligns with a soft-sensing formulation, where observable inputs are used to infer an underlying material property.
An important characteristic of this dataset is the presence of implicit structural relationships among input variables. Many mixture parameters are not independent but are linked through proportional and ratio-based interactions, reflecting underlying physicochemical constraints. These properties suggest that preserving such relationships during feature representation is essential for achieving physically consistent and stable predictions.
To characterize the distribution of the data, descriptive statistics, including the mean, standard deviation, minimum, and maximum, were computed for each variable. The results reveal substantial variability and scale differences across mixture components, highlighting the heterogeneous nature of concrete compositions. These statistics support the design of the subsequent preprocessing and modeling steps and further justify the robust feature representation and adaptive modeling strategies employed in this study.
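The descriptive analysis can be sketched in a few lines. The example below is only illustrative: the helper name `describe` is hypothetical, the sample (n − 1) standard deviation is an assumption, and the numbers are made-up placeholders rather than rows from the UCI dataset.

```python
import statistics
from typing import Dict, List


def describe(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Return mean, sample standard deviation, minimum, and maximum per column."""
    summary = {}
    for name, values in columns.items():
        summary[name] = {
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),  # sample (n - 1) standard deviation
            "min": min(values),
            "max": max(values),
        }
    return summary


# Illustrative values only; these are NOT rows from the UCI dataset.
toy_columns = {
    "cement": [300.0, 350.0, 400.0],
    "water": [160.0, 180.0, 200.0],
    "age": [7.0, 28.0, 90.0],
}
stats = describe(toy_columns)
```

Even in this toy example the variables differ markedly in scale, which is the property that motivates feature scaling before model training.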
3.2. Feature Engineering and Data Preprocessing
The predictive capability of machine learning models is strongly influenced by the structure and expressiveness of the input feature space. In concrete mixtures, input variables are governed by underlying physicochemical processes and therefore exhibit interdependencies that cannot be fully captured using raw features alone. Treating these variables independently may lead to suboptimal representations and reduced model robustness.
To address this issue, a physics-informed and symmetry-aware feature engineering strategy was adopted. In addition to the original variables, a set of derived features was constructed to explicitly encode meaningful relationships among mixture components. In particular, ratio-based features such as the water–cement ratio and water–binder ratio were incorporated due to their well-established influence on strength development [33]. These ratios represent scale-invariant relationships and play a central role in determining the hydration process and resulting mechanical properties.
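As a brief illustration of these ratio-based features, the sketch below computes both ratios for a single mixture. It rests on assumptions that are not stated in the text: the binder is taken as the sum of cement, blast furnace slag, and fly ash, the dictionary keys are hypothetical column names, and `EPS` is a hypothetical guard against division by zero.

```python
from typing import Dict

EPS = 1e-8  # hypothetical small constant to avoid division by zero


def ratio_features(mix: Dict[str, float]) -> Dict[str, float]:
    """Compute water-cement and water-binder ratios for one mixture."""
    # Assumption: binder = cement + blast furnace slag + fly ash.
    binder = mix["cement"] + mix["slag"] + mix["fly_ash"]
    return {
        "water_cement": mix["water"] / (mix["cement"] + EPS),
        "water_binder": mix["water"] / (binder + EPS),
    }
```

Doubling every constituent of a mixture leaves both ratios essentially unchanged, which is the scale-invariance property referred to above.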
Furthermore, normalized representations of mixture components were introduced to reflect relative proportions within the mixture.
Specifically, each mixture component x_i was transformed into a normalized proportion using (Equation (1)):
\[ \tilde{x}_i = \frac{x_i}{\sum_{j=1}^{d} x_j + \varepsilon} \qquad (1) \]
where d denotes the number of mixture components and ε is a small constant introduced to ensure numerical stability. This formulation ensures that the representation captures relative composition rather than absolute magnitude, which is more consistent with the underlying physical behavior of concrete mixtures.
This transformation enables the model to capture compositional balance rather than relying solely on absolute quantities. From a structural perspective, these representations help preserve symmetry properties in the data, ensuring that equivalent proportional changes in input variables lead to consistent model behavior. This distinction is particularly important in concrete technology, where proportional relationships among components govern strength development more directly than absolute values.
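The normalization and its invariance under proportional scaling can be checked with a short sketch. It assumes that each component is divided by the sum of all d components plus a small constant; the function name and the default value of `eps` are hypothetical.

```python
from typing import List, Sequence


def normalized_proportions(components: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Divide each mixture component by the total of all components plus eps."""
    total = sum(components)
    return [x / (total + eps) for x in components]
```

For example, a mixture and the same mixture with every quantity multiplied by a common factor map to (numerically) the same proportion vector, so the representation depends on composition rather than batch size.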
By embedding physically meaningful and symmetry-consistent relationships into the feature space, the resulting representation becomes more informative and better aligned with the underlying material behavior. This structured representation reduces the burden on learning algorithms to infer implicit relationships solely from data and facilitates more stable and interpretable predictions [34].
Following feature construction, standard preprocessing steps were applied to ensure data consistency and numerical stability. The preprocessing pipeline consists of three main steps: (i) construction of physics-informed and ratio-based features, (ii) normalization of mixture components to capture proportional relationships, and (iii) feature scaling to standardize input distributions prior to model training. The dataset was first examined for missing or inconsistent values, and no anomalies were detected. Subsequently, feature scaling was performed to normalize the input variables. Since mixture parameters are expressed in different units and ranges, normalization prevents features with larger magnitudes from dominating the learning process and improves convergence behavior, particularly for attention-based and neural network models.
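The scaling step can be illustrated as follows. The text does not specify the scaler, so this sketch assumes z-score standardization; the helper names are hypothetical and the numbers are illustrative placeholders. The statistics are estimated on training data only and then reused for new samples.

```python
import statistics
from typing import List, Tuple


def fit_standardizer(train_column: List[float]) -> Tuple[float, float]:
    """Estimate mean and (population) standard deviation from training data only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    if std == 0.0:
        std = 1.0  # constant column: avoid division by zero
    return mean, std


def standardize(column: List[float], mean: float, std: float) -> List[float]:
    """Apply z-score scaling with previously fitted statistics."""
    return [(value - mean) / std for value in column]


# Illustrative values only (not taken from the dataset).
train_water = [160.0, 180.0, 200.0]
mean, std = fit_standardizer(train_water)
scaled_train = standardize(train_water, mean, std)
scaled_new = standardize([190.0], mean, std)
```

Fitting the statistics on the training portion alone keeps information from held-out samples out of the preprocessing step.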
To obtain a reliable estimate of model performance, a k-fold cross-validation strategy was employed. In this approach, the dataset is partitioned into k subsets, and the model is iteratively trained and validated across different folds. Each subset is used once as a validation set, ensuring that all samples contribute to both training and evaluation. This procedure provides a robust assessment of generalization performance and reduces the risk of overfitting, especially in structured engineering datasets with limited sample size. The final feature set used in the model consists of the original mixture parameters, including cement, slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and curing age, as selected through the proposed stability-driven feature selection process.
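The partitioning logic of k-fold cross-validation can be sketched as below. The value of k is not stated in this section, so k = 5 in the usage lines is a hypothetical choice, as are the function name and the fixed shuffling seed.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Distribute any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical usage with the dataset size reported above and k = 5.
folds = list(k_fold_indices(1030, 5))
```

Each sample appears in exactly one validation fold and in the training portion of the remaining folds, which is the property the procedure relies on.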
3.3. Stability-Driven and Symmetry-Preserving Feature Selection
Feature selection in engineering-oriented machine learning problems extends beyond conventional dimensionality reduction, as it directly influences model robustness, generalization capability, and physical consistency. In the context of concrete compressive strength prediction, input variables are inherently interdependent due to physicochemical constraints governing hydration and mixture composition. These interdependencies introduce significant multicollinearity and redundancy, which may adversely affect model stability and predictive reliability if not properly addressed.
Traditional feature selection methods primarily focus on feature–target relevance while overlooking redundancy and structural dependencies among variables. As a consequence, the selected feature subsets may vary across different data partitions, leading to unstable model behavior and reduced generalization performance.
To address these limitations, this study proposes a stability-driven and symmetry-preserving feature selection framework, which integrates three complementary criteria: (i) predictive relevance, (ii) redundancy minimization, and (iii) structural consistency.
Let f_i denote the i-th feature and y the target variable. The predictive relevance of each feature is quantified as (Equation (2)):
\[ \mathrm{Rel}(f_i) = S(f_i, y) \qquad (2) \]
where S(·,·) represents a statistical scoring function (SelectKBest) that evaluates the dependency between f_i and y [35].
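Redundancy is quantified using the average pairwise correlation between the feature f_i and all other features (Equation (3)):
\[ \mathrm{Red}(f_i) = \frac{1}{p-1} \sum_{j \neq i} \left| \rho(f_i, f_j) \right| \qquad (3) \]
where ρ(f_i, f_j) denotes the Pearson correlation coefficient between features f_i and f_j, and p is the total number of features [36].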
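The relevance and redundancy terms can be sketched in plain Python. Two assumptions are made here that the text does not confirm: the absolute Pearson correlation with the target is used as a stand-in for the unspecified SelectKBest scoring function, and redundancy averages absolute correlations over the other p − 1 features. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def relevance(features: List[Sequence[float]], target: Sequence[float]) -> List[float]:
    """Stand-in relevance score: |correlation| between each feature and the target."""
    return [abs(pearson(f, target)) for f in features]


def redundancy(features: List[Sequence[float]]) -> List[float]:
    """Mean |correlation| of each feature with all other features."""
    p = len(features)
    scores = []
    for i in range(p):
        others = [abs(pearson(features[i], features[j])) for j in range(p) if j != i]
        scores.append(sum(others) / (p - 1))
    return scores
```

A feature that duplicates another one receives a high redundancy value even when its relevance is high, which is the situation the redundancy term is meant to penalize.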
To incorporate structural information derived from domain knowledge, a symmetry-consistency term is introduced. This term captures the invariance of feature behavior under proportional transformations of mixture components. Let T denote a transformation preserving scale-invariant relationships. The structural consistency is defined in Equation (4), where n is the number of samples and ε is a small positive constant introduced for numerical stability. In this context, the transformation T can be interpreted as a scale-preserving operation that maintains key physicochemical ratios governing concrete behavior.
To provide a clearer physical interpretation of the symmetry-consistency term, the transformation T is explicitly defined as a proportional scaling applied to key mixture components. Let x denote the input vector, and let T_k(x) represent a transformation in which the quantities of cement, water, fine aggregate, and coarse aggregate are simultaneously scaled by a constant factor k > 0, while other variables remain unchanged.
This transformation reflects a physically meaningful operation in concrete mixture design, where proportional scaling of primary constituents preserves relative composition ratios such as water–cement ratio, which are known to govern strength development. Under this transformation, a physically consistent predictive model is expected to produce outputs that remain stable, as the fundamental mixture proportions are preserved.
To quantify this behavior, the structural consistency term implicitly evaluates the deviation between model responses under the original input x and its transformed counterpart T_k(x). A small deviation indicates that the model respects scale-invariant relationships inherent in concrete mixtures. In practice, this corresponds to enforcing the condition that the change in predicted compressive strength remains bounded within a small tolerance under proportional scaling transformations.
This formulation establishes a direct connection between the symmetry constraint and domain-specific knowledge, ensuring that the feature selection process favors variables that exhibit physically meaningful invariance properties rather than purely statistical relevance. Although the formulation is primarily theoretical, the stability of model predictions under such proportional transformations was also observed empirically during experimental analysis, supporting the validity of the proposed symmetry-consistency constraint.
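The proportional scaling transformation T_k described above can be sketched as follows. The column order, the example mixture values, and the helper names (`scale_mixture`, `water_cement_ratio`, `invariance_gap`) are assumptions made for illustration only; the sketch merely shows that the water–cement ratio is preserved under T_k and how a deviation between predictions on x and T_k(x) could be measured.

```python
import numpy as np

# Hypothetical column order, chosen for this illustration.
COLUMNS = ["cement", "slag", "fly_ash", "water", "superplasticizer",
           "coarse_agg", "fine_agg", "age"]
SCALED = ["cement", "water", "fine_agg", "coarse_agg"]

def scale_mixture(x, k):
    """Apply T_k: multiply cement, water, fine and coarse aggregate by k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    out = np.array(x, dtype=float)  # copy, so the input is left unchanged
    idx = [COLUMNS.index(name) for name in SCALED]
    out[..., idx] *= k
    return out

def water_cement_ratio(x):
    x = np.asarray(x, dtype=float)
    return x[..., COLUMNS.index("water")] / x[..., COLUMNS.index("cement")]

def invariance_gap(predict, X, k):
    """Mean absolute change in predictions when the inputs are mapped through T_k."""
    return float(np.mean(np.abs(predict(X) - predict(scale_mixture(X, k)))))

# Illustrative mixture: the ratio is identical before and after scaling.
x = [540.0, 0.0, 0.0, 162.0, 2.5, 1040.0, 676.0, 28.0]
print(water_cement_ratio(x), water_cement_ratio(scale_mixture(x, 1.1)))
```

A model that depends on the inputs only through preserved ratios yields an invariance gap close to zero, whereas a model driven by absolute quantities does not.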
Based on these components, the final feature selection score is formulated as (Equation (5)):
From a theoretical standpoint, this formulation can be interpreted as a regularized optimization criterion in which predictive relevance and structural consistency act as reward terms, whereas redundancy serves as a penalty term. This interpretation provides a principled justification for the feature selection mechanism beyond heuristic design. Unlike conventional approaches, the proposed method explicitly incorporates invariance properties derived from the underlying physical system.
As a result, the selected feature subset is not only informative but also physically meaningful and stable across different data partitions. To provide empirical evidence of the proposed formulation, the feature selection scores derived from Equations (2)–(5) were computed for all candidate features. Based on these scores, features were ranked and the top-k subset was retained for model training, so that the proposed formulation directly determines the final feature space used by the predictive models. The resulting feature subset includes both original mixture parameters and key derived ratio-based features, reflecting the balance enforced by the proposed criterion. In particular, curing age, cement content, and water-related ratios consistently achieved high feature selection scores, indicating their strong predictive relevance and structural importance. These findings are aligned with the SHAP-based importance analysis presented in Section 5, confirming the consistency between the proposed feature selection mechanism and the learned model behavior, both quantitatively and qualitatively. Overall, this integration establishes a clear connection between the theoretical formulation and its empirical impact, leading to improved robustness, enhanced generalization capability, and better alignment with the physicochemical behavior governing concrete strength development.
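The ranking step can be sketched as below. The weighting of the three terms is an assumption: the coefficients `alpha` and `beta`, the function names, and the numerical scores are hypothetical values introduced for illustration and are not taken from this study.

```python
import numpy as np

def selection_scores(relevance, redundancy, consistency, alpha=1.0, beta=1.0):
    """Reward relevance and consistency, penalize redundancy.

    alpha and beta are assumed trade-off weights, not values from the paper.
    """
    relevance = np.asarray(relevance, dtype=float)
    redundancy = np.asarray(redundancy, dtype=float)
    consistency = np.asarray(consistency, dtype=float)
    return relevance - alpha * redundancy + beta * consistency

def top_k(scores, names, k):
    """Return the names of the k highest-scoring features, best first."""
    order = np.argsort(scores)[::-1][:k]
    return [names[i] for i in order]

# Made-up scores for four candidate features.
names = ["age", "cement", "w_c_ratio", "fly_ash"]
scores = selection_scores([0.9, 0.8, 0.85, 0.3],
                          [0.1, 0.4, 0.2, 0.5],
                          [0.5, 0.2, 0.9, 0.1])
print(top_k(scores, names, 2))
```

In such a scheme the three component scores would typically be brought to a common scale before being combined; that normalization is omitted here for brevity.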
3.4. Proposed Adaptive Symmetry-Aware Physics-Informed Hybrid (ASAPH) Learning Framework
Machine learning models often exhibit performance variability depending on dataset characteristics and the complexity of the underlying relationships. In engineering systems such as concrete mixtures, interactions among input variables are highly nonlinear and heterogeneous across the feature space. Consequently, relying on a single predictive model may be insufficient to capture all relevant patterns.
To address this challenge, this study proposes an adaptive symmetry-aware hybrid learning framework, which dynamically integrates complementary predictive models through an input-dependent fusion mechanism. The proposed framework consists of three main stages: (i) physics-aware feature representation, (ii) stability-driven and symmetry-preserving feature selection, and (iii) adaptive hybrid prediction. The overall architecture is illustrated in
Figure 1.
In the final stage, two complementary base learners are employed: an attention-based tabular model (TabNet) and a tree-based ensemble model (XGBoost). TabNet captures complex nonlinear feature interactions through sequential attention mechanisms, enabling sparse and interpretable feature selection tailored for tabular data [
37,
38], while XGBoost provides a highly efficient and regularized gradient boosting framework that has become a standard benchmark for structured data modeling due to its scalability and strong generalization performance [
39,
40]. These models exhibit complementary strengths, motivating their integration within a unified framework.
Unlike conventional stacking approaches, where model contributions are fixed, the proposed framework introduces an input-dependent adaptive fusion mechanism. The final prediction is computed as (Equation (6)):
$$\hat{y}(x) = w_t(x)\, f_t(x) + w_x(x)\, f_x(x) \tag{6}$$
subject to the constraint (Equation (7)):
$$w_t(x) + w_x(x) = 1 \tag{7}$$
where f_t(x) and f_x(x) denote the predictions of TabNet and XGBoost, respectively, and w_t(x), w_x(x) are input-dependent weights. Accordingly, the connections shown in Figure 2 represent the weighted contribution of each base learner to the fusion layer, where TabNet contributes through w_t(x)·f_t(x) and XGBoost contributes through w_x(x)·f_x(x).
The weights are defined using a learnable gating function (Equation (8)):
$$w_t(x) = \sigma(h(x)), \qquad w_x(x) = 1 - \sigma(h(x)) \tag{8}$$
where h(x) is a learnable parametric function that maps the input features to a scalar score, and σ denotes the sigmoid activation function, which transforms this score into a bounded weight in the range (0, 1). Specifically, h(x) is implemented as a lightweight feed-forward mapping that takes the embedded feature representation as input and produces a scalar gating value. Its parameters are optimized jointly with the overall training objective in an end-to-end manner, using the same loss function defined for the predictive model: during training, the gradients from the prediction loss are backpropagated through the adaptive fusion layer to update both the base learners and the parameters of h(x). The gating network is therefore trained simultaneously with the hybrid prediction objective rather than being calibrated through a separate post hoc procedure. Because the weights are produced by a separate parametric mapping, the formulation avoids circular dependence in the controller representation.
In this sense, h(x) learns a soft partitioning of the input space, enabling context-aware model selection: the relative contribution of TabNet and XGBoost is adjusted according to the structural characteristics of each input sample, so that prediction error is reduced across different regions of the feature space. This instance-level weighting distinguishes the framework from conventional stacking and static ensemble methods, where model contributions are fixed or globally optimized. It also removes the need for manually predefined weighting schemes and keeps the fusion process fully data-driven.
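A minimal sketch of the input-dependent fusion is given below. The gating function h(x) is modeled here as a single linear layer, which is an assumption made for brevity; the parameter names `gate_w` and `gate_b` and the example values are hypothetical, and the base-learner predictions are passed in as plain arrays.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_fusion(X, pred_tabnet, pred_xgb, gate_w, gate_b):
    """Blend two prediction vectors with per-sample weights w_t(x) = sigmoid(h(x))."""
    h = X @ gate_w + gate_b   # assumed linear gate h(x); one scalar per sample
    w_t = sigmoid(h)          # weight of the TabNet prediction, in (0, 1)
    w_x = 1.0 - w_t           # weight of the XGBoost prediction; weights sum to 1
    return w_t * pred_tabnet + w_x * pred_xgb, w_t

# Three samples with different gate scores receive different blends.
X = np.array([[0.0, 0.0], [10.0, 0.0], [-10.0, 0.0]])
y_hat, w_t = adaptive_fusion(X, np.full(3, 30.0), np.full(3, 40.0),
                             gate_w=np.array([1.0, 0.0]), gate_b=0.0)
print(y_hat, w_t)
```

Because the two weights are non-negative and sum to one, each fused prediction lies between the two base predictions for that sample.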
To further enforce structural consistency, a symmetry-based regularization term is introduced (Equation (9)):
where T(·) represents a symmetry-preserving transformation of the input. The final training objective is defined as (Equation (10)):
$$\mathcal{L} = \mathcal{L}_{\mathrm{pred}} + \lambda\, \mathcal{L}_{\mathrm{sym}} \tag{10}$$
where L_pred is the prediction loss, L_sym is the symmetry-based regularization term of Equation (9), and λ is a regularization coefficient controlling the influence of the symmetry constraint.
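The composite objective can be sketched as follows. The exact form of the symmetry term is not reproduced here, so the mean squared deviation between predictions on x and T(x) is an assumption, as are the use of mean squared error for the prediction loss and all function names.

```python
import numpy as np

def prediction_loss(y_true, y_pred):
    """Mean squared error (assumed choice of prediction loss)."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def symmetry_loss(predict, X, transform):
    """Mean squared change in predictions under the transformation (assumed form)."""
    return float(np.mean((predict(X) - predict(transform(X))) ** 2))

def total_loss(y_true, predict, X, transform, lam):
    """Prediction loss plus lam times the symmetry penalty."""
    return prediction_loss(y_true, predict(X)) + lam * symmetry_loss(predict, X, transform)

# Toy inputs with columns [cement, water]; the transformation doubles both.
X = np.array([[300.0, 150.0], [400.0, 160.0]])
double = lambda X: 2.0 * X
ratio_model = lambda X: 100.0 * (1.0 - X[:, 1] / X[:, 0])  # depends on w/c only
print(symmetry_loss(ratio_model, X, double))
```

A model that depends only on the water–cement ratio incurs no symmetry penalty under this transformation, whereas a model driven by absolute quantities is penalized in proportion to λ.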
4. Experimental Setup
This section presents a rigorous experimental framework designed to evaluate the effectiveness, robustness, and generalization capability of the proposed symmetry-aware hybrid learning framework. The experimental design is structured not only to assess predictive accuracy but also to provide statistically grounded evidence of model reliability and comparative superiority.
The evaluation protocol includes dataset preparation, training strategy, benchmarking against state-of-the-art models, statistical validation, and implementation details. Particular attention is given to ensuring fairness, reproducibility, and methodological consistency across all experiments.
4.1. Experimental Design
The experimental analysis was conducted using the widely adopted concrete compressive strength dataset obtained from the UCI Machine Learning Repository, originally introduced by Yeh [
8]. The dataset contains 1030 samples, each representing a distinct concrete mixture defined by eight input variables and a corresponding compressive strength value.
Prior to model training, all input variables were normalized to eliminate scale inconsistencies across features. This step is particularly critical for attention-based and neural architectures, where gradient stability and convergence behavior are highly sensitive to feature scaling. To ensure fairness, the same preprocessing pipeline was consistently applied to all models.
To provide a robust estimate of generalization performance, a 10-fold cross-validation strategy was employed. The dataset was partitioned into ten mutually exclusive subsets; in each iteration, nine subsets were used for training and one for testing. This process was repeated across all folds, and the final results were obtained by averaging the performance metrics over the folds, which reduces the variance induced by random data splits. To avoid potential information leakage in the adaptive fusion mechanism, a two-stage training strategy was adopted within each fold: the base learners (TabNet and XGBoost) were trained on the training subset, while the adaptive weighting function was trained on a separate validation split derived from the same fold. This separation ensures that the weight learning process does not directly access the training targets used for fitting the base models, thereby reducing overfitting risk and improving generalization.
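The fold structure described above can be sketched with index arrays. The fraction of each training portion reserved for the weighting function (`gate_fraction`) is not stated in the text and is an assumed value; the function name and the seed are likewise hypothetical.

```python
import numpy as np

def two_stage_folds(n_samples, n_folds=10, gate_fraction=0.2, seed=0):
    """Yield (base_idx, gate_idx, test_idx) index arrays for each fold.

    base_idx: samples for fitting the base learners
    gate_idx: held-out samples for fitting the weighting function
    test_idx: samples for final evaluation of the fold
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        n_gate = int(round(gate_fraction * len(train_idx)))  # assumed split size
        yield train_idx[n_gate:], train_idx[:n_gate], test_idx

for base_idx, gate_idx, test_idx in two_stage_folds(1030):
    print(len(base_idx), len(gate_idx), len(test_idx))
```

The three index sets are disjoint within each fold, and every sample appears in exactly one test fold.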
To rigorously assess the effectiveness of the proposed framework, a diverse set of baseline models was selected, representing both classical and modern approaches for tabular learning: XGBoost, TabNet, TabTransformer, and FT-Transformer.
These models were chosen to cover complementary modeling paradigms, including tree-based ensemble methods and deep learning architectures specifically designed for tabular data. This selection enables a comprehensive and unbiased comparison.
In addition to standalone models, the proposed hybrid framework was evaluated to quantify the contribution of its key components, namely stability-driven feature selection, symmetry-preserving representation, and adaptive model integration. By comparing the hybrid model against its individual base learners and advanced baselines, the experimental design isolates the effect of each modeling principle and provides a clear assessment of its added value.
All models were trained under identical experimental conditions, with hyperparameters determined through preliminary tuning guided by commonly adopted practices in the literature. A fixed random seed was used to ensure reproducibility.
Beyond average performance evaluation, statistical analyses were conducted to assess model stability and the significance of observed improvements. Standard deviation values across cross-validation folds were computed to quantify performance variability.
Furthermore, the Wilcoxon signed-rank test was employed to evaluate whether the performance differences between the proposed framework and baseline models were statistically significant. This non-parametric test is particularly suitable for paired comparisons derived from cross-validation folds. A significance level of 0.05 was adopted.
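For paired fold-level scores, the test can be sketched in pure Python as below. This is an illustrative exact two-sided version for small samples, written for transparency rather than efficiency; in practice a library routine such as `scipy.stats.wilcoxon` would normally be used, and the function name here is hypothetical.

```python
from itertools import product

def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Returns (W_plus, p_value), where W_plus is the rank sum of positive differences.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis every sign pattern is equally likely.
    null = [sum(r for r, s in zip(ranks, signs) if s)
            for signs in product((0, 1), repeat=n)]
    p_low = sum(w <= w_plus for w in null) / len(null)
    p_high = sum(w >= w_plus for w in null) / len(null)
    return w_plus, min(1.0, 2 * min(p_low, p_high))
```

With ten folds, the smallest attainable two-sided p-value is 2/1024, which is reached only when one model is better on every fold.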
The statistical analysis confirms that the proposed framework consistently outperforms baseline models, with performance improvements over XGBoost and TabNet found to be statistically significant (p < 0.05). This indicates that the observed gains are systematic rather than due to random variation.
4.2. Evaluation Metrics
To comprehensively evaluate predictive performance, four widely accepted regression metrics were employed: the coefficient of determination (R²), mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE).
The coefficient of determination (R²) quantifies the proportion of variance in the target variable explained by the model, providing an overall measure of goodness-of-fit. Mean squared error (MSE) evaluates the average squared deviation between predicted and actual values, placing greater emphasis on larger errors. Root mean squared error (RMSE), as the square root of MSE, offers an interpretable error measure in the same unit as the target variable. Mean absolute error (MAE) captures the average magnitude of prediction errors and is less sensitive to outliers.
The combined use of these metrics provides a balanced evaluation of model performance, capturing both global accuracy and error distribution characteristics. All reported results correspond to the mean values obtained across the cross-validation folds.
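The four metrics follow directly from their standard definitions, as in the short NumPy sketch below. The function name and the example values are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE and MAE for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": float(np.mean(np.abs(err))),
    }

print(regression_metrics([20, 30, 40, 50], [22, 28, 41, 49]))
```

In a cross-validation setting, this function would be applied to each test fold and the resulting values averaged.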
4.3. Implementation Details
All models were implemented in a Python-based computational environment using established, widely adopted open-source machine learning and deep learning libraries to ensure reproducibility and methodological transparency. Specifically, Python 3.11 was used as the primary programming language, with numerical computations handled through NumPy [41] and machine learning utilities supported by Scikit-learn [42]. XGBoost was employed through its official gradient boosting library [40], while the deep learning models, including TabNet, TabTransformer, and FT-Transformer, were implemented using PyTorch-based frameworks (PyTorch version 2.2.2) [43].
The proposed framework consists of two complementary base learners and an adaptive integration mechanism:
TabNet (Base Learner): TabNet was employed as an attention-based model for tabular data [37]. It operates through a sequence of decision steps, each consisting of feature transformer and attentive transformer blocks, which enables sequential feature selection through sparse attentive masks and allows the model to capture complex nonlinear interactions while maintaining interpretability. In this study, the number of decision steps was set to 5, allowing the model to iteratively refine feature representations while maintaining computational efficiency. The feature dimension (n_d) and attentive dimension (n_a) were both set to 32, providing a balanced representation capacity without over-parameterization. The relaxation parameter (γ) was fixed at 1.5 to control feature reuse across decision steps, and the sparsity coefficient (λ_sparse) was set to 1e−4 to encourage sparse feature masks while avoiding excessive regularization. The model was trained using the Adam optimizer with an initial learning rate of 0.01 and a batch size of 128, and early stopping with a patience of 20 epochs was applied based on validation loss to prevent overfitting. These hyperparameter values were determined through a controlled randomized search procedure, ensuring fair comparison with baseline models while maintaining model stability across cross-validation folds.
XGBoost (Base Learner): XGBoost was utilized as a tree-based ensemble learner due to its strong performance on structured data [
39]. The final selected configuration for XGBoost corresponds to a balanced trade-off between model complexity and generalization performance. The model was configured with the following hyperparameters: number of estimators = 300, maximum depth = 4, learning rate = 0.05, subsample ratio = 0.9, and column sampling ratio = 0.9.
Adaptive Integration Strategy: Unlike conventional stacking approaches that rely on fixed model combinations, the proposed framework incorporates an adaptive integration mechanism. The predictions of TabNet and XGBoost are dynamically combined through a meta-learning function, enabling input-dependent weighting of model contributions. This design allows the framework to adjust to varying data characteristics and improves predictive flexibility.
Baseline Models: To ensure a comprehensive comparison, the following models were implemented:
XGBoost (standalone);
TabNet (standalone);
TabTransformer;
FT-Transformer.
These models represent state-of-the-art approaches for tabular learning and provide a strong benchmark for evaluation.
To ensure experimental reproducibility, all models were trained using a fixed random seed. Hyperparameter configurations were selected through controlled tuning, and all experiments were conducted under identical preprocessing and evaluation conditions. Reported results correspond to the average performance across the 10-fold cross-validation procedure.
To ensure a fair and unbiased comparison across all models, a systematic hyperparameter tuning procedure was employed. Specifically, a randomized search strategy was adopted for all models, with a fixed tuning budget to prevent uneven optimization advantages. For XGBoost, the hyperparameter search space included the number of estimators (100–500), maximum tree depth (3–8), learning rate (0.01–0.1), subsample ratio (0.7–1.0), and column sampling ratio (0.7–1.0). For TabNet, key hyperparameters such as the number of decision steps (3–7), feature dimension (16–64), batch size (64–256), and learning rate (0.001–0.02) were explored. All models were tuned using the same number of search iterations, and performance was evaluated using the validation subset within each cross-validation fold based on mean squared error. This consistent tuning strategy ensures that performance comparisons are not biased by differences in optimization effort. The final hyperparameter configurations were selected based on the best average validation performance across folds. The selected parameters for XGBoost and TabNet are reported in
Section 4.3. The validation errors corresponding to the selected configurations were consistent across folds, indicating stable hyperparameter selection.
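The search spaces listed above can be expressed as a small sampling routine. This is a sketch under stated assumptions: values are drawn uniformly within each range, the number of iterations is arbitrary, and `sample_configs`, `best_config`, and the `validation_mse` callable are hypothetical names. The dictionary keys are labels chosen to resemble common library parameter names.

```python
import random

# Ranges as reported in the text: (type, lower bound, upper bound).
XGB_SPACE = {
    "n_estimators": ("int", 100, 500),
    "max_depth": ("int", 3, 8),
    "learning_rate": ("float", 0.01, 0.1),
    "subsample": ("float", 0.7, 1.0),
    "colsample_bytree": ("float", 0.7, 1.0),
}
TABNET_SPACE = {
    "n_steps": ("int", 3, 7),
    "n_d": ("int", 16, 64),
    "batch_size": ("int", 64, 256),
    "learning_rate": ("float", 0.001, 0.02),
}

def sample_configs(space, n_iter, seed=0):
    """Draw n_iter configurations uniformly from the given search space."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    configs = []
    for _ in range(n_iter):
        cfg = {}
        for name, (kind, low, high) in space.items():
            cfg[name] = rng.randint(low, high) if kind == "int" else rng.uniform(low, high)
        configs.append(cfg)
    return configs

def best_config(configs, validation_mse):
    """Select the configuration with the lowest validation MSE."""
    return min(configs, key=validation_mse)
```

Using the same `n_iter` for every model corresponds to the fixed tuning budget described above; `validation_mse` would wrap model training and evaluation on the validation subset of each fold.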
5. Results and Discussion
This section presents a comprehensive evaluation of the proposed symmetry-aware hybrid learning framework, focusing on predictive performance, robustness, and interpretability. Beyond reporting quantitative results, the analysis aims to explain why the proposed framework outperforms existing approaches and how its design contributes to improved generalization and reliability.
The experimental results were obtained using the protocol described in Section 4. Performance was evaluated using multiple regression metrics, including the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE), ensuring a balanced assessment of both accuracy and error characteristics.
5.1. Predictive Performance and Statistical Evaluation
The comparative results obtained from 10-fold cross-validation are summarized in
Table 1.
Values are reported as mean ± standard deviation across 10-fold cross-validation.
The results indicate that the proposed hybrid framework achieves the highest R² among all evaluated models: 0.9162, compared to 0.9087 for XGBoost and 0.8949 for TabNet, with RMSE reduced to 4.8271 and competitive MAE values across folds. Although the numerical improvement over XGBoost is moderate (an approximately 0.8% increase in R² and a reduction in RMSE of nearly 0.5%), the gain is consistent across all cross-validation folds, indicating a systematic improvement in structural generalization rather than a dataset-specific fluctuation. It is also important to note that, despite the overall improvement in R² and RMSE, the proposed hybrid framework does not consistently outperform the standalone XGBoost model in terms of MAE. This observation reflects a trade-off introduced by the adaptive fusion mechanism. While the model is optimized to enhance global predictive alignment, the instance-dependent weighting strategy may lead to localized deviations in absolute error for certain samples. In particular, in regions where tree-based models provide more stable error minimization, the adaptive mechanism may assign relatively higher weights to the representation-learning component, resulting in slightly increased absolute errors. This behavior indicates that the proposed framework prioritizes overall structural consistency and generalization over strict minimization of pointwise absolute deviations.
From a statistical perspective, the improvement is further validated through the Wilcoxon signed-rank test (p < 0.05), confirming that the observed performance gains are statistically significant. This is particularly important in engineering applications, where robustness and repeatability are as critical as absolute accuracy.
5.2. Model Behavior and Robustness Analysis
The superior performance of the proposed framework can be attributed to the complementary integration of multiple learning mechanisms. From a quantitative perspective, the hybrid model shows lower variability across folds (std = 0.0113) than TabNet (std = 0.0251), indicating improved stability. To quantitatively verify the complementarity between TabNet and XGBoost, the Pearson correlation coefficient between their prediction residuals was computed on the test data. The analysis revealed a moderate correlation (r ≈ 0.42), indicating that the two models exhibit partially independent error patterns rather than highly overlapping prediction behavior. Since a lower correlation between residuals is generally considered an indicator of model complementarity in ensemble learning, this result suggests that the models capture different aspects of the underlying data distribution and supports their integration within the adaptive fusion framework. More specifically, the observed gain appears to stem from the capacity of the adaptive fusion mechanism to adjust learner dominance across heterogeneous input regions, allowing the framework to respond more effectively to local data structure than fixed or globally optimized ensemble formulations. TabNet captures complex nonlinear feature interactions through its attention-based architecture, enabling dynamic feature prioritization, whereas XGBoost provides strong performance in modeling structured relationships and handling feature interactions in tabular data. The integration of these models through an adaptive learning strategy allows the framework to leverage both representation learning and ensemble modeling capabilities.
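The residual-correlation check can be sketched in a few lines of NumPy. The function name and the toy residuals are illustrative; in the study the inputs would be the test-fold targets and the predictions of the two base learners.

```python
import numpy as np

def residual_correlation(y_true, pred_a, pred_b):
    """Pearson correlation between the residuals of two models."""
    y_true = np.asarray(y_true, dtype=float)
    res_a = y_true - np.asarray(pred_a, dtype=float)
    res_b = y_true - np.asarray(pred_b, dtype=float)
    return float(np.corrcoef(res_a, res_b)[0, 1])

# Toy example with residual patterns that are uncorrelated by construction.
y = np.array([10.0, 20.0, 30.0, 40.0])
res_a = np.array([1.0, -1.0, 1.0, -1.0])
res_b = np.array([1.0, 1.0, -1.0, -1.0])
print(residual_correlation(y, y - res_a, y - res_b))
```

Values near 1 indicate that the two models make the same errors, while values near 0 indicate largely independent error patterns.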
A closer examination of the error distribution further reveals that the adaptive fusion mechanism does not uniformly reduce absolute error across all samples. Instead, its effectiveness varies depending on the local structure of the feature space. This suggests that the model dynamically balances different learning behaviors, but may not always align with the objective of minimizing absolute error for every individual instance.
To further investigate the observed increase in MAE compared to the standalone XGBoost model, a subset-based error analysis was conducted. Specifically, the samples for which the hybrid model produced higher absolute errors than XGBoost were identified, and the corresponding adaptive weights were analyzed. The results indicate that, within this subset, the average weight assigned to the TabNet component is noticeably higher than its global average contribution (approximately 0.63 vs. 0.48 on average). This suggests that the adaptive fusion mechanism tends to favor the representation-learning model in regions where XGBoost provides more stable and accurate predictions in terms of absolute error minimization. From a mechanistic perspective, this behavior can be attributed to the objective of the adaptive weighting function, which is optimized for overall predictive alignment rather than explicitly minimizing instance-level absolute error. As a result, in certain regions of the feature space where tree-based models are inherently more reliable, the gating mechanism may still assign relatively higher weights to TabNet due to its ability to capture complex nonlinear interactions. This finding reveals an important limitation of the current framework: while the adaptive fusion strategy improves global performance metrics such as R2 and RMSE, it does not explicitly account for local error sensitivity. Consequently, the model may exhibit suboptimal weighting decisions for specific samples, leading to increased MAE in comparison to strong baseline models. These observations suggest that incorporating error-aware or uncertainty-guided weighting strategies could further improve the robustness of the adaptive fusion mechanism, particularly in terms of minimizing instance-level absolute deviations.
To further analyze the robustness of the proposed framework across different strength regimes, a strength-range-based error analysis was conducted by partitioning the dataset into three groups according to compressive strength: low-strength (<30 MPa), medium-strength (30–60 MPa), and high-strength (>60 MPa). The groups are unevenly populated, with a relatively small proportion of samples in the high-strength regime, indicating a degree of data imbalance. The mean absolute error (MAE) of each model was computed separately for each group. The results indicate that all models exhibit increased prediction error in the high-strength region, which can be attributed to the limited number of training samples and the higher variability associated with extreme mixture compositions. In comparison to XGBoost, the proposed hybrid model maintains comparable performance in the low- and medium-strength regions, while a slight increase in MAE is observed in the high-strength regime. Further analysis reveals that, in this region, the adaptive fusion mechanism tends to assign relatively higher weights to the TabNet component. While this behavior enhances flexibility in modeling nonlinear interactions, it may reduce the dominance of tree-based predictions that are more stable in extrapolation scenarios. These findings suggest that the adaptive fusion mechanism, although effective in capturing heterogeneous patterns across the feature space, may require additional regularization or uncertainty-aware weighting strategies to improve performance in sparsely represented high-strength regions. This observation highlights an important limitation of the current framework and provides a clear direction for future improvements.
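The regime-wise error analysis can be sketched as follows, using the thresholds stated above (30 and 60 MPa). The function name and the example values are illustrative and do not correspond to results from this study.

```python
import numpy as np

def mae_by_strength_regime(y_true, y_pred, low=30.0, high=60.0):
    """Return {regime: (sample count, MAE)} for low, medium and high strength."""
    y_true = np.asarray(y_true, dtype=float)
    abs_err = np.abs(y_true - np.asarray(y_pred, dtype=float))
    masks = {
        "low": y_true < low,
        "medium": (y_true >= low) & (y_true <= high),
        "high": y_true > high,
    }
    result = {}
    for name, mask in masks.items():
        # An empty regime gets NaN instead of raising a warning.
        mae = float(abs_err[mask].mean()) if mask.any() else float("nan")
        result[name] = (int(mask.sum()), mae)
    return result

print(mae_by_strength_regime([10, 25, 35, 55, 70], [12, 24, 30, 58, 60]))
```

Reporting the sample count alongside each MAE makes the imbalance between regimes visible when comparing models.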
To further examine the behavior of the adaptive fusion mechanism, the distribution of the learned input-dependent weights was analyzed across the dataset, as illustrated in Figure 2. The learned weights exhibit substantial variability across samples and are not concentrated around a single value, confirming that the fusion mechanism does not degenerate into a fixed combination or collapse into a constant weighting scheme. Instead, the model dynamically adjusts the relative contribution of TabNet and XGBoost depending on the local characteristics of the input space: regions with stronger nonlinear feature interactions tend to favor the attention-based model, whereas more structured regions are dominated by the tree-based learner. This provides empirical evidence that the adaptive weighting mechanism captures input-dependent heterogeneity and avoids the limitations of static ensemble strategies. At the same time, the observed behavior suggests that the current weighting function does not explicitly account for extrapolation risk, which may lead to suboptimal model selection in regions with limited data support.
In contrast, transformer-based models (TabTransformer and FT-Transformer) exhibit significantly lower performance. This can be explained by the relatively limited size of the dataset. Transformer architectures generally require large-scale data to fully exploit their representational capacity. In small-to-medium-sized engineering datasets, these models tend to over-parameterize the problem, leading to reduced generalization performance. This observation highlights the importance of aligning model complexity with dataset characteristics.
From an error analysis perspective, the weighting behavior described above highlights a limitation of the current framework. While adaptive fusion enhances global predictive consistency, it may introduce suboptimal weighting in specific regions, particularly where the underlying data distribution favors a single model. This finding suggests that further refinement of the weighting mechanism, potentially through constraint-aware or uncertainty-guided learning, could improve local error behavior without compromising overall performance.
Beyond predictive accuracy, model stability is a critical factor for practical deployment. The proposed framework demonstrates reduced variability across cross-validation folds, as reflected in lower standard deviation values compared to standalone models. This stability arises from two key design components. First, the stability-driven feature selection mechanism reduces redundancy and multicollinearity, resulting in a more consistent feature space across different data partitions. Second, the hybrid learning structure mitigates model-specific biases by integrating complementary predictive signals.
Unlike standalone models that may overfit to specific data patterns, the proposed framework maintains consistent performance across folds, indicating improved generalization capability. This property is particularly important in real-world engineering applications, where input distributions may vary over time.
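The fold-level stability comparison discussed in this subsection amounts to comparing the mean and standard deviation of a per-fold score. The following is a minimal sketch of that summary; the function name `fold_stability` is hypothetical and the per-fold values are made-up placeholders, not results from this study.

```python
import statistics


def fold_stability(fold_scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.fmean(fold_scores), statistics.stdev(fold_scores)


# Placeholder per-fold R^2 values for illustration only (not reported results).
placeholder_standalone = [0.93, 0.88, 0.91, 0.86, 0.92]
placeholder_hybrid = [0.92, 0.91, 0.92, 0.90, 0.91]

mean_s, std_s = fold_stability(placeholder_standalone)
mean_h, std_h = fold_stability(placeholder_hybrid)

# A smaller standard deviation across folds indicates more stable performance.
print(f"standalone: mean={mean_s:.3f}, std={std_s:.4f}")
print(f"hybrid:     mean={mean_h:.3f}, std={std_h:.4f}")
```

Reporting both quantities together matters because a model can have a slightly higher mean score and still be less dependable if its fold-to-fold spread is large.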
5.3. Ablation Study and Component Analysis
To further investigate the contribution of each component, an ablation analysis was conducted. The results are summarized in Table 2.
The ablation results show that each component contributes incrementally to the overall performance of the framework. For example, the transition from standalone XGBoost (R² = 0.9087) to the full hybrid model (R² = 0.9162) corresponds to a cumulative gain of 0.0075 in R² achieved through the integration of all components. The stacking mechanism alone improves performance over standalone models, confirming the benefit of combining complementary learners.
The introduction of stability-driven feature selection further enhances robustness by reducing redundancy and stabilizing the feature space. Finally, the inclusion of physics-aware feature engineering yields the best performance, indicating that embedding domain knowledge improves predictive capability.
To further isolate the individual contribution of each component, additional ablation experiments were conducted. Specifically, the effect of physics-aware feature engineering was evaluated by training XGBoost and TabNet using the extended feature set. The results indicate that incorporating physics-aware features leads to consistent performance improvements for both models, confirming the relevance of domain-informed feature construction. Similarly, the impact of the proposed stability-driven feature selection mechanism was assessed by applying it to XGBoost independently. The observed performance gain suggests that redundancy reduction and stability considerations contribute to model robustness. Furthermore, a static weighted ensemble of TabNet and XGBoost was evaluated as a baseline hybrid strategy. Although this approach improves performance compared to standalone models, it remains inferior to the proposed adaptive fusion mechanism. This comparison highlights that the performance gain of the proposed framework does not arise solely from model combination, but also from the input-dependent weighting strategy. Overall, these additional experiments demonstrate that each component contributes meaningfully to performance improvement, while the adaptive fusion mechanism provides the most significant enhancement by enabling context-aware model integration.
Importantly, the results show that performance gains are not dominated by a single component. Instead, they emerge from the synergistic interaction of all components. This confirms that the proposed framework should be interpreted as an integrated system rather than a simple combination of techniques.
5.4. Interpretability and Comparison with Literature
To enhance model transparency, SHapley Additive exPlanations (SHAP) analysis was employed to quantify feature contributions.
The global feature importance results (Figure 3) indicate that curing age is the most influential variable, followed by cement content and water-related ratios. These findings are consistent with established principles in concrete technology, where curing duration and water–cement balance play a dominant role in strength development.
It is important to clarify that the final feature set used for model interpretation was determined after the stability-driven feature selection stage. Although derived features such as the water–cement ratio and water–binder ratio were initially constructed during the feature engineering phase, some of these features were not retained in the final model due to their high collinearity with the original input variables. This observation suggests that the proportional relationships represented by these derived features are implicitly captured by the learning algorithms through the original feature space. In particular, models such as XGBoost are capable of modeling nonlinear interactions and ratio-like relationships without explicitly requiring engineered ratio features. Therefore, the exclusion of certain derived features from the final SHAP analysis does not indicate their irrelevance, but rather reflects redundancy reduction and the ability of the model to internalize these relationships during training. This result further supports the robustness of the proposed feature selection mechanism, which prioritizes stability and non-redundant information over explicit feature construction.
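The interplay between derived ratio features and collinearity-based pruning can be sketched as follows. This is an illustration under stated assumptions, not the selection procedure of the framework: the binder is assumed to be cement plus slag plus fly ash, redundancy is measured with a plain Pearson correlation, and the threshold of 0.98, the function names, and the toy mixtures are all hypothetical.

```python
import math


def add_ratio_features(mix):
    """Append water-cement and water-binder ratios to a mixture record (kg/m^3)."""
    binder = mix["cement"] + mix["slag"] + mix["fly_ash"]  # assumed binder definition
    out = dict(mix)
    out["water_cement_ratio"] = mix["water"] / mix["cement"]
    out["water_binder_ratio"] = mix["water"] / binder
    return out


def pearson(a, b):
    """Pearson correlation; returns 0.0 for a constant column (undefined case)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_collinear(rows, candidates, threshold=0.98):
    """Greedily keep features whose |r| with every kept feature is below threshold."""
    kept = []
    for name in candidates:
        column = [row[name] for row in rows]
        if all(abs(pearson(column, [row[k] for row in rows])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy mixtures for illustration only (not taken from the benchmark dataset).
mixes = [
    {"cement": 300.0, "slag": 10.0, "fly_ash": 0.0, "water": 180.0},
    {"cement": 350.0, "slag": 12.0, "fly_ash": 0.0, "water": 160.0},
    {"cement": 400.0, "slag": 14.0, "fly_ash": 0.0, "water": 190.0},
    {"cement": 250.0, "slag": 8.0, "fly_ash": 0.0, "water": 175.0},
]
rows = [add_ratio_features(m) for m in mixes]

# Earlier candidates take priority, so the later of two redundant features is dropped.
kept = drop_collinear(rows, ["cement", "water", "water_cement_ratio", "water_binder_ratio"])
print(kept)
```

In this toy data the water–binder ratio is almost a constant multiple of the water–cement ratio, so it is pruned as redundant even though it is physically meaningful, which mirrors the point made above about excluded derived features.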
The SHAP summary plot (Figure 4) further reveals the directional influence of features. Higher curing age values are associated with increased compressive strength, while higher water-related ratios negatively impact strength. This inverse relationship aligns with well-known physical behavior in concrete systems.
The local explanation (Figure 5) demonstrates how individual feature contributions combine to produce a specific prediction. This level of interpretability is particularly valuable in engineering applications, where understanding model decisions is essential for practical adoption.
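The additive structure behind such a local explanation can be made concrete with an exact Shapley computation on a small example. The sketch below enumerates all feature coalitions for a three-feature toy model and replaces absent features with a single reference value; this single-baseline value function is an assumption made for brevity and differs from averaging over a background dataset. The model `toy_strength_model`, its coefficients, and the input values are hypothetical and unrelated to the trained framework.

```python
import math
from itertools import combinations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in the coalition take their actual value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_strength_model(z):
    """Hypothetical strength model with a cement x water-cement interaction."""
    age_days, cement, wc_ratio = z
    return 5.0 + 6.0 * math.log(age_days) + 0.05 * cement * (1.5 - wc_ratio)


x = [28.0, 350.0, 0.55]        # instance to explain (illustrative values)
baseline = [7.0, 300.0, 0.45]  # reference instance (illustrative values)

phi = shapley_values(toy_strength_model, x, baseline)

# Efficiency property: contributions sum to the prediction minus the baseline output.
gap = toy_strength_model(x) - toy_strength_model(baseline)
print([round(p, 3) for p in phi], round(sum(phi), 3), round(gap, 3))
```

In this toy case the contribution of curing age is positive and that of the water–cement ratio is negative, matching the directional pattern described for the summary plot, and the contributions sum exactly to the difference between the prediction and the baseline output.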
Unlike conventional black-box models, the proposed framework provides both global and local interpretability, enabling engineers to extract actionable insights from predictions.
To contextualize the results, Table 3 compares the proposed framework with representative studies in the literature.
Although some studies report higher R² values, these results are often obtained under controlled conditions, larger datasets, or domain-specific optimizations that may limit generalization. In contrast, the proposed framework achieves competitive performance on a widely used benchmark dataset characterized by heterogeneous mixture compositions.
More importantly, the contribution of this study extends beyond predictive accuracy. The proposed framework provides a balanced integration of accuracy, robustness, and interpretability. By incorporating symmetry-aware representation, stability-driven feature selection, and hybrid learning, the model achieves a level of structural consistency that is often lacking in purely data-driven approaches.
From an engineering perspective, the proposed framework can be interpreted as an advanced soft-sensing model for concrete compressive strength prediction. By leveraging readily available mixture parameters, the model enables rapid and reliable estimation of strength without the need for time-consuming laboratory testing.
The integration of explainability further enhances its practical value, allowing practitioners to understand the influence of key variables and make informed decisions in mixture design and optimization.
Overall, the results demonstrate that the proposed framework:
Achieves improved overall predictive performance with statistically significant gains, although not a consistently lower mean absolute error than XGBoost;
Provides enhanced robustness through stability-driven feature selection;
Ensures interpretability through explainable AI techniques;
Maintains consistency with established physical principles.
These findings confirm that the proposed approach is not only effective in terms of prediction accuracy but also reliable and applicable in real-world engineering contexts.
6. Conclusions
This study presents an adaptive symmetry-aware hybrid learning framework (ASAPH) for predicting concrete compressive strength by integrating physics-aware feature engineering, stability-driven and symmetry-preserving feature selection, and input-dependent model fusion. The proposed framework was designed to address key limitations of existing approaches, including limited structural consistency in feature representation, reduced robustness in feature selection, and the rigidity of static ensemble strategies.
The experimental results show that the proposed framework achieves improved overall performance compared with strong baseline models, including XGBoost and TabNet. Although the numerical gain is moderate in absolute terms, it is consistent across cross-validation folds and supported by statistical significance analysis, indicating improved generalization rather than dataset-specific improvement.
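This section does not state which significance test was applied. One option for paired per-fold scores is an exact sign-flip permutation test on the fold-wise differences, sketched below as an assumption rather than as the procedure used in the study. The function name `paired_signflip_pvalue` is hypothetical and the scores are made-up placeholders.

```python
import statistics
from itertools import product


def paired_signflip_pvalue(scores_a, scores_b):
    """Two-sided exact sign-flip permutation p-value for paired scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(statistics.fmean(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely
    # to have either sign, so all 2^n sign assignments are enumerated.
    for signs in product((1, -1), repeat=len(diffs)):
        permuted = abs(statistics.fmean([s * d for s, d in zip(signs, diffs)]))
        if permuted >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Placeholder per-fold scores for illustration only (not reported results).
placeholder_hybrid = [0.92, 0.91, 0.93, 0.90, 0.92]
placeholder_baseline = [0.91, 0.90, 0.91, 0.89, 0.90]

p_value = paired_signflip_pvalue(placeholder_hybrid, placeholder_baseline)
print(p_value)
```

With only five folds the smallest attainable two-sided p-value of this test is 2/32 = 0.0625, so repeated cross-validation or more folds would be needed before a gain could be declared significant at the 0.05 level under this particular test.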
The results further indicate that symmetry-aware representation and stability-driven feature selection contribute to a more robust and physically consistent feature space, while the adaptive fusion mechanism enables more flexible integration of complementary learning behaviors. In this sense, the main contribution of the study lies at the framework level rather than in any single model component. This holistic design distinguishes the proposed approach from conventional hybrid models that rely on static or loosely coupled integration strategies.
Interpretability analysis further shows that curing age, cement content, and water-related parameters are the dominant factors affecting compressive strength, which is consistent with established engineering knowledge and supports the physical plausibility of the proposed framework.
Despite its advantages, the proposed framework has several limitations. First, the evaluation is conducted on a single benchmark dataset, and its generalizability to larger-scale or real-world industrial settings remains to be validated. Second, the adaptive fusion mechanism introduces additional model complexity, which may increase computational cost in resource-constrained environments. Third, the proposed framework does not consistently achieve a lower mean absolute error than strong baseline models such as XGBoost, indicating that although the adaptive fusion mechanism improves overall predictive performance, it does not always minimize instance-level error. Addressing this limitation represents an important direction for future research, particularly in the development of more robust and context-aware weighting strategies.
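That a model can score a higher R² while having a higher mean absolute error follows from the two metrics penalizing errors differently: R² is based on squared errors and is dominated by large deviations, whereas MAE weights all deviations linearly. The toy example below, with made-up values and hypothetical model names, shows the two metrics ranking two predictors in opposite order.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up strengths in MPa, for illustration only.
y_true = [20.0, 30.0, 40.0, 50.0, 60.0]

# Model A: exact on four samples, one large miss of 10 MPa.
pred_a = [20.0, 30.0, 40.0, 50.0, 50.0]
# Model B: a moderate miss of 3 MPa on every sample.
pred_b = [23.0, 27.0, 43.0, 47.0, 63.0]

r2_a, mae_a = r2_score(y_true, pred_a), mean_absolute_error(y_true, pred_a)
r2_b, mae_b = r2_score(y_true, pred_b), mean_absolute_error(y_true, pred_b)

print(f"A: R2={r2_a:.3f}, MAE={mae_a:.1f}")  # A: R2=0.900, MAE=2.0
print(f"B: R2={r2_b:.3f}, MAE={mae_b:.1f}")  # B: R2=0.955, MAE=3.0
```

Model B avoids large misses and therefore wins on R², while model A is closer on most individual samples and wins on MAE. A weighting strategy tuned toward squared-error objectives can therefore improve R² without lowering MAE.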
Future work may focus on evaluating the framework on larger and more diverse real-world datasets, incorporating uncertainty-aware learning, and developing more robust adaptive weighting strategies that explicitly account for local error behavior.
In addition, the proposed framework is suitable for deployment as a practical decision-support tool. Owing to its reliance on structured mixture parameters and its modular design, it can be integrated into user-friendly software environments such as web-based applications. This would enable practitioners to obtain rapid strength predictions along with interpretable insights, facilitating more efficient and informed mix design decisions without extensive experimental effort.
Overall, these findings indicate that integrating physics-informed representation with adaptive model fusion provides a robust and interpretable alternative to conventional black-box approaches in engineering prediction tasks.