Metals
  • Article
  • Open Access

25 January 2026

Parallel Hybrid Modeling of Al–Mg–Si Tensile Properties Using Density-Based Weighting

1
Department of Mechanical and Industrial Engineering, NTNU—Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
2
Hydro Aluminium, Research and Technology Development (RTD), NO-6601 Sunndalsøra, Norway
*
Author to whom correspondence should be addressed.

Abstract

A hybrid modeling framework for predicting the mechanical properties of Al-Mg-Si alloys that blends physics-based and machine-learning models is developed and tested. Motivated by the demand for post-consumer material (PCM) content in wrought aluminium applications, this work proposes, analyses, and discusses a parallel framework that applies an adaptive weighting coefficient derived from local observation density. Based on existing datasets from a range of Al-Mg-Si alloys, such a model is trained and tested in an iterative manner to study its robustness by emulating a shift in observed alloy composition. The results indicate that the hybrid model combines the interpolative strength of machine learning for cases similar to previous observations with the explorative strength of physics-based (Kampmann–Wagner Numerical) modeling for previously unobserved parameter combinations, as the hybrid model shows accuracy similar to or higher than the best of its constituents across the majority of the sequence. The observed model characteristics are promising for predicting the effect of the increased compositional variation inherent in PCM. Finally, possible future research is discussed.

1. Introduction

The transition towards Industry 4.0 and the emerging Industry 5.0 paradigms are among the driving forces of continued digitalization of industrial processes worldwide [1]. This leads to intensive data acquisition and increased application of data-driven methodologies. At the same time, in manufacturing sectors such as the aluminium industry, a substantial challenge related to carbon footprint reduction is the integration of post-consumer material into alloys such as high-grade wrought aluminium alloys [2]. Can the ongoing twin transition, through its emphasis on digitalization, remedy this issue by increasing the tolerance for variability and uncertainty in material properties?
Motivated by this challenge, a natural question is whether hybrid modeling can leverage the complementary strengths of physics-based modeling (PBM) and data-driven modeling (DDM) to handle such variability: PBM offers mechanistic, extrapolative robustness beyond observed data, whereas DDM delivers high interpolative accuracy where data are dense. To this end, this study investigates a parallel hybrid for Al–Mg–Si property prediction, in which PBM and DDM outputs are weighted by a coefficient that adapts to local data support. We evaluate how this architecture behaves as data accumulate in newly sampled regions, focusing on the trade-off between robustness under distribution shift and accuracy within observed regimes. The research question was formulated as follows:
How can physics-based and data-driven models be effectively blended for Al–Mg–Si mechanical property prediction, and what are the associated possibilities, limitations, and challenges in achieving extrapolative robustness alongside interpolative accuracy?
The objective is to instantiate and evaluate such a hybrid on an Al–Mg–Si dataset, assessing its behavior during incremental data growth that emulates a change in alloy composition, and to establish a baseline for more advanced hybridization strategies in future research.

1.1. Hybrid Modeling

Hybrid modeling represents a structured methodology for combining PBM and DDM to leverage their complementary strengths [3]. While PBM provides mechanistic understanding and extrapolative reliability without the need for prior observations, it can also suffer from simplified assumptions and incomplete parameterization, which limit accuracy in complex systems [4]. Conversely, DDM typically excels at capturing complex empirical relationships from observed data but may have limited predictive power outside the domain covered by available training data [5]. Hybrid models have therefore emerged as a powerful approach to address these respective shortcomings and provide enhanced accuracy, broader applicability, and improved robustness [6]. They can also be expected to capture secondary effects and practical nuances encountered in industrial operations, such as equipment-specific characteristics, maintenance routines, operational styles, and inherent process variability. According to the classification by Glassey & von Stosch (2018) [5], hybrid architectures typically fall into three main categories, which are summarized below and visualized in Figure 1.
Figure 1. Hybrid model categorization with (a) parallel, (b) serial/residual, and (c) integrated models.
(a)
Parallel hybrid architectures, where both models independently generate predictions, and the outputs are subsequently combined. The blending of PBM and DDM outputs usually depends on explicit weighting mechanisms, which may be determined by local data density, model uncertainty, or process knowledge.
(b)
Serial or residual hybrid architectures where a data-driven model learns to correct the errors of a physics-based model, i.e., using DDM as a corrective step.
(c)
Integrated (physics-informed) machine-learning models, where physical knowledge or constraints directly shape the machine-learning model, typically via modified loss functions, regularization terms, or physically constrained neural networks.
Recent literature offers structured frameworks for hybrid modeling. Von Stosch et al. established foundational distinctions between parallel and serial hybrids in process systems engineering, highlighting their respective appropriateness [6]. Zendehboudi et al. [7] and Glassey & von Stosch [5] later reviewed hybrid modeling strategies across chemical, petroleum, and process industries, emphasizing the benefits of combining physics with machine learning (ML) for improved generalization and interpretability. Rai and Sahu [8] provided a comprehensive review of hybrid modeling approaches combining physics-based and machine-learning models, including innovative network architectures, physics-based pre-processing, and regularization, yet their work did not explicitly explore parallel hybrid architectures employing adaptive weighting based on local data density. Gargalo et al. recently further explored hybrid modeling in the context of Industry 4.0 and 5.0, identifying practical integration challenges and trends [9]. These works provide a strong foundation for hybrid modeling as a methodological approach.
Studies referring to results from use case implementations provide further evaluations of the architectural variants in Figure 1. Rudolph et al. (2024) formalize the parallel category for cases where a first-principles model captures the dominant physics and trends, and a data-driven term can be added to correct discrepancies and unmodeled phenomena, improving accuracy and robustness while keeping the physical component interpretable [4]. Sharma and Liu (2022) show, through case studies, that serial hybrids where an ML model learns time-dependent residuals yield higher prediction accuracy and better extrapolation than standalone ML in process-development settings [10]. Similarly, Shah et al. (2025) report from a case study that a serial model applying a time series transformer is able to cancel model mismatch [11]. Complementing these works, Claes et al. (2023) state that parallel hybrid models prove more accurate than purely data-driven models while enabling separate assessment of how the physical and data-driven components contribute to performance and parameter identification [12].
Based on the reviewed literature, the parallel architecture is justified when the following conditions hold:
(A)
The physics model captures core trends and is globally unbiased, but omits certain nonlinearities or contextual effects;
(B)
The data-driven model excels in data-rich regions but may not generalize well outside the domain represented in training data;
(C)
The hybrid model should represent a combination of these strengths while providing interpretability and extrapolation safety.
While a parallel blend is advantageous under the conditions above, serial and integrated hybrids can be preferable under other circumstances. A serial hybrid might suit cases where the physics model is trend-correct but systematically biased, or where discrepancies depend on physics states and are best corrected using those states as features. An integrated hybrid is appropriate when hard constraints (e.g., conservation, monotonicity, physical bounds) must hold globally, when latent parameters or mechanistic variables are of primary interest, or when uncertainty must be propagated coherently through a single model. These alternatives are therefore complementary rather than competing.

1.2. Physics-Based Al-Mg-Si Strength Modeling (PBM)

The strength and mechanical properties of an Al-Mg-Si alloy are mainly due to precipitates formed during artificial aging. The precipitation sequence is commonly idealized as shown below and mainly depends on the initial available concentration of Mg and Si in solid solution [13]:
SSSS → atomic clusters → GP-zones → β″ → β′, U1, U2, B′ → β.
Here, SSSS denotes supersaturated solid solution; β″, β′, U1, U2, and B′ are different metastable phases; and β (Mg2Si) is the equilibrium phase. The maximum strength is usually obtained for an optimum combination of particle number density and mean particle size of the hardening β″ phase [13,14], which is obtained through a T6 artificial aging heat treatment. For prolonged aging beyond the time corresponding to the T6 temper condition, the particles grow, the number density decreases, and the metastable particles transform further, leading to over-aging and reduced strength [15,16]. In simplified terms, the alloy’s mechanical properties are typically understood as a result of its composition and complete thermo-mechanical history.
To accurately predict precipitation in Al–Mg–Si alloys, population-balance models based on the Kampmann–Wagner Numerical (KWN) approach are typically employed. These physically based models discretize the precipitate size distribution into classes, numerically tracking concurrent nucleation, diffusion-controlled growth, and coarsening (Ostwald ripening) of precipitates during aging [17]. Given accurate input parameters (e.g., interfacial energies, solute diffusivities, and parameters describing the nucleation rate), KWN simulations can replicate experimentally observed particle size distributions and capture the microstructural evolution during heat treatments [18].
A well-established KWN-based methodology for Al–Mg–Si alloys is the Nanostructure Model (NaMo), originally developed by Myhr and Grong [18]. NaMo models precipitation using two coupled particle populations: (i) atomic-scale clusters formed during natural aging, and (ii) metastable precipitates (mainly β″ and β′) nucleated during subsequent artificial aging. The two distributions evolve with separate nucleation and growth laws but are linked by solute conservation (a continuity constraint) that partitions the available solute between clusters and precipitates [19]. This framework inherently captures the competitive relationship between clusters and strengthening precipitates, for example, reproducing the observed delay in β″ formation due to prior natural aging (clusters temporarily tie up solute and retard β″ nucleation) [20]. Furthermore, NaMo includes a strength modeling framework that predicts the yield strength ($R_{p0.2}$), as well as the work hardening curve of the studied alloy–aging combination, such that the ultimate tensile strength ($R_m$) can be extracted [19]. Comparisons with experimental data have estimated the accuracy to about ±10% of observed values [18].
The physics-based foundation provided by NaMo thus ensures a reliable theoretical baseline upon which further hybrid modeling developments can be built, as discussed in subsequent sections.

1.3. Data-Driven Modeling of Aluminium Alloy Mechanical Properties (DDM)

In contrast to the PBM approach described above, the mechanical properties can also be estimated using DDM. Based on available previous observations as training data, the relationship between mechanical properties as output and composition plus thermo-mechanical history as input can be learned as a non-linear multivariate function using ML. Several studies reporting on such implementations, both for various aluminium alloys and for other alloy systems, have been published in recent years [21]. The present work can also be seen as an extension of such studies, which report excellent interpolative accuracy of ML compared to general PBM predictions in industrial applications [22]. Generally, such DDM applications are realized by advanced regression based on a chosen ML algorithm, for which a multitude of alternatives exist [23].

1.4. Hybrid Modeling of Aluminium Alloy Mechanical Properties

Hybrid modeling for the prediction of aluminium alloy mechanical properties has limited coverage in the literature, especially in the most recent years. Abbod et al. (2002) used plane strain compression (PSC) tests to validate hybrid prediction of recrystallization time and grain size during hot deformation of a set of alloys [24]. Their work laid the groundwork for modular hybrid architectures but did not explore adaptive weighting or output blending. Both Zhu et al. and Sellars et al. (2003) introduced a hybrid framework combining physically based dislocation and recrystallization models with a neuro-fuzzy inference system to predict flow stress and microstructural evolution (e.g., subgrain size, dislocation density) under hot deformation of Al-Mg alloys [25,26]. Zhu et al. validated the work using finite element method (FEM) simulations.
In 2006, Abbod et al. further built on the work of Zhu et al., developing a modular hybrid model with a clear separation of tasks: a neuro-fuzzy system modeled internal states (e.g., dislocation density, subgrain misorientation), and a physical model used these states to compute stress and recrystallization variables [27]. The model demonstrated good generalization to unseen data and various alloy compositions. Wang et al. (2013) proposed a combined model using a simplified thermo-mechanical formulation together with data-driven components for predicting tensile strength in 7449 aluminium alloy friction stir welded joints [28]. While limited in scope, it demonstrated hybrid modeling for downstream property prediction by training an ANN to predict boundary condition values used in post-processing.
Despite the above-mentioned advances, none of the works implemented or rigorously explored a parallel hybrid architecture with adaptive weighting based on observation density. Most studies either use residual or sequential hybrid models, where the ML model corrects or feeds into the physical model, or apply static weights without adapting to local data availability.

2. Materials and Methods

In this work, a parallel hybrid architecture (Figure 1a) is applied for model blending, where one physics-based and one data-driven model are inferred independently and their predictions are combined by an adaptive weighting coefficient c. A novel, density-based computation method for c is applied, presented in detail in Section 2.1. This enables model blending upon inference according to how the actual inference point relates to the body of training data.
In the present study, a parallel architecture was the natural choice due to the conditions (A)–(C) of Section 1.1 all being met in the Al-Mg-Si application presented in Section 2.2. A serial design might achieve similar accuracy if the physics-model discrepancy is structured and correctable; however, our aim is to retain the stand-alone semantics of both PBM and ML predictions and to expose a data-density-based coefficient that is auditable and updateable without retraining, which motivates the parallel choice.
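Written out, the parallel blend then takes the standard convex-combination form (our notation; the weighting coefficient c is defined in Section 2.1):

$$
\hat{y}_{\mathrm{hyb}}(x) = c(x)\,\hat{y}_{\mathrm{DDM}}(x) + \bigl(1 - c(x)\bigr)\,\hat{y}_{\mathrm{PBM}}(x), \qquad c(x) \in [0, 1],
$$

so that c(x) → 1 recovers the data-driven prediction in densely observed regions, while c(x) → 0 falls back on the physics-based prediction elsewhere.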

2.1. Adaptive Parallel Hybrid Modeling Framework

A visualization of the specific parallel framework used in this work is shown in Figure 2. To blend the two models, a coefficient model is trained to represent the observation density in the training data X by a coefficient c ∈ [0, 1]. Given a new inference point x, this model outputs the weighting coefficient value that should be applied for that specific point; c → 1 if x is very close to several points in the training data and c → 0 in the opposite case.
Figure 2. Parallel hybrid modeling framework with weighting based on observation density.
The coefficient model is based on robustly scaling the training data (using each dimension’s median and interquartile range) and using it to fit a k-nearest neighbor (k-NN) model. A min-max mapping r of the k-NN distance d ( x ) to a given inference point is then computed based on close and far distances sampled from an expanded inference box. Specifically, it was chosen to do this normalization in the log domain with robust anchors, followed by a power-law contrast sharpening, i.e.,
$$
r(x) =
\begin{cases}
1, & d(x) \le d_{\mathrm{close}}, \\[2pt]
\left( \dfrac{\log d_{\mathrm{far}} - \log d(x)}{\log d_{\mathrm{far}} - \log d_{\mathrm{close}}} \right)^{a}, & d_{\mathrm{close}} < d(x) < d_{\mathrm{far}}, \\[2pt]
0, & d(x) \ge d_{\mathrm{far}}.
\end{cases}
$$
A dataset size scaling factor g is also defined to avoid strong weighting of ML predictions based on a low number of training datapoints N. Since dataset sizes typically vary across several orders of magnitude, a log-linear scaling was chosen, expressed mathematically as
$$
g =
\begin{cases}
0, & N \le N_1, \\[2pt]
\dfrac{\log N - \log N_1}{\log N_2 - \log N_1}, & N_1 < N < N_2, \\[2pt]
1, & N \ge N_2,
\end{cases}
$$
such that g = 0 for N ≤ N_1 and g = 1 for N ≥ N_2, with a log-linear response in between. Finally, the coefficient is computed as
$$
c(x) = g \cdot r(x),
$$
which indicates how well the point is represented by the training data. Hyperparameters in this model are the number of neighbors k, the sharpness a, and the dataset size thresholds N 1 and N 2 , which will be discussed in Section 2.2. The implementation of this method is described in detail in Algorithm 1 below.
Three core design choices in Algorithm 1 are meant to ensure hybrid model stability:
(i)
Robust scaling (median/IQR per feature) tempers outliers and mixed units across input variables;
(ii)
Local distance calibration using data-driven d_close and d_far produces a smooth, monotone mapping d → c that is insensitive to absolute scales;
(iii)
Dataset size scaling attenuates c for small N and relaxes this cap as coverage grows.
Algorithm 1 Coefficient Computation for Hybrid Model Blending
Require: Dataset D with N rows; new datapoint x;
number of neighbors k, sharpness a, dataset size thresholds N1 and N2, number of samples S for robust scaling, and scaling factor R for inference box widening
Ensure: Coefficient c(x) ∈ [0, 1]

 Build model
1: center ← median(D)
2: scale ← IQR(D) ▹ interquartile range
3: Z ← (D − center)/scale ▹ robust centering and scaling
4: Fit k-nearest-neighbors model on Z
5: range ← max(D) − min(D)
6: infer_min ← min(D) − R × range ▹ define lower inference limit
7: infer_max ← max(D) + R × range ▹ define upper inference limit
8: Uniformly sample S points within the inference box
9: for each sample s do
10:   z_s ← (s − center)/scale
11:   Compute mean sample distance d_s to the k nearest neighbors in Z
12: end for
13: d_close ← min(d_s)
14: d_far ← median(d_s) ▹ range for distance-to-coefficient mapping
15: g ← (log N − log N1)/(log N2 − log N1)
16: g ← clip(g, 0, 1) ▹ clipped dataset size scale factor

 Infer model
17: z ← (x − center)/scale
18: Compute mean distance d to the k nearest neighbors in Z
19: d, d_close, d_far ← log(d), log(d_close), log(d_far)
20: c ← (d_far − d)/(d_far − d_close)
21: c ← clip(c, 0, 1) ▹ clipped distance mapping
22: c ← c^a ▹ sharpening
23: c ← c × g ▹ scale with dataset size
return c
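For concreteness, the following is a minimal NumPy sketch of Algorithm 1. It is our illustration rather than the authors' implementation: brute-force pairwise distances stand in for a proper k-NN index (e.g., scikit-learn's NearestNeighbors), and the function name is ours.

```python
import numpy as np

def coefficient_model(D, x, k=8, a=1.0, N1=200, N2=2000, S=10_000, R=0.25, rng=None):
    """Density-based weighting coefficient c(x) in [0, 1] (sketch of Algorithm 1)."""
    rng = np.random.default_rng(rng)
    D = np.asarray(D, dtype=float)
    x = np.asarray(x, dtype=float)
    N = D.shape[0]

    # Build model: robust centering/scaling (median and IQR per feature)
    center = np.median(D, axis=0)
    q75, q25 = np.percentile(D, [75, 25], axis=0)
    scale = np.where(q75 - q25 > 0, q75 - q25, 1.0)  # guard against zero IQR
    Z = (D - center) / scale

    def knn_mean_dist(points):
        # Mean Euclidean distance from each query point to its k nearest rows of Z
        dists = np.linalg.norm(points[:, None, :] - Z[None, :, :], axis=-1)
        return np.sort(dists, axis=1)[:, :k].mean(axis=1)

    # Distance calibration from S uniform samples in a widened inference box
    lo, hi = D.min(axis=0), D.max(axis=0)
    widen = R * (hi - lo)
    samples = rng.uniform(lo - widen, hi + widen, size=(S, D.shape[1]))
    d_s = knn_mean_dist((samples - center) / scale)
    d_close, d_far = d_s.min(), np.median(d_s)

    # Dataset size scaling factor g (log-linear, clipped)
    g = np.clip((np.log(N) - np.log(N1)) / (np.log(N2) - np.log(N1)), 0.0, 1.0)

    # Inference: map the query's k-NN distance through the log-domain calibration,
    # clip to [0, 1], sharpen with exponent a, and scale by g
    d = knn_mean_dist(((x - center) / scale)[None, :])[0]
    c = (np.log(d_far) - np.log(d)) / (np.log(d_far) - np.log(d_close))
    return float(np.clip(c, 0.0, 1.0) ** a * g)
```

Because the close/far anchors are sampled from the widened inference box, the returned coefficient is deterministic only for a fixed random seed; in practice S should be large enough that this sampling noise is negligible.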

2.2. Implementation for Al-Mg-Si Mechanical Properties

The presented framework has been implemented and tested on Al-Mg-Si mechanical properties, based on a specific choice of PBM and DDM; NaMo (version 3.11 2025a) was applied as PBM, and an XGBoost [29] regression model was trained and used as the DDM counterpart using Python. NaMo has the benefit of including both precipitation and strength modeling, which serves the purpose of this study well. An alternative, which would require an additional effort in estimating tensile properties based on precipitate number density and size distributions, could be the open-source Kawin model [30]. For DDM, on the other hand, available methods are abundant. Since XGBoost is specifically meant for learning a multivariate relationship from structured (tabular) data, and has demonstrated strong performance across numerous multivariate regression applications [31], it was selected as the data-driven modeling approach for this work. The setup and hyperparameters for the model were taken from previous work [32] and are summarized in Table 1.
Table 1. Applied XGBoost hyperparameters.
After implementation, these two models were used to make parallel predictions of measured yield stress $R_{p0.2}$ and ultimate tensile stress $R_m$ of extruded profiles based on a numerical representation of thermomechanical history and chemical composition. In essence, the time from when the profiles exit the extrusion tool until tensile testing is discretized, and at each time step the temperature and any applied plastic strain are designated as features. This defines a multi-linear representation. Chemical composition is represented by measured weight percentages of the elements Mg, Si, Fe, Mn, Cu, and Cr, resulting in a total of 16 features. The applied dataset is described in Section 2.2.1 and was used both to run NaMo simulations (row by row) via a software development kit (SDK) for Python 3.10, and for XGBoost 1.7.1 model training, also using Python.
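As an illustration of this tabular representation, a sketch of the feature assembly is given below. The paper specifies 16 features in total (six element concentrations plus the discretized thermo-mechanical history), but the number of time steps and the feature ordering used here are our assumptions, not the authors' exact layout.

```python
import numpy as np

# Composition features in a fixed order (measured weight percentages)
ELEMENTS = ["Mg", "Si", "Fe", "Mn", "Cu", "Cr"]

def feature_vector(composition_wt_pct, temps_C, strains):
    """Concatenate composition with per-time-step temperature/strain features."""
    comp = [composition_wt_pct[e] for e in ELEMENTS]
    # Interleave (temperature, plastic strain) for each discretized time step
    history = [v for step in zip(temps_C, strains) for v in step]
    return np.array(comp + history, dtype=float)

# Five illustrative time steps (e.g. quench, storage, artificial aging)
x = feature_vector(
    {"Mg": 0.45, "Si": 0.55, "Fe": 0.20, "Mn": 0.05, "Cu": 0.02, "Cr": 0.01},
    temps_C=[500, 25, 25, 185, 185],
    strains=[0.0, 0.0, 0.0, 0.0, 0.0],
)
assert x.shape == (16,)  # 6 composition + 2 x 5 history features
```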
While XGBoost is well-suited for this dimensionality, it was chosen to reduce the input dimensionality for the coefficient calculation to four features—the concentrations of the two most important alloying elements, Mg and Si, the aging temperature T, and a synthetic feature I called the Scheil integral that quantifies artificial aging [32]. This dimensionality reduction ensures simplicity and interpretability and limits the computational cost of the k-NN distance scaling described in Algorithm 1, which scales exponentially with the number of variables. The four described inputs are expected to represent input similarity in a sufficiently accurate manner, based on the results in recent research by the authors on the use of I as a synthetic feature for representing the precipitation effect of artificial aging cycles [32]. For this particular application, this method for dimensionality reduction was evaluated as more appropriate than a general numerical method such as principal component analysis, as it explicitly defines the (Mg, Si, T, I) domain of trust. The input dimensionality m to each model is reflected as (N, m) in Figure 2, where m = 16 for PBM and DDM and m = 4 for the coefficient model.

2.2.1. Dataset and Extrusion Experiments

For model training and testing, a database of previous experimental results was used. This dataset was also applied in previous work related to validating I as a synthetic feature [32]. The dataset stems from a total of 2895 extrusion experiments shared by the company Hydro Aluminium. These experiments used Ø95 mm extrusion billets that were direct chill cast at the R&D center of Hydro Sunndalsøra and cut to 200 mm length, in 46 systematically selected Al-Mg-Si alloy compositions covering Mg and Si content in the range 0.2–1.0 wt.%, with Mg/Si content ratios in the range 0.5–2.2 (corresponding to each point in Figure 3), and also with variation in Fe, Mn, Cu, and Cr content. The chemical composition of each charge was measured using X-ray fluorescence spectrometry. Strip profiles were extruded in a vertical press at the Department of Mechanical and Industrial Engineering at NTNU Trondheim with a ram speed of 30 mm/min and an initial billet temperature of 500 °C. Water quenching was applied, and the subsequent room temperature storage duration was controlled. Artificial aging was then carried out with a total of 124 different cycles, each consisting of linear segments of constant temperature or heating/cooling rates. Among the experiments, aging temperatures range from 165 to 210 °C and durations range from two to hundreds of hours. A tabular representation of this dataset resulted in 16 features consisting of element concentrations, aging temperatures, and durations.
Figure 3. Evolution of the coefficient field (left to right) in the numerical experiment for two temperature–Scheil integral combinations: T = 170 °C, I = 0.10 (upper row, moderately observed) and T = 185 °C, I = 0.83 (lower row, frequently observed). The dotted lines denote the 6060-alloy boundary, and points are test set alloy compositions where ’x’ markers denote 6060-like alloys.

2.3. Numerical Experiment with Alloy Introduction

To answer the initially stated research question, a scenario of distribution shift was constructed from the dataset, i.e., an abrupt change in observed chemistry. First, the dataset was shuffled and split into training, validation, and testing sets by 80/10/10. Then, in the training set, the compositions falling into a 6060 category with Mg < 0.6 wt.% and Si < 0.6 wt.% were moved into the second half of the set, as visualized in Figure 4a. By training the XGBoost model on incrementally larger portions of this re-sorted training set and testing on the (unchanged) test set, interpolation and extrapolation performance was tracked over increasing training size (N), and the coefficient model was retrained per iteration to quantify how it adapts to the sudden appearance of the new alloy class. In this way, the hybrid modeling strategy can also be compared to the PBM and DDM alternatives throughout the scenario, and robustness can be evaluated.
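The re-sorting and incremental evaluation described above can be sketched as follows. This is our illustration: the Mg/Si column indices and the `fit_predict` stand-in (which represents retraining the ML and coefficient models at each iteration) are assumptions.

```python
import numpy as np

def resort_training_set(X, y, mg_idx=0, si_idx=1):
    """Move 6060-like rows (Mg < 0.6 and Si < 0.6 wt.%) to the end of the
    training sequence, emulating the abrupt appearance of a new alloy class."""
    is_6060 = (X[:, mg_idx] < 0.6) & (X[:, si_idx] < 0.6)
    order = np.concatenate([np.flatnonzero(~is_6060), np.flatnonzero(is_6060)])
    return X[order], y[order]

def incremental_rmse(X, y, X_test, y_test, fit_predict, sizes):
    """Test-set RMSE when training on growing prefixes of the re-sorted set."""
    return {
        N: float(np.sqrt(np.mean((y_test - fit_predict(X[:N], y[:N], X_test)) ** 2)))
        for N in sizes
    }
```

With `fit_predict` swapped for the actual XGBoost (and coefficient-model) retraining, evaluating `incremental_rmse` over a grid of N values reproduces the accuracy trajectories of Figure 4b.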
Figure 4. Evolution through the dataset (increasing N) of (a) R p 0.2 , (b) model performance, and (c) average weighting coefficient.
To set the values of the hyperparameters k, a, N 1 , and N 2 in the coefficient model, a grid search was conducted based on the range of values given in Table 2. By iteratively training the ML and coefficient models as described, the average accuracy in terms of root-mean-squared error (RMSE) of predictions on the validation set was calculated. The hyperparameter combination yielding the highest average validation set accuracy was chosen, namely k = 8 , a = 1 , N 1 = 200 , and N 2 = 2000 .
Table 2. Hyperparameter grid used for the coefficient model. The found optimal combination is highlighted in bold typeface.
Furthermore, the parameters S and R related to robust distance scaling and inference box widening, respectively, were set to S = 10 4 and R = 0.25 . The effects of the hyperparameter values on the coefficient model are discussed in Section 4.
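The hyperparameter selection can be sketched as an exhaustive grid search, where `evaluate_rmse` stands in for the iterative train/validate procedure described above. The grids shown are illustrative subsets; Table 2 defines the actual ranges.

```python
import itertools
import numpy as np

def grid_search(evaluate_rmse, ks=(4, 8, 16), a_s=(1, 2), n1s=(100, 200), n2s=(1000, 2000)):
    """Return the (k, a, N1, N2) combination with the lowest average
    validation RMSE, as reported by the supplied evaluation callback."""
    best, best_rmse = None, np.inf
    for k, a, n1, n2 in itertools.product(ks, a_s, n1s, n2s):
        rmse = evaluate_rmse(k=k, a=a, N1=n1, N2=n2)
        if rmse < best_rmse:
            best, best_rmse = (k, a, n1, n2), rmse
    return best, best_rmse
```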

2.4. Robustness Study on the Proposed Framework

In order to assess the design of the proposed adaptive hybrid framework, two comparative analyses designated (a) and (b) were conducted.
(a)
Firstly, the importance of the distance adaptivity enabled by the coefficient model was studied by optimizing a static weight on the validation set and computing the resulting hybrid predictions. Thus, the PBM and DDM estimates are combined in a static weighted average, to evaluate the effect of the adaptivity enabled by the distance-based coefficient c. The value of c used for comparison was chosen by evaluating the average validation accuracy resulting from each choice of c ∈ {0.00, 0.01, …, 1.00} in the same iterative study as described in Section 2.3, and selecting the value yielding the lowest RMSE.
(b)
Secondly, the effect of the scaling factor g was studied by setting g = 1 statically, so that the computed weighting coefficient c is not affected by the amount of training data used. Otherwise, the analysis was performed analogously to the iterative study described above. The results from these analyses are presented and discussed in Section 4.
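Comparative analysis (a)—choosing a static weight on the validation set—can be sketched as below. The names are illustrative: `pbm_val` and `ddm_val` denote the two component models' validation predictions, and the convex blend follows the weighting convention used for the adaptive coefficient (c weights the DDM output).

```python
import numpy as np

def best_static_weight(y_val, pbm_val, ddm_val, grid=np.linspace(0, 1, 101)):
    """Pick the static coefficient c in {0.00, 0.01, ..., 1.00} that minimizes
    validation RMSE of the blend c*ddm + (1 - c)*pbm."""
    rmse = lambda c: np.sqrt(np.mean((y_val - (c * ddm_val + (1 - c) * pbm_val)) ** 2))
    errs = np.array([rmse(c) for c in grid])
    i = int(np.argmin(errs))
    return float(grid[i]), float(errs[i])
```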

3. Results

3.1. Model Blending Under Distribution Shift

Introducing a previously under-represented 6060-like chemistry (Mg < 0.6 wt.% and Si < 0.6 wt.%) reshapes the observation-density weighting field c as training size N grows, as shown in Figure 3. Early in the sequence, the 6060 region shows low coefficients (trust in PBM dominates), while as progressively more 6060 samples enter the training set, c rises in that subdomain, transferring weight toward the ML component.
The transition of c is spatially selective: for fixed (T, I) slices, the largest coefficient gain occurs near the highest sampled Mg–Si compositions, while regions that are already dense remain largely unchanged. This behavior matches the design of Algorithm 1 (robust centering/IQR scaling, close/far distance calibration, and dataset size scaling) and yields a weighting coefficient c ∈ [0, 1] that adapts to local data support.

3.2. Accuracy Evolution

Across the incremental training experiment, the following patterns can be seen in Figure 4:
(i)
NaMo provides constant predictions throughout, with error level independent of N, reflecting its extrapolative robustness.
(ii)
The XGBoost model starts disadvantaged in the 6060 regime (as a result of poor extrapolation), but improves markedly as 6060 observations accumulate. Outside this region, the performance is stronger from the outset and also increases with N. The accuracy of 6060 and non-6060 test predictions surpasses that of NaMo at N ≈ 1400 and N ≈ 800, respectively.
(iii)
For most of the sequence, the hybrid is more accurate in both regions. Initially it leans on NaMo (low c), while as N increases, the average weighting coefficient rises. Notably, it rises more slowly in the 6060 regime than in the non-6060 regime until those observations are introduced, shifting weight toward ML and reducing error toward the ML frontier. At N ≈ 1800, the ML model shows similar or marginally higher accuracy than the hybrid.
Collectively, Figure 4b,c showcase an accuracy–robustness trajectory that combines robustness during distribution shift with interpolative capability. A quantification of the results is summarized in Table 3, indicating both R p 0.2 and R m accuracy trajectories. A similar model behavior was observed for the two outputs.
Table 3. Test set RMSE [MPa] for NaMo, XGBoost, and Hybrid across training sizes N in 6060 and non-6060 regimes for yield strength R p 0.2 and ultimate tensile strength R m . The highest accuracy (lowest RMSE) for each N is bolded.

4. Discussion

The numerical experiment reflects a realistic industrial scenario involving distribution shift—in this case, altered precipitation kinetics and strength contributions resulting from the introduction of post-consumer scrap and broader compositional tolerances. Under such conditions, pure ML initially suffers from limited training data, and its predictive accuracy declines rapidly outside the domain of previously observed alloys. In contrast, the physics-based NaMo model maintains predictive robustness due to its mechanistic foundation. The parallel hybrid model leverages the strengths of both approaches: it incorporates fine-grained empirical corrections from ML where data are available, while remaining anchored by the extrapolative reliability of the PBM in under-sampled regions. Figure 3 illustrates this adaptive weighting across representative slices in the (Mg, Si, T, I) input space.

4.1. Coefficient Model Characteristics

Beyond the conceptual split between interpolative and extrapolative regimes, the method of computing c according to Algorithm 1 induces the following robustness properties that are observable in Figure 3 and Figure 4:
(i)
Monotone, locality-driven blending is achieved since c is a smooth, distance-based function with robust scaling; when local density rises, c rises locally. In Figure 3 this appears as growth confined to the newly sampled (Mg, Si, T, I) region rather than a global shift.
(ii)
Early-phase risk control with gradual release ensures that for small N, the hybrid defaults to the PBM floor, while as N grows, this cap is relaxed. Coefficient model hyperparameter validation ensures comparable performance to the peak ML accuracy at the end of the sequence in Figure 4b.
(iii)
Operational, testable behavior is implied: c changes smoothly with local density, single samples have a bounded effect, and weight shifts remain traceable to local data patterns. Such characteristics may be used for deployment diagnostics.
Together, these properties tie the observed stability to the design of c, clarifying why and how the method remains conservative under dense coverage while maintaining robustness during data-sparse transitions.
The optimal hyperparameter values k = 8, a = 1, N1 = 200, and N2 = 2000 resulting from the grid search defined in Section 2.3 reflect the characteristics of the applied dataset, PBM, and DDM. Referring to Figure 3, this hyperparameter combination yields comparatively liberal blending toward the ML component. In practice, k = 8 provides local sharpness near observations, while a = 1 preserves a relatively slow decay, allowing intermediate values of c even in partially observed regions. This setting aligns with the empirical finding that ML offers strong gains once moderate data are available, but it also reduces conservatism in sparser parts of the feature space. In future applications, a power-law sharpness a > 1 should be considered to preserve trust in the PBM in unobserved regions. Finally, the observed optimal dataset-size scaling prevents ML influence for N < 200 and allows full trust for N > 2000.
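Under these hyperparameters, the coefficient computation can be sketched as follows. This is a hypothetical reconstruction, not the paper's Algorithm 1: the exponential distance kernel is an assumption, and `scale` stands in for the robust normalization constant derived from reference samples:

```python
import numpy as np

def blending_coefficient(X_train, x_query, k=8, a=1.0, N1=200, N2=2000, scale=1.0):
    """Density-based blending coefficient c in [0, 1] (illustrative sketch).

    Combines a dataset-size scaling g (no ML trust below N1, full trust
    above N2) with a smooth, monotone decay of c as the query point moves
    away from the k nearest training observations.
    """
    N = len(X_train)
    # Dataset-size scaling g: linear ramp between N1 and N2.
    g = float(np.clip((N - N1) / (N2 - N1), 0.0, 1.0))
    if g == 0.0:
        return 0.0  # PBM-only regime for small training sets
    # Mean distance to the k nearest training points (local density proxy).
    dists = np.linalg.norm(np.asarray(X_train, dtype=float) - x_query, axis=1)
    d = float(np.sort(dists)[:k].mean())
    # Smooth decay with robustly scaled distance; a > 1 sharpens the decay.
    c_local = np.exp(-((d / scale) ** a))
    return g * c_local
```

With a = 1 the decay is comparatively slow, matching the liberal blending noted above; raising a would make c drop more steeply outside observed regions, restoring conservatism toward the PBM.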
The parameters S, defining the number of samples for robust distance scaling, and R, defining the inference box width, were not optimized in this study. In broad terms, S should be large enough to ensure a deterministic and reasonable normalization, while R defines the plotting boundary for Figure 3.

4.2. Effect of Coefficient Adaptivity

As described in Section 2.4, a comparative analysis was performed to investigate the effect of (a) coefficient adaptivity and (b) the training-set-size scaling factor g. This was done by (a) setting c = 0.44 statically, based on validation-set optimization, and (b) disabling the scaling with training set size and parameters N1 and N2 by letting g = 1. The results are visualized in Figure 5. They indicate an advantage of both the adaptivity and the training-set-size scaling: a constant weight prevents the hybrid from reaching ML-level performance at the end of the sequence, and g = 1 reduces accuracy in the first half of the sequence except for the first iteration.
Figure 5. Comparison of the overall model performances with (a) static validation-optimized coefficient c = 0.44 and (b) without training-size scaling, i.e., g = 1 .

4.3. Implications for Increased Scrap Tolerance

In aluminium extrusion value chains that integrate post-consumer fractions, chemistry variability will produce repeated distribution shifts. The presented architecture can be applied without re-training the ML model, since the coefficient field is recomputed from the latest training dataset to immediately adapt the blending. This could enable (i) faster onboarding of new chemistries (starting PBM-leaning), (ii) progressive tightening of prediction intervals as density grows, and (iii) transparent interpretability, since operators can visualize c next to the (Mg, Si, T, I) context to see why the model trusted PBM or ML for a given job ticket. Although the model's robustness does not rely on frequent ML retraining, the type of ML model used here can be retrained in seconds and requires little compute or energy compared with larger models.

4.4. Limitations and Next Steps

The presented method uses density as a proxy for uncertainty, not uncertainty itself. In some cases, equal k-NN densities may mask different error landscapes, which could lead to falsely trusting one of the constituent models. Secondly, since the coefficient model uses a reduced four-feature space, adaptive feature weighting or a learned metric could significantly improve c if downstream errors concentrate along other axes (e.g., Fe, Cu, or deformation history); this may require refining the presented blending method. Furthermore, several alternative frameworks could be benchmarked to test whether the observed lower-envelope behavior persists across architectures, for instance ensemble ML methods, serial/residual hybrid architectures (Figure 1b), or integrated PBM–DDM architectures (Figure 1c), such as PINNs connected with NaMo's mechanistic outputs (e.g., precipitate state variables) or with other KWN models. Finally, a natural next step would be to test the hybrid modeling framework in a case study on extrusions with high levels of impurities, increased concentrations of elements such as Fe, Cu, or Mn, or alloys with actual post-consumer content.

5. Conclusions

A parallel hybrid modeling framework for predicting Al–Mg–Si tensile properties (Rp0.2 and Rm) has been demonstrated by blending a physics-based Kampmann–Wagner/nanostructure-model (NaMo) prediction with an XGBoost regressor using an adaptive, density-based weighting coefficient. The numerical experiment emulating an abrupt chemistry shift (introduction of a 6060-like regime) shows that the coefficient field evolves locally with increasing observation density: regions that are initially under-sampled remain PBM-dominated, while newly populated subdomains transition toward ML weighting as training coverage grows.
Across the incremental training sequence, the hybrid model exhibits an accuracy–robustness trajectory that captures the practical complementarity of its constituents. NaMo provides a stable, extrapolative baseline independent of training data, whereas XGBoost initially underperforms in the under-represented 6060 regime but improves markedly once relevant observations accumulate. The hybrid combines these behaviors: it defaults toward the physics-based floor in data-sparse regimes and progressively admits ML gains as density increases, yielding accuracy higher than or comparable to the best constituent across most of the sequence in both the 6060 and non-6060 regimes. Comparative tests further show that both density adaptivity and training-size scaling contribute to performance: a static blending coefficient fails to reach ML-level accuracy late in the sequence, and removing dataset-size scaling degrades early-stage performance.
From an industrial perspective, these characteristics imply tangible value under repeated distribution shifts associated with broader compositional tolerances and post-consumer material integration. The proposed architecture supports conservative early predictions when a chemistry is new (PBM-leaning), followed by an auditable transition toward data-driven accuracy as observations accrue, and it enables transparent diagnostics by visualizing the coefficient c in the (Mg, Si, T, I) context for a given condition.
The framework’s main disadvantages and constraints follow directly from its design assumptions. Firstly, the method uses local data density as a proxy for ML model accuracy, which may not reflect heterogeneous error landscapes, and secondly, the coefficient model operates in a reduced four-feature space, which can miss directions along which predictive errors concentrate.

Author Contributions

Conceptualization, C.D.Ø. and O.R.M.; Methodology, C.D.Ø. and O.R.M.; Software, C.D.Ø. and O.R.M.; Validation, C.D.Ø. and O.R.M.; Formal analysis, C.D.Ø.; Investigation, C.D.Ø.; Writing—original draft preparation, C.D.Ø.; Writing—review and editing, O.R.M. and G.R.; Supervision, G.R.; Funding acquisition, G.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Norwegian Research Council through the AluGreen project (GA: 328831).

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to proprietary restrictions imposed by the owner.

Acknowledgments

The NaMo model executable and its Python SDK were provided by Hydro Aluminium, as well as the dataset used for model training and testing.

Conflicts of Interest

Author Ole Runar Myhr was employed by the company Hydro Aluminium, Research and Technology Development (RTD). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Golovianko, M.; Terziyan, V.; Branytskyi, V.; Malyk, D. Industry 4.0 vs. Industry 5.0: Co-existence, Transition, or a Hybrid. Procedia Comput. Sci. 2023, 217, 102–113. [Google Scholar] [CrossRef]
  2. Capuzzi, S.; Timelli, G. Preparation and Melting of Scrap in Aluminum Recycling: A Review. Metals 2018, 8, 249. [Google Scholar] [CrossRef]
  3. Wang, J.; Li, Y.; Gao, R.X.; Zhang, F. Hybrid physics-based and data-driven models for smart manufacturing: Modelling, simulation, and explainability. J. Manuf. Syst. 2022, 63, 381–391. [Google Scholar] [CrossRef]
  4. Rudolph, M.; Kurz, S.; Rakitsch, B. Hybrid modeling design patterns. J. Math. Ind. 2024, 14, 3. [Google Scholar] [CrossRef]
  5. Glassey, J.; Stosch, M.V. (Eds.) Hybrid Modeling in Process Industries, 1st ed.; CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar] [CrossRef]
  6. Von Stosch, M.; Oliveira, R.; Peres, J.; Feyo De Azevedo, S. Hybrid semi-parametric modeling in process systems engineering: Past, present and future. Comput. Chem. Eng. 2014, 60, 86–101. [Google Scholar] [CrossRef]
  7. Zendehboudi, S.; Rezaei, N.; Lohi, A. Applications of hybrid models in chemical, petroleum, and energy systems: A systematic review. Appl. Energy 2018, 228, 2539–2566. [Google Scholar] [CrossRef]
  8. Rai, R.; Sahu, C.K. Driven by Data or Derived Through Physics? A Review of Hybrid Physics Guided Machine Learning Techniques With Cyber-Physical System (CPS) Focus. IEEE Access 2020, 8, 71050–71073. [Google Scholar] [CrossRef]
  9. Gargalo, C.L.; Malanca, A.A.; Aouichaoui, A.R.N.; Huusom, J.K.; Gernaey, K.V. Navigating industry 4.0 and 5.0: The role of hybrid modelling in (bio)chemical engineering’s digital transition. Front. Chem. Eng. 2024, 6, 1494244. [Google Scholar] [CrossRef]
  10. Sharma, N.; Liu, Y.A. A hybrid science-guided machine learning approach for modeling chemical processes: A review. AIChE J. 2022, 68, e17609. [Google Scholar] [CrossRef]
  11. Shah, P.; Pahari, S.; Bhavsar, R.; Kwon, J.S.I. Hybrid modeling of first-principles and machine learning: A step-by-step tutorial review for practical implementation. Comput. Chem. Eng. 2025, 194, 108926. [Google Scholar] [CrossRef]
  12. Claes, Y.; Huynh-Thu, V.A.; Geurts, P. Hybrid additive modeling with partial dependence for supervised regression and dynamical systems forecasting. arXiv 2023, arXiv:2307.02229. [Google Scholar] [CrossRef]
  13. Marioara, C.D.; Nordmark, H.; Andersen, S.J.; Holmestad, R. Post-β′′ phases and their influence on microstructure and hardness in 6xxx Al-Mg-Si alloys. J. Mater. Sci. 2006, 41, 471–478. [Google Scholar] [CrossRef]
  14. Andersen, S.; Zandbergen, H.; Jansen, J.; TrÆholt, C.; Tundal, U.; Reiso, O. The crystal structure of the β′′ phase in Al–Mg–Si alloys. Acta Mater. 1998, 46, 3283–3298. [Google Scholar] [CrossRef]
  15. Gupta, A.; Lloyd, D.; Court, S. Precipitation hardening in Al–Mg–Si alloys with and without excess Si. Mater. Sci. Eng. A 2001, 316, 11–17. [Google Scholar] [CrossRef]
  16. Andersen, S.; Marioara, C.; Vissers, R.; Frøseth, A.; Zandbergen, H. The structural relation between precipitates in Al–Mg–Si alloys, the Al-matrix and diamond silicon, with emphasis on the trigonal phase U1-MgAl2Si2. Mater. Sci. Eng. A 2007, 444, 157–169. [Google Scholar] [CrossRef]
  17. Bahrami, A.; Mehr, M.Y.; Anijdan, S.H.M. Precipitation in Al–Mg–Si Alloys Modeling. In Encyclopedia of Aluminum and Its Alloys, 1st ed.; Totten, G.E., Tiryakioğlu, M., Kessler, O., Eds.; CRC Press Taylor & Francis: Boca Raton, FL, USA, 2018. [Google Scholar] [CrossRef]
  18. Myhr, O.R.; Grong, O.; Schäfer, C. An Extended Age-Hardening Model for Al-Mg-Si Alloys Incorporating the Room-Temperature Storage and Cold Deformation Process Stages. Metall. Mater. Trans. A 2015, 46, 6018–6039. [Google Scholar] [CrossRef]
  19. Myhr, O.R.; Marioara, C.D.; Engler, O. Modeling the Effect of Excess Vacancies on Precipitation and Mechanical Properties of Al–Mg–Si Alloys. Metall. Mater. Trans. A 2024, 55, 291–302. [Google Scholar] [CrossRef]
  20. Dumitraschkewitz, P.; Uggowitzer, P.J.; Gerstl, S.S.A.; Löffler, J.F.; Pogatscher, S. Size-dependent diffusion controls natural aging in aluminium alloys. Nat. Commun. 2019, 10, 4746. [Google Scholar] [CrossRef]
  21. Rahman, A.; Hossain, M.S.; Siddique, A.B. Review: Machine learning approaches for diverse alloy systems. J. Mater. Sci. 2025, 60, 12189–12221. [Google Scholar] [CrossRef]
  22. Øien, C.D.; Ringen, G. Data-driven through-process modelling of aluminum extrusion: Predicting mechanical properties. Manuf. Lett. 2024, 41, 1274–1281. [Google Scholar] [CrossRef]
  23. Carvalho, H.D.P.D.; Oliveira, J.F.L.D.; Fagundes, R.A.D.A. Dynamic selection of ensemble-based regression models: Systematic literature review. Expert Syst. Appl. 2025, 290, 128429. [Google Scholar] [CrossRef]
  24. Abbod, M.; Linkens, D.; Zhu, Q.; Mahfouf, M. Physically based and neuro-fuzzy hybrid modelling of thermomechanical processing of aluminium alloys. Mater. Sci. Eng. A 2002, 333, 397–408. [Google Scholar] [CrossRef]
  25. Sellars, C.; Abbod, M.F.; Zhu, Q.; Linkens, D. Hybrid Modelling Methodology Applied to Microstructural Evolution during Hot Deformation of Aluminium Alloys. Mater. Sci. Forum 2003, 426–432, 27–34. [Google Scholar] [CrossRef]
  26. Zhu, Q.; Abbod, M.; Talamantes-Silva, J.; Sellars, C.; Linkens, D.; Beynon, J. Hybrid modelling of aluminium–magnesium alloys during thermomechanical processing in terms of physically-based, neuro-fuzzy and finite element models. Acta Mater. 2003, 51, 5051–5062. [Google Scholar] [CrossRef]
  27. Abbod, M.; Zhu, Q.; Linkens, D.; Sellars, C.; Mahfouf, M. Hybrid models for aluminium alloy properties prediction. Control Eng. Pract. 2006, 14, 537–546. [Google Scholar] [CrossRef]
  28. Wang, H.; Colegrove, P.A.; Dos Santos, J. Hybrid modelling of 7449-T7 aluminium alloy friction stir welded joints. Sci. Technol. Weld. Join. 2013, 18, 147–153. [Google Scholar] [CrossRef]
  29. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
  30. Ury, N.; Neuberger, R.; Sargent, N.; Xiong, W.; Arróyave, R.; Otis, R. Kawin: An open source Kampmann–Wagner Numerical (KWN) phase precipitation and coarsening model. Acta Mater. 2023, 255, 118988. [Google Scholar] [CrossRef]
  31. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A Comparative Analysis of XGBoost. arXiv 2019, arXiv:1911.01914. [Google Scholar] [CrossRef]
  32. Øien, C.D.; Myhr, O.R.; Ringen, G. Towards hybrid modelling of aluminium extrusion mechanical properties—A univariate representation of artificial aging. Mater. Res. Proc. 2025, 54, 819–828. [Google Scholar] [CrossRef]
