Article

Ensemble-Based Material-Specific Prediction of Thermal Conductivity for Steel Slag Asphalt Mixtures

1 Xinjiang Jiaotou Construction Management Co., Ltd., Urumchi 830000, China
2 School of Highway, Chang’an University, Xi’an 710064, China
3 Key Laboratory of Special Area Highway Engineering, Ministry of Education, Xi’an 710064, China
4 International Joint Laboratory for Sustainable Development of Highway Infrastructures in Special Regions, Xi’an 710064, China
* Author to whom correspondence should be addressed.
Processes 2026, 14(4), 689; https://doi.org/10.3390/pr14040689
Submission received: 26 January 2026 / Revised: 6 February 2026 / Accepted: 14 February 2026 / Published: 18 February 2026
(This article belongs to the Special Issue Thermal Properties of Composite Materials)

Abstract

Thermal conductivity is a crucial parameter for heat transfer in asphalt pavements, especially in cold regions where electrically heated snow-melting systems are used. Steel slag, an industrial by-product with high thermal conductivity, holds significant potential to enhance the thermal performance of asphalt mixtures. However, its thermal behavior is influenced by various factors. This study established a thermal conductivity database consisting of 200 samples from published experimental studies, incorporating data collection, graphical digitization, and physically constrained expansion. Mixture composition, volumetric structure, and steel slag properties were used as input variables, with thermal conductivity as the output. Five machine learning models including k-nearest neighbors regression, decision tree, random forest, support vector regression, and gradient boosting were developed. Among them, random forest and gradient boosting showed the highest accuracy and robustness. Feature importance analysis revealed that steel slag content is the primary factor affecting thermal conductivity, while material properties and gradation parameters play secondary roles. This data-driven framework facilitates the efficient prediction and design of thermal conductivity in steel slag asphalt mixtures, supporting the engineering application of functional asphalt pavements.

1. Introduction

With the continuous expansion of transportation infrastructure in cold and alpine regions, the impact of road snow accumulation and icing on traffic safety and operational efficiency has become increasingly prominent. Icy pavements significantly reduce tire-pavement skid resistance, extend braking distances, and increase the risk of traffic accidents. Traditional de-icing methods relying on mechanical removal or chemical melting agents, while widely applied in engineering practice, face limitations such as restricted efficiency, environmental pollution, and potential adverse effects on pavement structural durability. These limitations make them inadequate to meet the comprehensive demands for green, safe, and efficient road operation. Consequently, the development of novel functional pavement materials and snow-melting technologies has emerged as a critical research direction in the field of road engineering [1].
Among various innovative snow-melting technologies, electrically heated pavements have attracted significant attention due to their advantages of high melting efficiency, rapid response, and the potential for active and controllable snow/ice removal. This technology typically involves embedding conductive or heating elements within the pavement structure to convert electrical energy into heat, which is then transferred to the surface to melt snow or thin ice. In this process, the thermophysical properties of the pavement materials are decisive for operational efficiency. The thermal conductivity of asphalt mixtures directly influences the heat transfer rate and distribution uniformity within the pavement structure [2,3]. Insufficient thermal conductivity can lead to localized heat accumulation, reducing overall melting efficiency and increasing energy consumption. Therefore, enhancing the thermal conductivity of asphalt mixtures is a crucial pathway to improving the overall performance of electrically heated pavements [4].
Asphalt mixture is a typical multiphase composite material. Its thermal behavior is governed by multiple factors. These factors include material composition, aggregate skeleton structure, and void characteristics. In conventional asphalt mixtures, the thermal conductivities of the constituent phases differ significantly. These phases mainly include aggregates, asphalt binder, and air. Given that air possesses a thermal conductivity far lower than that of aggregates and asphalt, a relatively high air void content can substantially impair the overall thermal conductivity of the material [5,6]. Hence, an effective strategy for improving thermal conductivity involves optimizing aggregate type and structural composition, as well as incorporating high-conductivity phases to improve heat transfer pathways. Recent ML-based studies have increasingly addressed thermal and energy-related pavement properties. Yu et al. developed a hybrid finite element–neural network approach for estimating the thermal conductivity of porous asphalt concrete, while Kebede et al. applied ensemble learning for real-time pavement temperature prediction [7,8]. Neural-network-aided homogenization has also been used to predict the effective thermal conductivity of composite construction materials, providing context for the present material-specific modeling of steel slag asphalt mixtures.
Steel slag is a major solid by-product of the steel industry. It is characterized by high density and high hardness. Its thermal conductivity is 2–3 times higher than that of conventional natural aggregates, which typically have thermal conductivity values ranging from 0.7 to 1.2 W/(m·K). This substantial difference gives steel slag strong potential to improve the thermal performance of asphalt mixtures [9,10,11]. Its incorporation also promotes the resource utilization of industrial solid waste, which aligns with the current development trend toward green and low-carbon transportation infrastructure. Existing studies indicate that the iron-rich minerals and dense microstructure inherent to steel slag are conducive to forming continuous heat conduction channels, thereby significantly enhancing the thermal conductivity of asphalt mixtures [12,13]. However, the thermal behavior of steel slag asphalt mixtures is affected by multiple interacting factors. These factors include steel slag content, particle size distribution, and replacement method. Changes in volumetric parameters further complicate the internal heat transfer mechanism. Their combined effects are complex and difficult to decouple [14,15].
Current studies on the thermal conductivity of steel slag asphalt mixtures mainly rely on laboratory experiments. Most focus on the influence of slag replacement rate or aggregate gradation. Such studies can reveal empirical trends. However, experimental workloads are substantial [16,17]. Parameter combinations are often limited. This makes it difficult to systematically characterize thermal conductivity under multi-factor conditions. In addition, differences in raw material sources, test methods, and experimental conditions exist among studies [18,19]. These inconsistencies hinder direct comparison of results. They also limit the broader applicability of experimental findings. Therefore, methods capable of rapidly and accurately predicting thermal conductivity under multi-parameter conditions are urgently needed [20].
In recent years, machine learning methods have been widely applied in engineering materials research. These methods outperform traditional regression approaches in handling high-dimensional inputs. They are also effective in capturing nonlinear relationships and complex coupling effects [21,22]. Models such as random forest, gradient boosting, and support vector regression have shown strong predictive accuracy and generalization capability. They have been successfully used to predict the mechanical and functional properties of concrete and asphalt mixtures. However, systematic predictive studies focusing on the thermal conductivity of steel slag asphalt mixtures remain limited. Comprehensive frameworks for multi-model comparison are still lacking. Parameter influence analysis has also received insufficient attention [23,24]. The novelty of this study lies in its material-specific modeling perspective rather than algorithmic complexity. By focusing exclusively on steel slag asphalt mixtures and adopting a tailored parameter system, this work complements existing generalized ML-based thermal conductivity prediction models.
Based on this research background, the present study focuses on the thermal conductivity of steel slag asphalt mixtures as a key functional indicator. Figure 1 presents a schematic overview of the study, showing the main workflow and the significance of the research. A thermal conductivity database is established using experimental data reported in the literature. Multiple machine learning models are then developed for prediction. Their predictive performance is systematically compared. The relationships between mixture composition parameters and thermal conductivity are further analyzed. The results aim to support thermal-performance-oriented mixture design. They also provide a data-driven perspective for the engineering application of functional asphalt pavement materials.

2. Database Construction and Data Description

2.1. Data Sources and Collection

The database was compiled from published experimental studies on steel slag asphalt mixtures following predefined screening rules to ensure data quality and comparability. A study was included only if it clearly reported the steel slag replacement level, the thermal conductivity value together with the measurement method and test conditions, and the key mixture and volumetric variables required in this work with unambiguous definitions. Studies were excluded when the test protocol was not described, when essential inputs were missing or not derivable from the reported information, or when units or definitions were inconsistent and could not be reliably converted. If multiple results were reported for the same mixture under different conditions, a record was retained only when the corresponding inputs and conditions were explicitly documented. Since the database was compiled from multiple independent studies, inconsistencies in units, value ranges, and data formats were unavoidable. Data preprocessing and rational data expansion were therefore conducted prior to model development. First, the units and nomenclature of all input variables and thermal conductivity values were unified. Samples with missing or incomplete information were removed, as were duplicates and internally inconsistent records identified through cross-checking. Only data with clear physical meaning and complete parameter descriptions were retained for modeling. To further increase the sample size and improve the model’s ability to capture parameter variation patterns, statistical data processing techniques were applied to extend the database. This expansion was performed without violating the distributions of the original experimental data. The 200-sample dataset was compiled entirely from published experimental studies.
Physically constrained data expansion was used only as a training-stage augmentation approach, where resampling/interpolation was restricted to physically reasonable bounds and locally consistent experimental conditions. No augmented samples were included in the final dataset records. The adopted techniques included interpolation under neighboring experimental conditions, resampling within physically reasonable ranges, and consistency-constrained data expansion. The expanded data represent a statistically reasonable extension of existing experimental observations. This approach improved sample coverage and distribution continuity. The data expansion process involved generating synthetic samples through interpolation between real datapoints, constrained within physically reasonable bounds. No extrapolation beyond the observed ranges of the original experimental data was performed, ensuring that all augmented samples remained consistent with the physical conditions of the dataset. In addition, to mitigate the influence of variable scale differences on model training, all continuous input variables were standardized to comparable numerical ranges. This step effectively enhanced training stability and model convergence. The adopted preprocessing and expansion strategies improved the completeness and representativeness of the database. They provided a reliable data foundation for machine learning-based prediction of thermal conductivity in steel slag asphalt mixtures.
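The interpolation-based expansion described above can be illustrated with a minimal sketch. It assumes that a synthetic record is formed as a convex combination of two real records and is kept only if every field stays inside a physically admissible range. The function name `interpolate_samples`, the field names, the three example records, and the bounds are all hypothetical and are not taken from the study's database.

```python
import random


def interpolate_samples(records, bounds, n_new, seed=0):
    """Create synthetic records as convex combinations of two real records.

    records: list of dicts sharing the same numeric keys (real measurements).
    bounds:  dict mapping key -> (low, high) admissible range (assumed values).
    """
    rng = random.Random(seed)
    synthetic = []
    attempts = 0
    while len(synthetic) < n_new and attempts < 100 * n_new:
        attempts += 1
        a, b = rng.sample(records, 2)   # two distinct real records
        t = rng.random()                # weight in [0, 1): interpolation, never extrapolation
        new = {key: (1 - t) * a[key] + t * b[key] for key in a}
        # Reject any candidate that leaves the admissible range of some variable.
        if all(bounds[key][0] <= new[key] <= bounds[key][1] for key in new):
            synthetic.append(new)
    return synthetic


# Hypothetical records for illustration only (not data from the source studies).
real = [
    {"slag_pct": 0.0, "air_voids_pct": 4.2, "k": 1.30},
    {"slag_pct": 20.0, "air_voids_pct": 4.0, "k": 1.45},
    {"slag_pct": 40.0, "air_voids_pct": 4.4, "k": 1.60},
]
bounds = {"slag_pct": (0.0, 55.0), "air_voids_pct": (3.0, 5.0), "k": (1.0, 2.0)}
augmented = interpolate_samples(real, bounds, n_new=5)
```

Because each synthetic value lies between the two values it was mixed from, the sketch cannot leave the observed range of the real records. Restricting the pairs to records measured under neighboring experimental conditions, as the text describes, would require an additional pairing rule that is not shown here.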

2.2. Variables for Thermal Conductivity Prediction

The thermal conductivity of steel slag asphalt mixtures is governed by the combined effects of material composition, internal structural characteristics, and testing conditions. Based on previous experimental studies and fundamental heat transfer mechanisms, the input variables were selected from two main aspects. These aspects include mixture composition parameters and volumetric–physical structural parameters. They were used to systematically describe the key factors influencing thermal conductivity.
As shown in Table 1, the mixture composition parameters mainly include steel slag content and asphalt content. Steel slag content determines the proportion of high-thermal-conductivity phases in the mixture. It is therefore considered the primary controlling factor for thermal performance. The percolation threshold concept explains why thermal conductivity increases significantly once the high-conductivity phase (steel slag) forms a continuous conductive path within the aggregate structure. However, this threshold is not a fixed value and depends on factors such as mixture gradation, particle shape, compaction, and air-void characteristics. Asphalt content affects aggregate contact conditions and the degree of void filling. It consequently alters heat transfer paths and thermal transport efficiency. The volumetric and physical structural parameters include air voids, voids in mineral aggregate (VMA), and voids filled with asphalt (VFA). Air voids represent the volume fraction of the low-thermal-conductivity air phase. They have a pronounced influence on heat transfer behavior [24]. Density-related volumetric indicators provide an integrated description of mixture compactness and internal structure. They serve as important indirect descriptors of thermal conduction. Thermal conductivity was selected as the output variable to characterize the steady-state heat transfer capacity of steel slag asphalt mixtures.
Based on this variable system, a hierarchical indicator framework was established from three levels. These levels include material composition, internal structure, and material properties. Composition-level indicators describe the proportion of high- and low-conductivity phases. Structural-level indicators reflect the aggregate skeleton and pore structure. Material-level indicators characterize the physical and thermal properties of steel slag aggregates. The value ranges of all input variables were determined according to commonly adopted design intervals reported in previous studies. To maintain consistency with engineering practice, steel slag content was mainly limited to 0–40%. A small number of samples in the range of 40–55% were included to represent high replacement conditions. Other structural and material parameters were constrained within typical literature-reported ranges. The intrinsic thermal conductivity of steel slag (kSS) was treated as a sample-specific variable, meaning it was extracted from each source study for the corresponding datapoint. In cases where kSS values were missing, the sample was either excluded from the analysis or the value was imputed using a conservative approach based on the available data. This ensures that the variability in slag sources is adequately captured in the model, providing a more accurate representation of the material properties. Under the premise of physical rationality and engineering applicability, a physically constrained data expansion strategy was adopted. A dataset of 200 samples was finally constructed. This approach improves model stability and generalization while avoiding unrealistic extreme values.

2.3. Statistical Characteristics of the Dataset

A systematic statistical analysis was conducted on the input variables and the output indicator used for thermal conductivity prediction of steel slag asphalt mixtures. The dataset contains 200 samples. The input variables include steel slag content, asphalt content, air voids, voids in mineral aggregate, voids filled with asphalt, coarse and fine aggregate ratios, and key material and structural parameters. These parameters include steel slag density, water absorption, and thermal conductivity. The output variable is the thermal conductivity of the steel slag asphalt mixture.
As shown in Table 2, all input variables fall within reasonable engineering ranges. The dataset covers a wide range of mixture compositions and volumetric structural conditions associated with different steel slag incorporation levels. Clear variation exists between the minimum and maximum values of each variable. The mean and median values are close. No obvious outliers deviating from practical engineering conditions are observed. These characteristics provide a solid data basis for machine learning models to learn complex nonlinear relationships.
From the perspective of distribution characteristics, most variables exhibit skewness values close to zero. This indicates approximately symmetric distributions. Only a few variables show slight positive or negative skewness. This behavior is mainly attributed to the constraints imposed by mixture design specifications in engineering practice. Several variables also present negative kurtosis values. This suggests flatter distributions than the normal distribution, with a lower probability of extreme values. These features reflect the stability and controllability of the data acquisition process. Preliminary correlation analysis was further performed between the input variables and thermal conductivity. Steel slag content, air voids, and apparent density of steel slag show noticeable association trends. The variation in the intrinsic thermal conductivity of steel slag observed in this study can be linked to differences in slag mineralogy, such as the content of iron-bearing phases like metallic iron and iron oxides. This relationship is supported by studies on slag characterization and thermal transport properties, although the correlation remains qualitative due to the inconsistent availability of detailed mineralogical data across source studies. This geological perspective provides a deeper understanding of the observed kSS values, while acknowledging the limitations in the dataset. However, these relationships do not exhibit strong linear characteristics. This indicates that thermal conductivity is governed by the coupled effects of multiple composition and structural parameters. Single-variable analysis or linear modeling is therefore insufficient. The use of multiple machine learning regression models is necessary to capture the complex nonlinear mapping between multidimensional inputs and thermal conductivity. This approach provides a reliable tool for the design and performance evaluation of thermally functional asphalt pavement materials.
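The skewness and kurtosis statements above can be made concrete with a short sketch. It assumes the population (biased) moment estimators and reports excess kurtosis, so that a normal distribution scores zero. The source does not state which estimator was used, and the five air-void values below are hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Third standardized moment (population form); 0 for a symmetric sample."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; negative means flatter than normal."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical, evenly spread air-void contents (%), for illustration only.
air_voids = [3.0, 3.5, 4.0, 4.5, 5.0]
print(skewness(air_voids))         # symmetric sample: approximately 0
print(excess_kurtosis(air_voids))  # evenly spread sample: approximately -1.3
```

An evenly spread sample such as this one is symmetric and has negative excess kurtosis, which matches the pattern the text attributes to variables constrained by mix design specifications.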
To systematically examine the distribution characteristics of the input parameters and thermal conductivity in the steel slag asphalt mixture database, a statistical analysis was performed. The variables include steel slag content, asphalt content, air voids, VMA, VFA, coarse-to-fine aggregate ratio, and the physical properties of steel slag. Their distribution patterns are shown in Figure 2a–k. The results indicate that all input parameters fall within reasonable engineering ranges. The distributions are generally continuous and show broad coverage.
Steel slag content ranges from 0 to 55%. Most samples are concentrated between 20% and 40%. This reflects the common engineering practice of partial replacement of natural aggregates with steel slag. The observed increase in thermal conductivity from 1.30 to 1.70 W/(m·K) falls within the theoretical limits for conductivity enhancement by dispersed particles discussed in a comprehensive review, which provides context for interpreting the enhancement observed in steel slag asphalt mixtures [25]. The Pearson correlation of 0.92 between steel slag content and thermal conductivity suggests a strong linear relationship. However, as discussed in the same review, effective thermal conductivity in dispersed composites can be governed by different averaging models, such as the arithmetic mean and the harmonic mean. Multiple factors, including particle dispersion, gradation, and binder content, contribute to the observed relationship, and different physical models may be applicable in explaining the effective thermal conductivity of steel slag asphalt mixtures [25]. Asphalt content varies between 4.8% and 5.6%. The distribution peaks in the range of 5.0–5.4% and is approximately symmetric. This indicates that the mixture designs comply with conventional requirements for stable binder–aggregate ratios. Air void content is mainly distributed between 3.0% and 5.0%, with most values concentrated in the range of 3.8–4.6%. VMA ranges from 14.5% to 17.5% and peaks between 15.5% and 16.5%. This suggests that the dataset covers volumetric structures from relatively dense to moderately open. VFA is mainly concentrated between 70% and 78%. The coarse aggregate ratio ranges from 50% to 62%, while the fine aggregate ratio is primarily between 38% and 50%. All gradation-related parameters exhibit relatively flat distributions.
This reflects reasonable variation under the constraints of mix design specifications. In terms of material properties, the apparent density of steel slag is mainly distributed between 3.30 and 3.60 g/cm³. Most samples fall within 3.40–3.55 g/cm³. Water absorption varies from 2.0% to 3.0%, with a peak at approximately 2.3–2.6%. The intrinsic thermal conductivity of steel slag is primarily concentrated between 2.1 and 2.9 W/(m·K). It shows relatively high values with moderate dispersion. Correspondingly, the thermal conductivity of steel slag asphalt mixtures ranges from 1.30 to 1.70 W/(m·K). Most samples are concentrated between 1.45 and 1.55 W/(m·K). The distribution is close to normal. No obvious skewness or extreme outliers are observed.
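The arithmetic-mean and harmonic-mean averaging models mentioned above can be written out for a two-phase material as a rough sketch. The arithmetic (parallel) average gives an upper estimate and the harmonic (series) average a lower estimate of the effective conductivity. The inputs are assumptions chosen only for illustration: a slag conductivity of 2.5 W/(m·K), which lies inside the 2.1–2.9 W/(m·K) range reported above, a hypothetical matrix conductivity of 1.3 W/(m·K), and a hypothetical slag volume fraction of 0.3. The dataset reports slag content as a replacement percentage, which is not necessarily a volume fraction.

```python
def parallel_bound(phi, k_inclusion, k_matrix):
    """Arithmetic (volume-weighted) mean: upper estimate for a two-phase composite."""
    return phi * k_inclusion + (1.0 - phi) * k_matrix


def series_bound(phi, k_inclusion, k_matrix):
    """Harmonic mean: lower estimate for a two-phase composite."""
    return 1.0 / (phi / k_inclusion + (1.0 - phi) / k_matrix)


# Assumed illustrative inputs, in W/(m*K); phi is a volume fraction.
phi, k_slag, k_matrix = 0.3, 2.5, 1.3
upper = parallel_bound(phi, k_slag, k_matrix)
lower = series_bound(phi, k_slag, k_matrix)
print(f"effective conductivity between {lower:.3f} and {upper:.3f} W/(m*K)")
```

Under these assumed inputs the two estimates bracket a fairly narrow interval, which shows how the choice of averaging model changes the predicted enhancement for the same slag fraction.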
These results indicate that thermal conductivity is not controlled by a single parameter. It is jointly influenced by steel slag content, volumetric characteristics, and material properties. Clear nonlinear coupling relationships exist among the variables. Overall, all input parameters exhibit good engineering rationality and statistical stability. The dataset avoids excessive concentration and unrealistic dispersion. It therefore provides a reliable data basis for machine learning models to learn complex nonlinear mappings between multidimensional inputs and thermal conductivity. This establishes a solid foundation for prediction of the thermal performance of steel slag asphalt mixtures.
To elucidate the relationships between the input parameters and the thermal conductivity of steel slag asphalt mixtures, the Pearson correlation coefficient and Kendall’s rank correlation coefficient were used to analyze inter-variable associations. The results are presented in Figure 3a,b.
From the Pearson correlation matrix, a strong positive correlation is observed between steel slag content and the thermal conductivity of the mixture, with a correlation coefficient of 0.92. Although this coefficient is high, it does not imply that the relationships between all input features and thermal conductivity are purely linear. The influence of slag content is likely coupled with other features such as air voids, binder content, and gradation, which may exhibit nonlinear interactions that a simple high-order polynomial regression would fail to capture. Models such as gradient boosting, which can account for nonlinearities and interactions among multiple features, were therefore adopted. A baseline comparison with polynomial regression models shows that gradient boosting provides better accuracy and generalization for this multivariate, nonlinear problem. Meanwhile, the intrinsic thermal conductivity of steel slag also shows a positive correlation with that of the mixture, though its degree of correlation is significantly weaker than that of steel slag content. This result indicates that steel slag content exerts a more direct influence on thermal conductivity within typical engineering mix design ranges. Its effect is stronger than that of individual material property parameters. In contrast, the absolute Pearson correlation coefficients of volumetric and structural parameters are all below 0.1. These parameters include asphalt content, air void content, VMA, and VFA. This indicates generally weak linear relationships with mixture thermal conductivity. These parameters are therefore not dominant controlling factors in a linear sense. However, they may influence thermal performance indirectly through coupling with other material and structural variables. The results of Kendall’s correlation analysis further support the above trends.
The Kendall coefficient between steel slag content and mixture thermal conductivity remains relatively high. This indicates a clear monotonic relationship in addition to the observed linear association. The rank variations in thermal conductivity therefore show consistent dependence on steel slag content. In contrast, the Kendall coefficients of most other input parameters fall within the range of −0.15 to 0.10. This suggests that their univariate rank variations have limited influence on thermal conductivity. Furthermore, the coarse aggregate ratio and fine aggregate ratio exhibit a strong negative correlation close to −1 in both correlation analyses. This result is consistent with engineering principles, as these two parameters are mutually constrained in gradation design. It also further confirms the engineering rationality and internal consistency of the dataset.
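The difference between the two coefficients can be shown with a minimal sketch. Pearson's coefficient measures linear association, while Kendall's tau measures monotonic rank agreement. The sketch uses the tau-a variant, which ignores ties; the source does not say which variant was applied. All numeric values below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs divided by all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Coarse and fine aggregate ratios that sum to 100%: perfectly anti-correlated.
coarse = [50, 53, 56, 59, 62]
fine = [100 - c for c in coarse]

# Hypothetical monotonic but curved slag-content vs. conductivity trend.
slag = [0, 10, 20, 30, 40]
k_mix = [1.30, 1.33, 1.40, 1.52, 1.70]

print(pearson(coarse, fine), kendall_tau_a(coarse, fine))
print(pearson(slag, k_mix), kendall_tau_a(slag, k_mix))
```

The complementary gradation ratios give a value of −1 under both measures, consistent with the mutual constraint noted above. The curved trend reaches a Kendall value of 1 while its Pearson value stays below 1, which is why the two coefficients are read together.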
In summary, the correlation analyses indicate that thermal conductivity of steel slag asphalt mixtures is mainly governed by steel slag content. In contrast, the effects of volumetric–structural parameters and steel slag physical properties are not strongly linear. Their influences are weakly correlated or coupled with multiple factors. Most input variables therefore do not exhibit clear linear or monotonic relationships with thermal conductivity. Correlation analysis alone is insufficient to fully describe the underlying mechanisms. This observation further demonstrates the necessity of introducing machine learning methods. Such methods are required to capture the complex nonlinear mapping between multidimensional input parameters and thermal conductivity.

3. Machine Learning Methodology

3.1. Selected Machine Learning Algorithms

To evaluate the applicability of different machine learning models for predicting the thermal conductivity of steel slag asphalt mixtures, five representative regression algorithms were selected. These algorithms cover distance-based methods, single decision tree models, ensemble learning approaches, and kernel-based models. The inclusion of models with different learning mechanisms allows a comprehensive comparison of their ability to capture complex nonlinear relationships between mixture parameters and thermal conductivity. The selected models include k-nearest neighbors regression (KNN), decision tree (DT), random forest (RF), support vector regression (SVR), and gradient boosting (GB).
1. K-nearest neighbors regression
K-nearest neighbors (KNN) regression is a typical instance-based and nonparametric learning method. It predicts target values by measuring the distance between a query sample and training samples in the feature space. The K nearest neighbors are selected. The prediction is obtained as a weighted average of their output values. KNN does not require an explicit model training process. It can directly reflect local distribution characteristics of the data. In this study, KNN is used as a baseline model. It is adopted to evaluate the applicability of a simple local-similarity-based approach for predicting the thermal conductivity of steel slag asphalt mixtures. By comparing its performance with that of more complex models, the contribution and limitations of local neighborhood information in thermal conductivity prediction can be systematically assessed.
2. Decision tree
Decision tree is a supervised learning model based on rule-based partitioning. It recursively divides the input feature space. Samples are mapped to different leaf nodes to perform regression prediction. The structure of a decision tree is intuitive. It provides good interpretability. To some extent, it can reveal the influence paths of input variables on the output. However, a single decision tree is sensitive to the training data. It is also prone to overfitting. In this study, the decision tree model is mainly used as a reference model. It serves as a baseline for comparison with ensemble methods such as random forest and gradient boosting. This comparison helps to evaluate the advantages of ensemble learning strategies in improving prediction stability and accuracy.
3. Random forest
As illustrated in Figure 4, random forest is an ensemble learning model based on the bagging strategy. It consists of multiple decision trees. During model construction, bootstrap sampling is applied to the original dataset to generate diverse training subsets. At each node split, a random subset of features is selected. This process enhances model diversity and reduces overfitting. The final prediction is obtained by averaging the outputs of all decision trees. The thermal conductivity of steel slag asphalt mixtures is influenced by multiple parameters. These parameters exhibit strong nonlinear coupling relationships. Random forest is therefore well suited for this problem. It demonstrates high stability and strong generalization capability when handling complex nonlinear relationships and noisy data. In this study, random forest is selected as a key comparative model. It is used to evaluate the effectiveness of the bagging ensemble strategy in predicting the thermal performance of steel slag asphalt mixtures.
4.
Support Vector Regression
As illustrated in Figure 5, support vector regression (SVR) is a kernel-based model grounded in statistical learning theory. It maps input variables into a high-dimensional feature space using kernel functions. An optimal regression function is then constructed in this space. This function balances prediction accuracy and model complexity. SVR shows clear advantages in handling nonlinear relationships. It is also effective for datasets with limited sample sizes. The thermal conductivity database of steel slag asphalt mixtures is mainly derived from experimental studies. It is characterized by a relatively small number of samples and high-dimensional input parameters. Therefore, SVR is selected as a representative kernel-based model in this study. Its performance in predicting thermal conductivity is evaluated and compared with tree-based and ensemble learning models.
5.
Gradient Boosting
As shown in Figure 6, gradient boosting is an ensemble learning method based on the boosting principle. It improves overall prediction performance by sequentially fitting the residuals of previous models. Unlike bagging-based approaches, gradient boosting emphasizes the sequential dependency among base learners. This strategy enables the model to progressively capture complex nonlinear relationships between input and output variables. In this study, gradient boosting is adopted as a representative boosting-based ensemble method. It is used to evaluate its capability in predicting the thermal conductivity of steel slag asphalt mixtures. By comparing its performance with that of random forest and other models, the differences among various ensemble learning strategies can be further examined for this specific prediction task.
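The residual-fitting idea behind gradient boosting can be sketched in a few lines. The example below is a minimal illustration for squared-error loss, using one-split regression stumps on a single feature. It is not the implementation used in this study: the function names, the toy slag-content and conductivity values, and the settings `n_rounds` and `learning_rate` are all assumptions chosen for illustration.

```python
from statistics import mean

def fit_stump(x, residuals):
    """Find the single threshold split on x that minimizes squared error."""
    best = None
    levels = sorted(set(x))
    for low, high in zip(levels, levels[1:]):
        threshold = (low + high) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.05):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, learning_rate, stumps

def predict(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left_mean if xi <= threshold else right_mean)
        for threshold, left_mean, right_mean in stumps
    )

def mse(y_true, y_pred):
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Hypothetical toy data: steel slag content (%) and thermal conductivity (W/(m·K)).
slag_content = [0, 10, 20, 30, 40, 50]
conductivity = [1.32, 1.38, 1.45, 1.49, 1.58, 1.66]

model = fit_gradient_boosting(slag_content, conductivity)
baseline_mse = mse(conductivity, [mean(conductivity)] * len(conductivity))
boosted_mse = mse(conductivity, [predict(model, xi) for xi in slag_content])
```

Each round corrects only a fraction of the remaining error, which is why the learning rate and the number of base learners are tuned together.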

3.2. Model Training and Validation

To objectively evaluate the performance of different machine learning models in predicting the thermal conductivity of steel slag asphalt mixtures, a unified training and validation strategy was adopted. This strategy was applied to all models. It was designed to avoid dependence of the evaluation results on a specific data partition. Because the sample size of the thermal conductivity database is relatively limited and the data were collected from different sources, a K-fold cross-validation approach was employed to assess model performance [26,27]. The dataset was randomly shuffled before splitting, with a fixed random seed to ensure consistent data partitioning. Stratification was not applied since the dataset does not involve imbalanced classes. All models were evaluated using the same K-fold splits to ensure a fair, paired comparison across algorithms. As illustrated in Figure 7, the entire dataset was randomly divided into five subsets, of which four were used for model training and the remaining one was used for validation. This procedure was repeated five times so that each subset served as the validation set once. The final model performance was determined by averaging the results obtained from the five validation runs, which allows efficient utilization of the limited data while improving the stability and reliability of the performance evaluation [28,29]. All continuous input variables were standardized before model training using the z-score standardization method. This standardization was applied consistently across all models to ensure a fair comparison and to avoid scale-related performance issues, especially for the distance-based KNN and kernel-based SVR models.
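The partitioning and standardization steps can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study. The seed value 42 is a placeholder, and computing the z-score statistics on the training folds only is an assumption; the text does not state which samples the statistics were computed from.

```python
import random
from statistics import mean, pstdev

def kfold_splits(n_samples, k=5, seed=42):
    """Shuffle indices once with a fixed seed, then build k train/validation splits."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((training, validation))
    return splits

def zscore(values, reference):
    """Standardize values with the mean and standard deviation of reference."""
    mu, sigma = mean(reference), pstdev(reference)
    return [(v - mu) / sigma for v in values]

# 200 samples, five folds: 160 training and 40 validation samples per split.
splits = kfold_splits(200, k=5, seed=42)

# Hypothetical feature column, standardized with training-fold statistics.
feature = [float(i) for i in range(200)]
training, validation = splits[0]
train_values = [feature[j] for j in training]
scaled_train = zscore(train_values, train_values)
scaled_validation = zscore([feature[j] for j in validation], train_values)
```

Reusing the same `splits` list for every model gives the paired comparison described above.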
During model training, systematic hyperparameter optimization was conducted for each machine learning model to account for their different sensitivities to parameter settings [30,31]. Within predefined parameter search ranges, the optimal hyperparameter combinations that minimized validation errors were selected for final model construction based on cross-validation results. Specifically, the hyperparameters of each model were optimized. For the k-nearest neighbors regression model, the number of neighbors and the distance weighting scheme were tuned. For the decision tree model, the maximum tree depth and the minimum number of samples required for node splitting were adjusted. For the random forest model, the number of decision trees and the size of the feature subset were optimized. For the support vector regression model, the kernel function and its associated parameters were carefully selected. For the gradient boosting model, the learning rate, the number of base learners, and the tree structure parameters were optimized. All models were trained and optimized using the same dataset, input variables, and validation strategy to ensure a fair and consistent comparison of predictive performance.
To comprehensively assess the predictive capability of each model, the coefficient of determination (R²), root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) were adopted as evaluation metrics, as summarized in Table 3. Among these metrics, R² was used to quantify the ability of the models to capture the variation trend of thermal conductivity, RMSE and MAE were employed to measure the magnitude of prediction errors between the predicted and experimental values, and MAPE expressed this error as a percentage of the experimental value. Within the cross-validation framework, all evaluation metrics were reported as the average values obtained from the five validation folds. Through this unified training, validation, and evaluation procedure, differences among the machine learning models can be systematically compared. These differences include prediction accuracy, stability, and nonlinear modeling capability. The results provide a reliable basis for subsequent analysis and discussion.
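For reference, the four metrics can be computed as in the sketch below, which uses their standard textbook definitions. The function name and the three sample values are hypothetical and serve only to show the arithmetic.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R², RMSE, MAE and MAPE (in percent) for paired values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "mape": 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / n,
    }

# Hypothetical measured and predicted thermal conductivities in W/(m·K).
measured = [1.40, 1.50, 1.60]
predicted = [1.42, 1.50, 1.57]
metrics = regression_metrics(measured, predicted)
```

In a five-fold setting, this function would be applied to each validation fold and the five results averaged.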

3.3. Methodological Framework

The research framework of this study is illustrated in Figure 8. The overall workflow consists of four main stages. These stages include dataset construction, dataset preprocessing, machine learning model training and validation, and thermal conductivity prediction with model performance evaluation. This framework is designed to predict the thermal conductivity of steel slag asphalt mixtures. It also enables a systematic assessment of the performance of the developed machine learning models. The key hyperparameters of the machine learning models were optimized to ensure robust predictive performance. k-Nearest Neighbor models used 4–6 neighbors with uniform weighting and Euclidean distance. Decision Tree models were configured with a maximum depth of 8–12 and a minimum of 2–4 samples per leaf. Random Forest models employed 150–250 trees with a maximum depth of 12–16 and a minimum of 2 samples per leaf. Gradient Boosting models used 150–250 trees, maximum depth of 10–14, and a learning rate of 0.03–0.07. Support Vector Regression models employed an RBF kernel with C approximately 90–110 and epsilon around 0.01. All hyperparameters were determined via grid search combined with cross-validation to achieve optimal model performance.
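The grid search step can be sketched as an exhaustive loop over parameter combinations. The grid below takes the reported ranges for gradient boosting as its bounds, but the specific grid points are assumptions. The scoring function `toy_cv_error` is a placeholder for a real routine that would train the model and return its mean cross-validation error.

```python
from itertools import product

def grid_search(param_grid, cv_error):
    """Evaluate every parameter combination and keep the one with the lowest error."""
    names = list(param_grid)
    best_params, best_error = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        error = cv_error(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

# Bounds follow the reported ranges; the intermediate grid points are assumed.
gb_grid = {
    "n_estimators": [150, 200, 250],
    "max_depth": [10, 12, 14],
    "learning_rate": [0.03, 0.05, 0.07],
}

def toy_cv_error(params):
    """Placeholder score; a real version would run five-fold cross-validation."""
    return (abs(params["n_estimators"] - 200) / 1000
            + abs(params["max_depth"] - 12) / 100
            + abs(params["learning_rate"] - 0.05))

best_params, best_error = grid_search(gb_grid, toy_cv_error)
```

With three values per parameter, this grid evaluates 27 combinations, each of which would require five model fits under five-fold cross-validation.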

4. Results and Discussion

4.1. Performance Comparison of Machine Learning Models

As shown in Figure 9a,b, the predicted and measured thermal conductivity values obtained from the five machine learning models all exhibit clear positive correlations. This indicates that each model is able to learn the relationship between mixture composition, volumetric–structural parameters, steel slag properties, and thermal conductivity. However, evident differences are observed among the models in prediction accuracy, fitting consistency, and sensitivity across different thermal conductivity ranges [14].
For the KNN model, most predicted values are distributed near the ideal diagonal. The fitted relationship is y = 0.82x + 0.2708. This suggests that KNN can capture the overall variation trend of thermal conductivity. However, the fitted slope is lower than the ideal value of 1. The model tends to underestimate thermal conductivity at higher values. This behavior is mainly caused by the local averaging mechanism of KNN. Sparse samples in the high-conductivity range reduce the model’s sensitivity to extreme values. As a result, KNN is more suitable for trend identification than for precise nonlinear prediction. The decision tree model shows improved performance compared with KNN. The fitted equation is y = 0.86x + 0.1992. Predicted points are more concentrated, especially in the medium conductivity range. This indicates that decision trees can capture nonlinear influence paths through recursive feature partitioning [32,33]. Nevertheless, some dispersion remains in local regions. This reflects the sensitivity of a single tree model to sample distribution. Among all models, the random forest model exhibits balanced predictive behavior. The fitted equation is y = 0.85x + 0.2264. The predicted values are closely distributed around the diagonal across the entire range. No obvious systematic bias is observed. By integrating multiple decision trees through bagging, random forest effectively reduces model instability. It also enhances generalization capability. The model maintains good accuracy in the medium-to-high conductivity range, indicating strong robustness [34].
In contrast, the support vector regression model shows relatively weaker performance. The fitted slope is 0.68, and the intercept increases to 0.5031. This indicates overestimation at low values and underestimation at high values. The response to conductivity variation is insufficient. This behavior may be attributed to kernel parameter settings and sample distribution in the feature space. When strong nonlinear coupling exists, a single kernel function may struggle to fully represent the complex mapping. The gradient boosting model shows the best fitting consistency among all models. The fitted equation is y = 0.87x + 0.1958. Predicted points closely follow the ideal diagonal across the full range. High accuracy is particularly maintained in the high-conductivity region. By sequentially fitting residuals, gradient boosting progressively corrects prediction errors. This enables more effective capture of complex nonlinear relationships. The model achieves a favorable balance between accuracy and stability.
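The fitted lines reported for the five models can be used to locate where each model switches from overestimation to underestimation. For a fit y = ax + b with a < 1, the line crosses the ideal diagonal y = x at x = b / (1 − a). The sketch below applies this to the reported slopes and intercepts; the evaluation points 1.3 and 1.7 W/(m·K) are illustrative choices taken from the range in which most samples fall.

```python
# Slopes and intercepts of the fitted lines reported for Figure 9.
fits = {
    "KNN": (0.82, 0.2708),
    "DT": (0.86, 0.1992),
    "RF": (0.85, 0.2264),
    "SVR": (0.68, 0.5031),
    "GB": (0.87, 0.1958),
}

def crossover(slope, intercept):
    """Measured value at which the fitted line meets the diagonal y = x."""
    return intercept / (1.0 - slope)

def bias(slope, intercept, measured):
    """Fitted prediction minus measured value (positive means overestimation)."""
    return slope * measured + intercept - measured

for name, (slope, intercept) in fits.items():
    print(f"{name}: crossover at {crossover(slope, intercept):.3f} W/(m·K), "
          f"bias at 1.3 = {bias(slope, intercept, 1.3):+.4f}, "
          f"bias at 1.7 = {bias(slope, intercept, 1.7):+.4f}")
```

The crossover values lie between about 1.42 and 1.57 W/(m·K), so every fitted line overestimates below that point and underestimates above it. SVR shows the largest bias at 1.3 W/(m·K) (about +0.087), in line with its lowest slope.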
All five models are capable of predicting the thermal conductivity of steel slag asphalt mixtures. However, ensemble learning models outperform single and kernel-based models. Random forest and gradient boosting show superior accuracy, stability, and consistency across the full value range [35]. Among them, gradient boosting demonstrates the best overall performance under the current dataset conditions. This highlights the strong suitability of ensemble learning methods for modeling the multi-parameter nonlinear thermal behavior of steel slag asphalt mixtures.
As shown in Figure 10a,b, the prediction results of the five machine learning models generally follow the variation trend of the thermal conductivity dataset under both training and testing conditions. The predicted curves closely track the fluctuations of the measured values. This indicates that all models exhibit a certain level of fitting capability [35,36]. The green dashed line separates the training and testing phases. Within the testing interval, no abrupt deviations or abnormal fluctuations are observed in the predicted curves. This suggests that none of the models suffer from severe overfitting and that acceptable generalization performance is achieved. In terms of magnitude, thermal conductivity values are mainly distributed between 1.3 and 1.7 W/(m·K). All models show relatively stable predictive behavior within this range.
Further examination of the absolute error distributions reveals differences among the models. The KNN and decision tree models show more dispersed error distributions. Relatively large errors occur at some local sample points. This reflects the limitations of local-neighborhood-based methods and single-tree partitioning in capturing complex nonlinear relationships [36]. In contrast, the random forest model exhibits a more concentrated error distribution. Error fluctuations are smaller overall. This indicates that integrating multiple decision trees effectively reduces prediction instability. The support vector regression model maintains low errors for most samples. However, noticeable deviations appear at a few points. This behavior is consistent with its sensitivity to kernel parameter settings and sample distribution.
Among all models, gradient boosting demonstrates the most stable predictive performance in both training and testing stages. Its predicted curve shows the highest consistency with the measured values. Absolute errors are generally low and tightly clustered. This indicates that sequential residual fitting enables continuous error correction. As a result, gradient boosting more effectively captures the combined effects of steel slag content, volumetric–structural parameters, and steel slag physical properties on thermal conductivity. Considering both curve consistency and error distribution, ensemble learning models exhibit superior stability and reliability. Gradient boosting performs best under the current dataset conditions. It is therefore well suited as the preferred model for subsequent engineering applications and parameter analysis. A comparison with simple linear regression–based empirical models was conducted to evaluate the added value of ML. The empirical models showed lower prediction accuracy and higher errors, particularly in capturing nonlinear relationships among mixture composition parameters. In contrast, ML models automatically learn complex patterns from data, offering improved flexibility, robustness, and predictive performance for steel slag asphalt mixtures.
As shown in Figure 11, all machine learning models exhibit good fitting performance on both the training and test sets in terms of the coefficient of determination (R²). However, clear performance differences exist among the models. Random Forest (RF) and Gradient Boosting (GB) both achieve R² values above 0.85 on the training set. On the test set, their R² values remain high, ranging from 0.83 to 0.85. These values are significantly higher than those of the KNN, DT, and SVR models. This indicates that ensemble learning models are more effective in capturing the complex nonlinear relationships between thermal conductivity and multidimensional input parameters. In contrast, the SVR model shows a relatively lower R² on the test set. This reflects certain limitations in its generalization capability under the current data scale and feature space [37,38].
In terms of error metrics, Random Forest achieves the lowest RMSE and MAE on both datasets. The RMSE is approximately 0.03 W/(m·K), and the MAE is about 0.025 W/(m·K). The error distribution is highly concentrated, indicating strong prediction stability. The Gradient Boosting model shows error levels close to those of Random Forest. Its RMSE and MAE on the test set are only slightly higher than those of Random Forest. However, they are markedly lower than those of the KNN, DT, and SVR models. The KNN and DT models exhibit larger errors, especially on the test set. This suggests higher sensitivity to data distribution and lower predictive stability.
Further evaluation using the Mean Absolute Percentage Error (MAPE) shows consistent trends. Random Forest achieves the lowest test-set MAPE at approximately 1.8%. Gradient Boosting follows closely at about 1.9%. The SVR model records the highest MAPE, exceeding 2.4%. Considering all evaluation metrics together, ensemble learning models demonstrate the best overall performance. They show superior accuracy, stability, and generalization capability. Random Forest exhibits a slight overall advantage [38,39]. Therefore, Random Forest and Gradient Boosting are well suited as core models for subsequent feature importance analysis and engineering application studies.

4.2. Quantitative Effects of Mixture Parameters on Thermal Conductivity

As shown in Figure 12, the five machine learning models show minor differences in variable ranking. However, a high level of overall consistency is observed. Steel slag content exhibits the highest importance in all models, including KNN, DT, RF, SVR, and GB. Its feature weight is markedly higher than those of the other variables. In tree-based models, especially RF and GB, steel slag content plays a dominant role and accounts for a substantially larger proportion of the total importance. These results clearly identify steel slag content as the primary controlling factor governing the thermal conductivity of steel slag asphalt mixtures [39,40]. From a physical perspective, steel slag has relatively high density and thermal conductivity. Increasing its content enhances the continuity of the high-conductivity mineral skeleton. This promotes the formation of effective heat transfer pathways and directly improves thermal conductivity. Voids filled with asphalt (VFA) and air voids generally form the second tier of influential variables. Their rankings are relatively stable across the RF, GB, and SVR models. The importance of VFA indicates that the degree of asphalt filling within aggregate voids significantly affects thermal conduction. Higher VFA reduces the proportion of air, which is a low-conductivity phase, and thus lowers thermal resistance. The importance of air voids further confirms the role of internal pore structure. Increased air void content disrupts heat transfer continuity and reduces thermal conductivity. These results demonstrate that thermal behavior is governed by the coupled effects of aggregate skeleton structure, pore system, and asphalt mastic.
Variables with moderate importance, such as coarse and fine aggregate ratios, show relatively stable contributions across models. This suggests that gradation influences thermal conductivity mainly in an indirect manner. Aggregate proportions affect the formation of a stable mineral skeleton. Their influence is primarily realized through regulation of volumetric parameters, such as air voids and asphalt distribution. As a result, their importance is lower than that of steel slag content and key volumetric indicators. In contrast, steel slag material properties, including density, water absorption, and intrinsic thermal conductivity, show consistently lower importance [41,42]. This indicates that when these properties vary within a limited range, their influence is mainly reflected through steel slag content rather than through minor parameter variations.
Further comparison shows that tree-based models highlight the dominance of steel slag content more clearly. In the RF and GB models, its importance is significantly higher than that of the second-ranked variable. This reflects the strong capability of ensemble learning models in identifying dominant controlling factors. In contrast, KNN and SVR produce more uniform importance distributions. Except for steel slag content, other variables show smaller differences in contribution. This suggests that distance-based and kernel-based models emphasize the combined effects of multiple variables. These differences reflect the distinct mechanisms through which different algorithms interpret thermal conduction behavior [14,43]. The feature importance analysis confirms the dominant role of steel slag content from a data-driven perspective. It also clarifies the auxiliary contributions of volumetric parameters and gradation characteristics. These findings provide a clear theoretical basis and practical guidance for the targeted design and optimization of steel slag asphalt mixtures with enhanced thermal conductivity [33].
As shown in Figure 13, the feature importance distributions obtained from different machine learning models exhibit a highly consistent overall pattern. At the same time, certain model-specific differences in weight allocation can be observed. Steel slag content shows the highest importance in all models. Its weight ranges from approximately 0.37 to 0.44. This value is significantly higher than those of all other input variables. These results clearly indicate the dominant role of steel slag content in thermal conductivity prediction. For the KNN and SVR models, the importance of steel slag content reaches about 0.42 and 0.44, respectively. This reflects the high sensitivity of these models to variations in steel slag content. From a data-driven perspective, this finding further confirms that steel slag, as a high-thermal-conductivity aggregate, directly controls the formation of internal heat conduction pathways in the mixture. It therefore acts as the primary factor governing thermal conductivity [14,33].
Among the secondary influencing factors, voids filled with asphalt (VFA) and air void content consistently rank in the second tier across all five models. Their importance values are generally distributed between 0.07 and 0.10. Compared with other structural parameters, these two variables show relatively stable rankings among different models. This indicates that volumetric–structural characteristics play a non-negligible role in thermal performance. VFA represents the degree of asphalt filling within the aggregate skeleton. Air void content directly reflects the proportion of the air phase in the mixture. Since air has a much lower thermal conductivity than mineral aggregates and steel slag, higher air void content disrupts the continuity of heat transfer paths. This leads to a reduction in overall thermal conductivity [33,44]. The moderate-to-high importance assigned to air voids is therefore physically reasonable.
The remaining variables, including fine aggregate ratio, coarse aggregate ratio, water absorption of steel slag, density of steel slag, intrinsic thermal conductivity of steel slag, and asphalt content, generally exhibit importance values concentrated between 0.055 and 0.070. Only minor variations are observed among these parameters. This suggests that, when steel slag type and material properties remain relatively stable, these variables mainly exert indirect or synergistic effects. They do not serve as dominant controlling factors. For example, the intrinsic thermal conductivity and density of steel slag are consistently ranked lower. This implies that their influence on mixture thermal conductivity is primarily expressed through steel slag content rather than through small variations in individual material properties [33,34,37].
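As a quick consistency check, the reported importance ranges can be summed across the nine input variables. The sketch below assumes that the importances of each model are normalized to sum to one, which is not stated explicitly in the text, and tests whether the reported ranges are compatible with that normalization.

```python
# Reported importance ranges (lower, upper) for the nine input variables.
importance_ranges = {
    "steel slag content": (0.37, 0.44),
    "voids filled with asphalt": (0.07, 0.10),
    "air voids": (0.07, 0.10),
    "fine aggregate ratio": (0.055, 0.070),
    "coarse aggregate ratio": (0.055, 0.070),
    "water absorption of steel slag": (0.055, 0.070),
    "density of steel slag": (0.055, 0.070),
    "intrinsic thermal conductivity of steel slag": (0.055, 0.070),
    "asphalt content": (0.055, 0.070),
}

lower_total = sum(low for low, _ in importance_ranges.values())
upper_total = sum(high for _, high in importance_ranges.values())

# Assumption: importances are normalized so that each model's values sum to 1.
compatible_with_unit_sum = lower_total <= 1.0 <= upper_total
```

The totals are 0.84 and 1.06, so a unit sum falls inside the reported ranges. Under this assumption, steel slag content alone carries roughly 40% of the total importance.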
Clear differences can also be observed among the modeling approaches. Tree-based and ensemble models, including DT, RF, and GB, show more concentrated importance distributions. Steel slag content is assigned a significantly higher weight than other variables. This reflects the strong capability of these models to identify dominant controlling factors. In contrast, KNN and SVR produce smoother importance distributions. These models emphasize the combined effects of multiple variables. Despite these differences, the consistent ranking of key variables across all models reinforces the robustness and physical plausibility of the feature importance analysis [33,34].
In summary, the feature importance heatmap clearly demonstrates that steel slag content is the decisive factor influencing thermal conductivity. Volumetric–structural parameters play a secondary regulatory role. Material properties and gradation-related parameters mainly act as auxiliary factors. These conclusions provide strong data-driven evidence and clear engineering insight.

4.3. Economic and Environmental Benefit Analysis Based on Thermal Conductivity

4.3.1. Economic Benefit Analysis

To convert thermal performance into an engineering-relevant assessment of economic benefits, this section uses the mixture thermal conductivity k’mix as an indicator of heat-transfer capability. An Economic Efficiency Index is then defined. The thermal conductivity k’mix is mainly determined by the steel slag content and the volumetric structure parameters. It reflects the differences in thermal conductivity among mixture designs. The economic benefit comes from two aspects. First, higher thermal conductivity can reduce the operating energy required to achieve the same pavement surface thermal effect. It can therefore yield operating cost savings. Second, variations in asphalt content among mixture designs change the material cost. This change affects the overall economic performance.
(1)
Regression equation for thermal conductivity
$$k'_{\mathrm{mix}} = 1.44 + 0.005365\,C_{SS} - 0.03772\,V_{v} + 0.001581\,R_{c}$$
Here, CSS denotes the steel slag content (%), Vv denotes the air voids (%), and Rc denotes the coarse aggregate ratio (%). The unit of k’mix is W/(m·K). This regression equation captures the main influence trends on thermal conductivity. Increasing CSS and Rc generally improves thermal conductivity. In contrast, increasing Vv reduces thermal conductivity.
(2)
Economic efficiency index equation
$$\text{Economic efficiency index} = \frac{1}{0.3681 \times \dfrac{k_{\min}}{k'_{\mathrm{mix}}} + 0.6319 \times \dfrac{C_{AC}}{C_{AC\min}}}$$
Here, CAC is the asphalt content (%). The reference value kmin is the minimum predicted thermal conductivity among the 200 samples, and CACmin is the minimum asphalt content among the 200 samples. The weights 0.3681 and 0.6319 were derived objectively from the 200 samples using the entropy weight method, which assigns weights based on the variability of each input indicator. When thermal conductivity is improved and the asphalt content is properly controlled, the economic efficiency index increases, which indicates better overall economic performance. The method and input indicators used in the weighting process will be described to ensure the reproducibility of the economic index calculation.
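The two equations and the weighting step can be sketched in code under stated assumptions. The regression is read with a negative air-void term, as the stated trends imply. The index is read as the reciprocal of the weighted sum of kmin/k’mix and CAC/CACmin, which is an interpretation of the printed formula. The entropy weight routine is one common variant of the method, and the indicator values passed to it are hypothetical; the study's actual indicators are not listed in this section.

```python
import math

def thermal_conductivity(c_ss, v_v, r_c):
    """Regression estimate of k'mix in W/(m·K); all inputs are percentages."""
    return 1.44 + 0.005365 * c_ss - 0.03772 * v_v + 0.001581 * r_c

def economic_efficiency_index(k_mix, c_ac, k_min, c_ac_min,
                              w_k=0.3681, w_c=0.6319):
    """Assumed reading: reciprocal of the weighted sum of the two ratios."""
    return 1.0 / (w_k * k_min / k_mix + w_c * c_ac / c_ac_min)

def entropy_weights(indicators):
    """Entropy weight method: more variable indicators receive larger weights."""
    n = len(next(iter(indicators.values())))
    divergence = {}
    for name, values in indicators.items():
        total = sum(values)
        shares = [v / total for v in values]
        entropy = -sum(s * math.log(s) for s in shares if s > 0) / math.log(n)
        divergence[name] = 1.0 - entropy
    scale = sum(divergence.values())
    return {name: d / scale for name, d in divergence.items()}

# Hypothetical mixture: 30% steel slag, 4% air voids, 60% coarse aggregate.
k_example = thermal_conductivity(30, 4, 60)

# Hypothetical reference values and indicator samples.
index_at_reference = economic_efficiency_index(1.40, 4.0, k_min=1.40, c_ac_min=4.0)
weights = entropy_weights({
    "conductivity ratio": [1.00, 0.95, 0.90, 0.85],
    "asphalt ratio": [1.00, 1.10, 1.20, 1.40],
})
```

Under this reading, a mixture at both reference values has an index of 1. Raising k’mix increases the index and raising CAC decreases it, which matches the trends described for Figure 14.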
Figure 14 shows the distribution characteristics of the economic benefit index in the dimensions of thermal performance and asphalt content. Figure 14a presents a two-dimensional kernel density map of the economic benefit index versus the thermal conductivity of steel slag asphalt mixtures. The color changes from dark to light as the sample density increases. The economic benefit index generally increases with increasing k’mix. This result indicates that higher thermal conductivity is usually associated with higher economic benefits. This trend is consistent with the logic that improved heat transfer reduces energy demand and lowers operating cost. The density peak also shows that most mixtures are concentrated in a moderate k’mix range. The economic benefit index is relatively clustered in this range. Figure 14b shows the two-dimensional density distribution of the economic benefit index versus asphalt content. The economic benefit index decreases as CAC increases. This result suggests that higher asphalt content is unfavorable for economic benefits. The oblique band in the density map indicates that samples tend to cluster in the region with lower CAC and higher economic benefits. Figure 14c further illustrates the coupled relationship among k’mix, CAC, and the economic benefit index using a three-dimensional scatter plot. High-value points mainly appear at higher k’mix and lower CAC. This pattern reveals a typical trade-off between operating gains from improved thermal conductivity and material cost increases caused by higher asphalt content. These results provide support for mixture design optimization and multi-objective coordinated design.

4.3.2. Environmental Benefit Analysis

From an environmental perspective, steel slag can be used as a thermally conductive filler. This use increases solid-waste recycling. It also reduces land occupation from long-term stockpiling. It can lower dust and runoff pollution around stockpiles.
Steel slag contains several metallic elements. Its environmental compatibility in pavements must be checked. This check is more important for snow-melting and deicing pavements. In these pavements, rain and meltwater wash the surface more often. Water moves more frequently between the surface and inner layers. This process can affect pollutant release and transport [45,46,47,48].
Steel-slag-based thermally conductive asphalt concrete can be environmentally feasible. This is true when material stability and construction quality are ensured. The benefits are not only waste reduction and land saving. Better heat transfer can reduce energy demand during snow melting. Lower energy demand can reduce carbon emissions. Steel slag can also replace part of natural aggregates. This replacement reduces resource extraction and transportation impacts. Overall, combined economic and environmental results support engineering application.

4.4. Engineering Implications

From an engineering application perspective, this study focuses on the prediction of thermal conductivity in steel slag asphalt mixtures. It clarifies the quantitative relationships between mixture composition parameters and thermal performance. The results provide practical guidance for the design of thermally conductive and functional asphalt pavement materials.
First, the results indicate that steel slag content, void structure, and compaction-related parameters exert a pronounced influence on thermal conductivity. This provides clear guidance for the design of thermally conductive layers in electrically heated snow-melting pavements. In engineering practice, an appropriate increase in steel slag replacement ratio is recommended. Gradation design should also be optimized to reduce air void content. These measures promote the formation of continuous heat conduction pathways. They enhance heat transfer efficiency within the pavement structure. As a result, snow-melting time can be shortened and energy consumption can be reduced.
Second, the machine-learning-based thermal conductivity prediction method offers an efficient alternative to traditional material design approaches. Conventional methods often rely on extensive laboratory testing. In practical applications, predictive models can be used to rapidly evaluate different material combinations while meeting performance requirements. This reduces experimental workload and shortens development cycles. It also improves the design efficiency of thermally conductive asphalt mixtures. The proposed data-driven framework is particularly valuable for functional pavement materials with complex parameter interactions and high testing costs.
In addition, the incorporation of industrial by-products such as steel slag enhances pavement functionality while promoting resource utilization. By reasonably controlling steel slag content and mixture structural parameters, thermal conductivity can be improved without compromising engineering applicability. Environmental performance can also be maintained. This supports the development of green road engineering and sustainable transportation infrastructure.
The generalization capability of the proposed models is influenced by both dataset size and data heterogeneity. The relatively limited dataset requires careful feature selection, hyperparameter tuning, and cross-validation to prevent overfitting and maintain reliable performance. Meanwhile, the material-specific nature of the dataset reduces excessive heterogeneity, which enhances model stability within the target domain. However, this specificity also limits direct applicability to other asphalt mixture systems. Future work may address this limitation by expanding the dataset and including more diverse material compositions, thereby improving model robustness and broadening its applicability. Future studies should consider collecting additional features such as slag particle shape descriptors (shape factor, angularity, surface roughness), mineralogical indicators (phase composition proxies), binder viscosity/rheology parameters, and compaction/structure descriptors. Standardized thermal conductivity testing metadata would also reduce inter-study variability and improve model robustness.
The proposed thermal conductivity prediction framework provides a reliable basis for material selection and structural design in thermally conductive asphalt pavements and electrically heated snow-melting systems. It is of practical engineering value for improving winter road safety and energy efficiency.

5. Conclusions

This study developed a machine-learning-based framework to predict the thermal conductivity of steel slag asphalt mixtures. The contribution of this study is primarily associated with a material-specific research perspective rather than methodological complexity. By focusing on steel slag asphalt mixtures and adopting a material-oriented parameter system, this work provides a complementary viewpoint to existing generalized ML-based thermal conductivity prediction models. The main conclusions are as follows.
(1) The constructed thermal conductivity database exhibits good engineering rationality and statistical stability. Input variables fall within typical engineering ranges. Correlation analysis shows a strong linear relationship between thermal conductivity and steel slag content, with a Pearson correlation coefficient of 0.92. However, the other variables may introduce nonlinear effects, so the overall relationship is more complex than a simple linear correlation.
(2) All five machine learning models can predict thermal conductivity effectively. However, random forest and gradient boosting outperform KNN, decision tree, and support vector regression in terms of accuracy, stability, and generalization. Both ensemble models achieve test-set R² values above 0.83 with low RMSE and MAE.
(3) Feature importance analysis consistently identifies steel slag content as the dominant controlling factor. Volumetric parameters, such as air voids and voids filled with asphalt, play secondary roles, while gradation and steel slag material properties mainly exert indirect effects.
(4) From an engineering perspective, the proposed prediction approach provides an efficient supplement to traditional experimental methods. It supports rapid material selection and mixture optimization for thermally conductive and functional asphalt pavements, including electrically heated snow-melting systems.
In summary, this study demonstrates the effectiveness of machine learning for predicting the thermal conductivity of steel slag asphalt mixtures and supports the high-value utilization of steel slag in functional pavement applications. This study has several limitations, including the relatively small dataset, the material-specific scope, and the focus on thermal conductivity under particular mixture conditions. Future work will focus on expanding the dataset to cover broader material systems, incorporating additional physical descriptors, and exploring hybrid data-driven and physics-informed modeling strategies.
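The point in conclusion (1), that a high Pearson coefficient does not rule out nonlinearity, can be illustrated with a minimal sketch. The function `pearson_r` and all numerical values below are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical illustrative values, NOT measurements from the study.
slag_content = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
k_linear = [1.30 + 0.008 * c for c in slag_content]        # purely linear trend
k_curved = [1.30 + 0.00016 * c ** 2 for c in slag_content]  # purely quadratic trend

r_linear = pearson_r(slag_content, k_linear)
r_curved = pearson_r(slag_content, k_curved)
```

In this toy case the quadratic series still yields a coefficient of roughly 0.96, which is why a strong linear correlation alone does not justify a purely linear model and why nonlinear learners remain useful.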

Author Contributions

Conceptualization, J.Z. and W.S.; methodology, Z.L. and X.C.; formal analysis, Z.L.; investigation, J.M., X.L. and S.J.; data curation, Z.L. and X.C.; writing—original draft preparation, J.Z. and Z.L.; writing—review and editing, W.S. and J.Z.; visualization, Y.C.; supervision, W.S.; project administration, Z.L.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology R&D Program of Xinjiang Transportation Investment (Group) Co., Ltd. (ZKXFWCG-202211-011): Research and Application of Comprehensive Utilization Technology for Steel Slag in Xinjiang Pavements.

Data Availability Statement

The dataset used for this study, including all input variables, the output (thermal conductivity), and references to the original literature sources, is not yet publicly available due to ongoing collaboration with industry partners. However, the full dataset can be made available upon reasonable request, subject to compliance with data-sharing protocols agreed upon with our collaborators.

Conflicts of Interest

Jiangnan Zhao, Jie Mu, Xianxu Liu, and Shasha Jiang are employees of Xinjiang Jiaotou Construction Management Co., Ltd. The authors declare that this employment did not influence the research or its outcomes.

References

  1. Cao, Y.; Sha, A.; Liu, Z.; Zhang, F.; Li, J.; Liu, H. Thermal conductivity evaluation and road performance test of steel slag asphalt mixture. Sustainability 2022, 14, 7288. [Google Scholar] [CrossRef]
  2. Syu, J.Y.; Li, Y.W.; Li, Y.F.; Huang, C.H.; Chen, S.H.; Lee, W.H. A study on thermal performance for building shell of modified basic oxygen furnace slag replacing partial concrete aggregate. Buildings 2025, 16, 108. [Google Scholar] [CrossRef]
  3. Shu, Z.; Wu, J.; Li, S.; Zhang, B.; Yang, J. Road performance, thermal conductivity, and temperature distribution of steel slag rubber asphalt surface layer. J. Renew. Mater. 2021, 9, 365–380. [Google Scholar] [CrossRef]
  4. Benavides, D.; Rangel, R.L.; Franci, A.; Aponte, D. Effect of steel slag on compaction times of asphalt mixtures based on prediction of cooling curves. Constr. Build. Mater. 2024, 421, 135550. [Google Scholar] [CrossRef]
  5. Asi, I.M.; Qasrawi, H.Y.; Shalabi, F.I. Use of steel slag aggregate in asphalt concrete mixes. Can. J. Civ. Eng. 2007, 34, 902–911. [Google Scholar] [CrossRef]
  6. Chen, J.; Chen, X.; Dan, H.; Zhang, L. Combined prediction method for thermal conductivity of asphalt concrete based on meso-structure and renormalization technology. Appl. Sci. 2022, 12, 857. [Google Scholar] [CrossRef]
  7. Yu, S.; Chen, J.; Wang, H.; Qu, Y. Computational Renormalization for Thermal Conductivity of Porous Asphalt Concrete Based on Hybrid Finite Element-Neural Network Method. Constr. Build. Mater. 2024, 450, 138725. [Google Scholar] [CrossRef]
  8. Kebede, Y.B.; Yang, M.D.; Huang, C.W. Real-Time Pavement Temperature Prediction through Ensemble Machine Learning. Eng. Appl. Artif. Intell. 2024, 135, 108870. [Google Scholar] [CrossRef]
  9. Ahmedzade, P.; Sengoz, B. Evaluation of steel slag coarse aggregate in hot mix asphalt concrete. J. Hazard. Mater. 2009, 165, 300–305. [Google Scholar] [CrossRef]
  10. Li, T.; Qian, G.; Yu, H.; Lei, R.; Ge, J.; Dai, W.; Huang, L. Research on thermal conduction of steel slag-modified asphalt mixtures considering aggregate properties. Case Stud. Constr. Mater. 2025, 23, e05354. [Google Scholar] [CrossRef]
  11. Díaz-Piloneta, M.; Terrados-Cristos, M.; Álvarez-Cabal, J.V.; Vergara-González, E. Comprehensive analysis of steel slag as aggregate for road construction: Experimental testing and environmental impact assessment. Materials 2021, 14, 3587. [Google Scholar] [CrossRef]
  12. Hassan, K.E.; Attia, M.I.; Reid, M.; Al-Kuwari, M.B. Performance of steel slag aggregate in asphalt mixtures in a hot desert climate. Case Stud. Constr. Mater. 2021, 14, e00534. [Google Scholar] [CrossRef]
  13. Jiao, W.; Sha, A.; Liu, Z.; Jiang, W.; Hu, L.; Li, X. Utilization of steel slags to produce thermal conductive asphalt concretes for snow melting pavements. J. Clean. Prod. 2020, 261, 121197. [Google Scholar] [CrossRef]
  14. Sargam, Y.; Wang, K.; Cho, I.H. Machine learning based prediction model for thermal conductivity of concrete. J. Build. Eng. 2021, 34, 101956. [Google Scholar] [CrossRef]
  15. Zahoor, M.F.; Hussain, A.; Khattak, A. Machine learning-based prediction performance comparison of Marshall stability and flow in asphalt mixtures. Infrastructures 2025, 10, 142. [Google Scholar] [CrossRef]
  16. Vargas, C.; Hanandeh, A.E. Feature importance and their impacts on the properties of asphalt mixture modified with plastic waste: A machine learning modeling approach. Int. J. Pavement Res. Technol. 2023, 16, 1555–1582. [Google Scholar] [CrossRef]
  17. Leukel, J.; Scheurer, L.; Sugumaran, V. Machine learning models for predicting physical properties in asphalt road construction: A systematic review. Constr. Build. Mater. 2024, 440, 137397. [Google Scholar] [CrossRef]
  18. Khiavi, A.J.; Naeim, B.; Soleimanzadeh, M. Development of a novel ensemble learning model for predicting asphalt volumetric properties using experimental data for pavement performance assessment. Sci. Rep. 2025, 15, 40649. [Google Scholar] [CrossRef]
  19. Fan, X.; Lv, S.; Xia, C.; Ge, D.; Liu, C.; Lu, W. Strength prediction of asphalt mixture under interactive conditions based on BPNN and SVM. Case Stud. Constr. Mater. 2024, 21, e03489. [Google Scholar] [CrossRef]
  20. Yousafzai, A.K.; Sutanto, M.H.; Khan, N.; Wahab, M.M.A.; Khan, M.I.; Abubakar, A.S.; Al-Nawasir, R. Performance prediction of waste tire metal fiber-modified asphalt mixes using a decision tree machine learning technique. J. Hunan Univ. Nat. Sci. 2024, 51, 29–43. [Google Scholar] [CrossRef]
  21. Kavussi, A.; Jalili Qazizadeh, M.; Hassani, A. Fatigue behavior analysis of asphalt mixes containing electric arc furnace steel slag. J. Rehabil. Civ. Eng. 2015, 3, 74–86. [Google Scholar]
  22. Martinho, F.M.A. Exploring the High-Temperature Synthesis of Thin-Film Solar Absorbers on Silicon for Monolithic Tandem Solar Energy Conversion Devices. Ph.D. Thesis, Technical University of Denmark, Copenhagen, Denmark, 2020. [Google Scholar]
  23. Huang, B.; Chen, X.; Shu, X. Effects of electrically conductive additives on laboratory-measured properties of asphalt mixtures. J. Mater. Civ. Eng. 2009, 21, 612–617. [Google Scholar] [CrossRef]
  24. Rohsenow, W.M.; James, P.H.; Young, I.C. Handbook of Heat Transfer; Mcgraw-Hill: New York, NY, USA, 1998; Volume 3. [Google Scholar]
  25. Makarova, V.V.; Gorbacheva, S.N.; Antonov, S.V.; Ilyin, S.O. On the possibility of a radical increase in thermal conductivity by dispersed particles. Russ. J. Appl. Chem. 2020, 93, 1796–1814. [Google Scholar] [CrossRef]
  26. Ameri, M.; Hesami, S.; Goli, H. Laboratory evaluation of warm mix asphalt mixtures containing electric arc furnace steel slag. Constr. Build. Mater. 2013, 49, 611–617. [Google Scholar] [CrossRef]
  27. Rahman, S.; Bhasin, A.; Smit, A. Exploring the use of machine learning to predict metrics related to asphalt mixture performance. Constr. Build. Mater. 2021, 295, 123585. [Google Scholar] [CrossRef]
  28. Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 20–25 August 1995; pp. 1137–1145. [Google Scholar]
  29. Rodriguez, J.D.; Perez, A.; Lozano, J.A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 569–575. [Google Scholar] [CrossRef]
  30. Hastie, T.; Tibshirani, R.; Friedman, J. Model assessment and selection. In The Elements of Statistical Learning; Springer: New York, NY, USA, 2008; pp. 219–259. [Google Scholar]
  31. Cawley, G.C.; Talbot, N.L. On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res. 2010, 11, 2079–2107. [Google Scholar]
  32. Shawon, A.R.; Ghosh, R.; Islam, M.A. Analysis of thermal conductivity of aluminum alloys by compositions and tempering process using machine learning. Sci. Rep. 2025, 15, 33352. [Google Scholar] [CrossRef]
  33. Mohtadi, A.; Ghomeishi, M.; Dehghanbanadaki, A. Machine learning-based prediction of thermal conductivity in foamed concrete: Influence of nano-microsilica compounds and air content using XGBoost. Ain Shams Eng. J. 2025, 16, 103833. [Google Scholar] [CrossRef]
  34. Kang, G.O.; Kim, Y.S.; Kang, J.G.; Chang, S. Machine learning-based prediction of the thermal conductivity of filling material incorporating steelmaking slag in a ground heat exchanger system. Sci. Rep. 2025, 15, 12409. [Google Scholar] [CrossRef]
  35. Yun, J.; Park, J.; Choo, H.; Lee, H.M.; Won, J. Data-driven machine learning models for predicting engineering properties in deep-sea sediments. Sci. Rep. 2025, 15, 44987. [Google Scholar] [CrossRef] [PubMed]
  36. Liu, Y.; Zheng, W.; Ai, H.; Zhou, H.; Feng, L.; Cheng, L.; Song, X. Application of machine learning in predicting the thermal conductivity of single-filler polymer composites. Mater. Today Commun. 2024, 39, 109116. [Google Scholar] [CrossRef]
  37. Sah, P.K.; Saurav; Singh, S.K.; Kumar, S.S. Assessment of tree-based machine learning models and empirical equations to predict thermal conductivity of bentonite-based backfill material. Environ. Earth Sci. 2025, 84, 379. [Google Scholar] [CrossRef]
  38. Yuan, C.; Shi, Y.; Ba, Z.; Liang, D.; Wang, J.; Liu, X.; Xu, H. Machine learning models for predicting thermal properties of radiative cooling aerogels. Gels 2025, 11, 70. [Google Scholar] [CrossRef]
  39. Ghasemi, A.; Barisik, M. Machine learning for thermal transport prediction in nanoporous materials: Progress, challenges, and opportunities. Nanomaterials 2025, 15, 1660. [Google Scholar] [CrossRef]
  40. Han, J.; Kim, I.; Cho, N.; Yang, K.S.; Myung, J.S.; Park, J.; Choi, W.J. Toward accurate machine learning-driven prediction of polymeric composites properties based on experimental data. Mater. Genome Eng. Adv. 2025, 3, e70027. [Google Scholar] [CrossRef]
  41. Rahman, A.A.; Wang, B.; Yu, J.; Gao, Y.; He, Y.; Jin, T.; Gan, Z. Machine learning approaches for predicting thermal performance of multilayer insulation materials at low temperatures. Appl. Therm. Eng. 2025, 264, 125527. [Google Scholar] [CrossRef]
  42. Sharma, H.; Arora, G.; Singh, M.K.; Ayyappan, V.; Bhowmik, P.; Rangappa, S.M.; Siengchin, S. Review of machine learning approaches for predicting mechanical behavior of composite materials. Discov. Appl. Sci. 2025, 7, 1238. [Google Scholar] [CrossRef]
  43. Al-mahmodi, A.F.; Munusamy, Y.; Atta, M.R. Predictive modeling of thermal conductivity in PCM composites using artificial intelligence. Sustain. Chem. Clim. Action 2025, 7, 100145. [Google Scholar] [CrossRef]
  44. Yu, Y.; Jin, Z.X.; Li, J.Z.; Jia, L. Low-carbon development path of China’s power industry based on synergistic emission reduction. J. Clean. Prod. 2020, 275, 123097. [Google Scholar] [CrossRef]
  45. Hu, R.; Xie, J.; Wu, S.; Yang, C.; Yang, D. Study of toxicity assessment of heavy metals from steel slag and its asphalt mixture. Materials 2020, 13, 2768. [Google Scholar] [CrossRef]
  46. Gan, Y.; Li, C.; Zou, J.; Wang, W.; Yu, T. Evaluation of the impact factors on the leaching risk of steel slag and its asphalt mixture. Case Stud. Constr. Mater. 2022, 16, e01067. [Google Scholar] [CrossRef]
  47. Salas Echezarreta, I.; Cifrián Bemposta, E.; Lastra González, P.; Castro Fresno, D.; Andrés Payán, A. Assessment of dynamic surface leaching of asphalt mixtures incorporating electric arc furnace steel slag as aggregate for sustainable road construction. Materials 2025, 17, 3737. [Google Scholar]
  48. Milačič, R.; Zuliani, T.; Oblak, T.; Mladenovič, A.; Ančar, J.Š. Environmental impacts of asphalt mixes with electric arc furnace steel slag. J. Environ. Qual. 2011, 40, 1153–1161. [Google Scholar] [CrossRef]
Figure 1. Predicting thermal conductivity of steel slag asphalt mixtures using machine learning models.
Figure 2. Statistical distributions of input features and thermal conductivity for steel slag asphalt mixtures (a–k).
Figure 3. Comparative analysis of parameter correlations in steel slag asphalt mixtures: (a) Pearson; (b) Kendall.
Figure 4. Prediction workflow of the random forest model.
Figure 5. Prediction workflow of the support vector regression model.
Figure 6. Prediction workflow of the gradient boosting model.
Figure 7. Flowchart of model training, cross-validation, and performance evaluation.
Figure 8. Framework of machine learning–based thermal conductivity prediction for steel slag asphalt mixtures.
Figure 9. Comparison of thermal conductivity predictions for steel slag asphalt mixtures: (a) KNN; (b) DT; (c) RF; (d) SVR; (e) GB.
Figure 10. Comparative analysis of thermal conductivity predictions and error distributions across different machine learning models: (a) KNN; (b) DT; (c) RF; (d) SVR; (e) GB.
Figure 11. Performance metrics of thermal conductivity prediction models for steel slag asphalt mixtures: (a) R²; (b) RMSE; (c) MAE; (d) MAPE.
Figure 12. Feature importance for thermal conductivity prediction of steel slag asphalt mixtures: (a) KNN; (b) DT; (c) RF; (d) SVR; (e) GB.
Figure 13. Heatmap of feature importance for thermal conductivity prediction of steel slag asphalt mixtures.
Figure 14. Distribution characteristics of the economic benefit index with respect to the thermal conductivity and asphalt content of steel-slag asphalt mixtures. (a) Economic benefit index–k’mix density landscape; (b) Economic benefit index–CAC density landscape; (c) Trade-off pattern between k’mix and CAC in economic benefit index.
Table 1. Variables for predicting the thermal conductivity of steel slag asphalt mixtures.
| Variables | Abbreviation | Unit | Description |
|---|---|---|---|
| Steel slag content | CSS | % | Mass percentage of steel slag in the aggregate system, reflecting the proportion of high thermal conductivity phases in the mixture. |
| Asphalt content | CAC | % | Mass percentage of asphalt binder in the mixture. |
| Air voids | Vv | % | Volume fraction of gas phase in asphalt mixture, which significantly affects the continuity of thermal conduction pathways. |
| Voids in mineral aggregate | VMA | % | Proportion of effective pore volume between aggregate skeleton particles. |
| Voids filled with asphalt | VFA | % | Volume ratio of voids in mineral aggregate filled with asphalt. |
| Coarse aggregate ratio | Rc | % | Proportion of coarse aggregate in the aggregate system, reflecting the skeleton structure and particle contact network characteristics. |
| Fine aggregate ratio | Rf | % | Proportion of fine aggregate in the aggregate system. |
| Apparent density of steel slag | ρSS | g/cm³ | Apparent density of steel slag aggregate, indicating material compactness and particle contact efficiency. |
| Water absorption of steel slag | WSS | % | Key indicator reflecting the internal pore structure characteristics of steel slag. |
| Thermal conductivity of steel slag | kSS | W/(m·K) | Intrinsic thermal property parameter of steel slag particle material. |
| Thermal conductivity of steel slag asphalt mixture | kmix | W/(m·K) | Equivalent thermal conductivity of steel slag asphalt mixture. |
Table 2. Descriptive statistical results.
| Variable | Mean | Std | Minimum | Q1 (25%) | Median | Q3 (75%) | Maximum | Variance | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|---|---|
| CSS | 23.89555 | 15.02909 | 0.05 | 10.145 | 23.45 | 36.5025 | 54.88 | 225.8736 | 0.17453 | −1.04464 |
| CAC | 5.1892 | 0.239603 | 4.8 | 4.97 | 5.19 | 5.3925 | 5.6 | 0.057409 | 0.065167 | −1.29545 |
| Vv | 4.1724 | 0.541837 | 3.21 | 3.725 | 4.195 | 4.66 | 4.98 | 0.293587 | −0.15562 | −1.25943 |
| VMA | 15.8077 | 0.723189 | 14.51 | 15.215 | 15.785 | 16.41 | 17 | 0.523003 | 0.003058 | −1.19512 |
| VFA | 73.9242 | 2.373424 | 70 | 71.98 | 73.84 | 75.8525 | 77.98 | 5.633141 | 0.067513 | −1.15781 |
| Rc | 55.72005 | 3.587602 | 50.01 | 52.605 | 55.545 | 59.3025 | 61.94 | 12.87089 | 0.058187 | −1.29924 |
| Rf | 44.27995 | 3.587602 | 38.06 | 40.6975 | 44.455 | 47.395 | 49.99 | 12.87089 | −0.05819 | −1.29924 |
| ρSS | 3.44345 | 0.085246 | 3.31 | 3.3675 | 3.45 | 3.52 | 3.6 | 0.007267 | 0.085522 | −1.22843 |
| WSS | 2.50075 | 0.288148 | 2 | 2.28 | 2.47 | 2.7425 | 2.99 | 0.083029 | −0.01747 | −1.13796 |
| kSS | 2.5092 | 0.231789 | 2.1 | 2.31 | 2.515 | 2.7225 | 2.89 | 0.053726 | −0.08084 | −1.29676 |
| kmix | 1.503875 | 0.087963 | 1.313 | 1.44 | 1.501 | 1.572 | 1.719 | 0.007737 | 0.102302 | −0.74653 |
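The internal consistency of Table 2 can be checked with a short sketch: the reported variance should equal the squared standard deviation, and the mean coarse and fine aggregate ratios should sum to 100%. The (mean, std, variance) values are transcribed from Table 2. The helper name `variance_matches_std` and the relative tolerance are assumptions chosen to absorb rounding in the printed figures.

```python
import math

# (mean, std, variance) triples transcribed from Table 2.
table2 = {
    "CSS": (23.89555, 15.02909, 225.8736),
    "CAC": (5.1892, 0.239603, 0.057409),
    "Vv": (4.1724, 0.541837, 0.293587),
    "VMA": (15.8077, 0.723189, 0.523003),
    "VFA": (73.9242, 2.373424, 5.633141),
    "Rc": (55.72005, 3.587602, 12.87089),
    "Rf": (44.27995, 3.587602, 12.87089),
    "rho_SS": (3.44345, 0.085246, 0.007267),
    "WSS": (2.50075, 0.288148, 0.083029),
    "kSS": (2.5092, 0.231789, 0.053726),
    "kmix": (1.503875, 0.087963, 0.007737),
}


def variance_matches_std(std, variance, rel_tol=1e-3):
    """Return True if variance equals std squared within a rounding tolerance.

    rel_tol is an assumed tolerance for the rounded table entries.
    """
    return math.isclose(std ** 2, variance, rel_tol=rel_tol)


checks = {name: variance_matches_std(std, var) for name, (_, std, var) in table2.items()}

# Coarse and fine aggregate ratios are complementary shares of the aggregate system.
ratio_sum = table2["Rc"][0] + table2["Rf"][0]
```

Because Rc and Rf are complementary, they share the same standard deviation and have skewness of opposite sign, so one of them carries no independent information for a model.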
Table 3. Parameters for evaluating model performance.
| Parameters | Equation |
|---|---|
| R² | $R^2 = 1 - \dfrac{\sum_{i=1}^{n} (\hat{a}_i - a_i)^2}{\sum_{i=1}^{n} (a_i - \bar{a})^2}$ (1) |
| RMSE | $RMSE = \sqrt{\dfrac{\sum_{i=1}^{n} (\hat{a}_i - a_i)^2}{n}}$ (2) |
| MAPE | $MAPE = \dfrac{100\%}{n} \sum_{i=1}^{n} \left\lvert \dfrac{\hat{a}_i - a_i}{a_i} \right\rvert$ (3) |
| MAE | $MAE = \dfrac{\sum_{i=1}^{n} \left\lvert \hat{a}_i - a_i \right\rvert}{n}$ (4) |
Where $a_i$ is the i-th original data value, $\hat{a}_i$ is the corresponding predicted value, $\bar{a}$ is the mean of the original values, and $n$ is the total number of data points.
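Equations (1) to (4) can be sketched in a few lines of plain Python. The function name `regression_metrics` and the three sample values are hypothetical and serve only to show how the four metrics are computed from the same residuals.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R2, RMSE, MAPE (in percent), and MAE following Equations (1)-(4)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0 or any(a == 0 for a in actual):
        # R2 needs varying targets; MAPE needs non-zero targets.
        raise ValueError("metrics are undefined for constant or zero-valued targets")
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100 / n * sum(abs((p - a) / a) for a, p in zip(actual, predicted)),
        "MAE": sum(abs(p - a) for a, p in zip(actual, predicted)) / n,
    }


# Hypothetical thermal conductivity values in W/(m·K), not data from the study.
actual = [1.40, 1.50, 1.60]
predicted = [1.42, 1.49, 1.57]
metrics = regression_metrics(actual, predicted)
```

RMSE penalizes large errors more strongly than MAE, while MAPE expresses the error relative to the measured value. Reporting all four gives a more complete picture than R² alone.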
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, J.; Sun, W.; Liu, Z.; Mu, J.; Cui, X.; Liu, X.; Jiang, S.; Chao, Y. Ensemble-Based Material-Specific Prediction of Thermal Conductivity for Steel Slag Asphalt Mixtures. Processes 2026, 14, 689. https://doi.org/10.3390/pr14040689
