1. Introduction
The durability of concrete infrastructure is a global economic and safety issue: degradation mechanisms such as steel corrosion account for billions in annual spending on repairs and replacements, along with the associated environmental impacts. Concrete resistivity has emerged as a pivotal property for assessing long-term performance, as it directly governs the rate of chloride ingress, sulfate attack, and corrosion propagation, all of which can compromise structural integrity and service life. In an era of escalating climate variability and aggressive environmental exposure, optimizing concrete resistivity is no longer a theoretical concern but a practical necessity for extending the lifespan of critical infrastructure, reducing lifecycle costs, and meeting sustainability goals.
The urgency of this issue is underscored by the vulnerability of aging bridges, marine structures, and urban infrastructure to corrosion-induced failures. Resistivity acts as a first line of defense: high-resistivity concrete restricts ionic movement, slows electrochemical reactions at the steel–concrete interface, and mitigates damage from freeze–thaw cycles and chemical ingress. However, achieving optimal resistivity is not straightforward, as it depends on a complex interplay of material composition, environmental conditions, and curing techniques.
Among the most influential factors are temperature fluctuations, the water-to-cement (w/c) ratio, and the saturation ratio (SR), each of which has a pronounced effect on microstructural development and ion transport. A higher w/c ratio typically results in increased porosity and lower resistivity, as the conductivity of the pore solution becomes more significant in the overall resistivity measurement [1,2,3]. This occurs because a higher w/c ratio leads to a more interconnected pore network, allowing ions to move more freely and facilitating electrical conduction. The degree of saturation is equally important: research has shown that resistivity increases as saturation decreases, confirming the strong influence of moisture content on electrical measurements [4,5]. Higher moisture content improves ionic conductivity, whereas lower moisture levels restrict ionic movement and reduce the risk of corrosion. The relationship between moisture content and resistivity is further complicated by environmental factors such as temperature and humidity, which influence the conductivity of the pore solution and the overall resistivity of concrete [6,7]. For example, temperature fluctuations affect ion mobility within the pore solution, leading to variations in resistivity measurements [8,9]. As temperature increases, ion movement accelerates, resulting in lower resistivity.
Understanding these complex interdependencies is essential for the development of reliable models that accurately predict concrete resistivity under varying conditions. However, traditional empirical and analytical models often struggle to capture the nonlinear relationships between resistivity and its influencing factors—such as the water–cement ratio, temperature, and saturation ratio. Consequently, the accurate prediction of concrete properties has become increasingly critical for structural health monitoring and maintenance planning, underscoring the need for more advanced computational approaches.
In response to these challenges, the integration of machine learning (ML) techniques into concrete technology has revolutionized the prediction of concrete properties and structural condition assessment [
10,
11]. Traditional concrete mix design methods rely heavily on empirical formulas and extensive experimental testing, both of which can be time-consuming and costly. In contrast, machine learning offers a data-driven approach that enables more efficient and accurate predictions of concrete performance based on its constituent materials and mix proportions.
Among the various ML algorithms, linear regression is one of the most commonly used techniques for predicting concrete properties due to its simplicity and interpretability. However, this method assumes a linear relationship between input variables and the target property, which may not always be the case. Certain concrete characteristics, such as concrete resistivity, often exhibit nonlinear and complex dependencies, requiring more advanced modeling techniques.
To address this limitation, Support Vector Machines (SVMs) with a cubic kernel offer enhanced predictive capabilities by mapping input features to a higher-dimensional space, effectively capturing nonlinear relationships. Studies have demonstrated that SVM can outperform traditional regression methods by providing more accurate predictions based on input parameters such as the type and proportions of aggregates, the water–cement ratio, and curing conditions [
12,
13]. In addition, Gaussian Process Regression (GPR) has been recognized for its probabilistic nature, allowing for the estimation of uncertainty in predictions, which is crucial in the context of construction materials, where variability can significantly impact performance [
14,
15]. The adaptability of these models to different types of concrete, including self-compacting concrete and high-performance and eco-friendly variants, highlights their versatility and relevance in modern concrete technology [
16,
17,
18].
Moreover, the application of multivariate polynomial regression has shown promise in predicting the compressive strength of eco-friendly concrete mixtures, which often include recycled materials and supplementary cementitious materials [
19]. This approach not only aids in optimizing mix designs but also contributes to sustainability efforts by facilitating the use of alternative materials in concrete production. Comparative studies of these machine learning models indicate that while SVM and GPR are robust in handling complex datasets, polynomial regression provides a more interpretable framework for understanding the relationships between mix components and concrete strength [
13,
20,
21].
Despite extensive research on concrete resistivity, studies that simultaneously consider multiple influencing factors remain limited [
10,
22]. In particular, the combined effects of key variables such as the water-to-cement (w/c) ratio, saturation ratio (SR), and temperature, along with the application of advanced machine learning models, have not been fully explored. Existing approaches often focus on individual parameters, overlooking their interdependent relationships and nonlinear interactions. This gap highlights the need for a more comprehensive, data-driven approach to improve resistivity predictions.
To address this limitation, the primary objective of this study is to systematically compare the predictive performance of three machine learning models, namely linear regression, support vector machines (SVMs), and Gaussian process regression (GPR), for estimating concrete resistivity. This comparison addresses the need to identify the most accurate and reliable method for resistivity prediction in concrete structures. Additionally, this study investigates the combined effects of the w/c ratio, SR, and temperature on concrete resistivity, providing a more complete understanding of the factors that influence concrete performance. To further enhance model interpretability, a permutation importance analysis is conducted to identify the most influential variables, offering valuable insights for corrosion prevention, durability assessment, and performance-based concrete design.
2. Materials and Methods
2.1. Materials and Mix Proportions
Ordinary Portland Cement (OPC), meeting ASTM Type-I standards, was used in this study. Distilled water, prepared in the laboratory, was used for concrete mixing. The aggregates consisted of crushed granite with a nominal maximum size of 40 mm and natural river sand. Three concrete mixtures were prepared, incorporating water-to-cement ratios of 0.40, 0.50, and 0.60. Cylindrical concrete samples (Ø100 mm × 200 mm) were cast following ASTM C192 for experimental testing. The detailed mixture design specifications are presented in
Table 1. After casting, the samples were stored in a climatic chamber for one day before being demolded. They were then placed in a water tank at room temperature and kept there until the testing date.
2.2. Experimental Procedure
The experimental program involved the preparation and testing of concrete specimens to evaluate their compressive strength, porosity, density, saturation ratios, and resistivity under varying temperatures. Cylindrical specimens (100 mm diameter and 200 mm height) and cubes (100 mm × 100 mm × 100 mm) were prepared for the various tests. All specimens were cured under water at a controlled temperature of 20 °C for 28 days before testing.
Compressive strength tests were conducted following ASTM C39 or EN 12390-3 standards. Cylindrical concrete specimens were placed in a compression testing machine and loaded at a rate of 0.5 MPa/s until failure. The maximum load sustained by the specimen was recorded, and the compressive strength was calculated accordingly. For density and porosity measurements, the procedures outlined in ASTM C642 (Standard Test Method for Density, Absorption, and Voids in Hardened Concrete) were followed. These tests provide the information needed to analyze the material’s internal structure, which influences its durability, permeability, and overall performance in structural applications.
To evaluate the effects of different saturation levels (100%, 90%, 80%, 70%, and 60%) on concrete properties, five sets of specimens were prepared. Moisture content was controlled through various curing methods, including full saturation for 100% and partial drying techniques for lower saturation ratios. The mass of the specimens was measured to ensure consistency before testing for resistivity and compressive strength. Once the target mass corresponding to the desired saturation level was reached, the specimens were tightly sealed using multiple layers of adhesive aluminum foil to minimize further moisture loss. This approach was essential in stabilizing the internal moisture content during subsequent testing, ensuring that the saturation levels remained consistent over time. To verify that the target moisture levels were maintained, each specimen was weighed immediately before testing. This final mass verification step was included to prevent any deviation from the intended saturation ratio, which could otherwise compromise the accuracy of the resistivity and compressive strength measurements.
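The mass-based control of saturation described above can be illustrated with a short sketch. It assumes the usual gravimetric definition SR = (m − m_dry)/(m_sat − m_dry); the function name `target_mass` and the specimen masses are hypothetical and are not taken from this study.

```python
def target_mass(m_dry, m_sat, sr_percent):
    """Mass at which a specimen reaches the given saturation ratio.

    Assumes SR = (m - m_dry) / (m_sat - m_dry), i.e. the fraction of the
    water-accessible pore space that is filled with water.
    """
    if not 0.0 <= sr_percent <= 100.0:
        raise ValueError("sr_percent must be between 0 and 100")
    if m_sat <= m_dry:
        raise ValueError("saturated mass must exceed oven-dry mass")
    return m_dry + (sr_percent / 100.0) * (m_sat - m_dry)


# Hypothetical masses in grams for one cylinder (illustrative only).
M_DRY, M_SAT = 3600.0, 3800.0
targets = {sr: target_mass(M_DRY, M_SAT, sr) for sr in (100, 90, 80, 70, 60)}
for sr, mass in targets.items():
    print(f"SR {sr:3d}% -> dry to {mass:.1f} g before sealing")
```

Re-weighing a sealed specimen against its target mass immediately before testing then gives a direct check that the intended saturation ratio still holds.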
The influence of temperature on concrete properties was studied by dividing the specimens into three groups and exposing them to controlled temperature conditions of 20 °C, 40 °C, and 60 °C. To maintain consistent moisture content and prevent any changes in the designated saturation ratio, the following procedure was adopted. After homogenization, the specimens were sealed to prevent moisture loss and ensure a constant degree of saturation throughout the temperature exposure. The sealed specimens were then placed in controlled temperature chambers set to 20 °C, 40 °C, and 60 °C, respectively. The temperature was gradually increased to the target levels to avoid sudden thermal damage. To verify that the moisture content remained unchanged, the specimens were periodically weighed before and after temperature exposure, ensuring no significant mass loss due to evaporation. Once the specimens stabilized at each target temperature, resistivity measurements were conducted. This approach ensured that resistivity values were determined under controlled saturation conditions at different temperatures.
The concrete resistivity test was performed using a Wenner four-point probe (Proceq Resipod, Zurich, Switzerland). Specimens were surface-saturated before testing to ensure reliable measurements, with excess surface water removed using a damp cloth. The probes were placed to ensure firm electrode contact, with a Wenner probe spacing of 50 mm. Measurements were taken at various locations, and average values are reported.
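For readers unfamiliar with the Wenner configuration, the sketch below shows how an apparent resistivity is commonly derived from a four-point reading and then averaged over several locations. It assumes the standard half-space relation ρ = 2πaV/I with no geometry correction for finite specimens; the probe spacing of 0.050 m, the function names, and the voltage and current readings are assumptions made for illustration, not values or procedures confirmed by this study.

```python
import math


def wenner_resistivity(voltage_v, current_a, spacing_m):
    """Apparent resistivity (ohm*m) from a Wenner four-point reading.

    Uses rho = 2 * pi * a * V / I, which holds for a semi-infinite,
    homogeneous medium; no correction for specimen geometry is applied.
    """
    if current_a <= 0 or spacing_m <= 0:
        raise ValueError("current and spacing must be positive")
    return 2.0 * math.pi * spacing_m * voltage_v / current_a


def mean_resistivity(readings, spacing_m):
    """Average the apparent resistivity over several (V, I) readings."""
    values = [wenner_resistivity(v, i, spacing_m) for v, i in readings]
    return sum(values) / len(values)


# Hypothetical readings (volts, amperes) at three locations on one specimen.
readings = [(0.020, 200e-6), (0.021, 200e-6), (0.019, 200e-6)]
rho = mean_resistivity(readings, spacing_m=0.050)  # assumed spacing
print(f"mean apparent resistivity: {rho:.2f} ohm*m")
```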
2.3. Model Selection for Prediction of Concrete Resistivity
2.3.1. Gaussian Process Regression (GPR)
Gaussian process regression (GPR) is a non-parametric, probabilistic model that provides not only predictions but also uncertainty estimates. One of the key strengths of GPR is its ability to capture complex, nonlinear relationships between input variables and outputs while providing probabilistic predictions with confidence intervals. Additionally, it performs well with small datasets due to its Bayesian nature.
In this study, the Matérn 5/2 kernel was used as the covariance function. This kernel offers a balance between smoothness and flexibility, making it well-suited for modeling physical phenomena such as concrete resistivity, which may exhibit moderate smoothness with localized variations due to material heterogeneity. Unlike the Radial Basis Function (RBF) kernel, which assumes infinitely smooth functions, the Matérn 5/2 kernel allows for functions that are twice differentiable. This makes it more robust to localized noise and better able to capture non-ideal behaviors in material datasets, such as those arising from pore structure variability or microcracking. The suitability and properties of Matérn kernels for physical and engineering applications are well-established in the Gaussian process literature [
23]. The mathematical representation of Matérn 5/2 GPR is given by Equation (1):
$$ f(x) \sim \mathcal{GP}\left(m(x),\, k(x, x')\right) \tag{1} $$
where $m(x)$ is the mean function and $k(x, x')$ is the covariance function (kernel) that defines the relationship between input points.
2.3.2. Support Vector Machine (SVM)
Support vector machine (SVM) is a supervised learning model used for both classification and regression tasks. The cubic SVM applies a polynomial kernel of degree three to model complex, nonlinear relationships. One of the main advantages of SVM is its effectiveness in high-dimensional spaces, as well as its ability to model nonlinear relationships using kernel tricks. Additionally, it is robust to overfitting when properly tuned. However, SVM can be computationally intensive, especially for large datasets, and requires careful selection of kernel parameters and regularization to achieve optimal results. The mathematical formulation for an SVM regression model is expressed as follows:
$$ f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x) + b \tag{2} $$
where $K(x_i, x)$ is the kernel function, $\alpha_i$ represents the support vector coefficients, and $b$ is the bias term.
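A minimal sketch of a kernel-based regression prediction of the form f(x) = Σ αᵢ K(xᵢ, x) + b is given below. It assumes the common inhomogeneous cubic kernel (1 + xᵢ·x)³ applied to pre-scaled inputs; the exact kernel scaling used in this study is not reproduced here, and the support vectors, coefficients, and bias are hypothetical placeholders rather than fitted values.

```python
def cubic_kernel(x, z):
    """Inhomogeneous cubic polynomial kernel: (1 + x.z)^3."""
    dot = sum(a * b for a, b in zip(x, z))
    return (1.0 + dot) ** 3


def svr_predict(x, support_vectors, alphas, bias):
    """Evaluate f(x) = sum_i alpha_i * K(x_i, x) + b."""
    weighted = (a * cubic_kernel(sv, x) for sv, a in zip(support_vectors, alphas))
    return sum(weighted) + bias


# Hypothetical, pre-scaled support vectors (w/c, temperature, SR) and coefficients.
support_vectors = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.2), (0.5, 1.0, 1.0)]
alphas = [2.0, -1.0, 0.5]
bias = 10.0

prediction = svr_predict((0.5, 0.5, 0.5), support_vectors, alphas, bias)
print(f"illustrative prediction: {prediction:.4f}")
```

Only the support vectors contribute to the sum, which is why a trained model can be evaluated without the full training set.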
2.3.3. Linear Regression
Linear regression is a fundamental statistical model that assumes a linear relationship between input variables and the target variable. It minimizes the residual sum of squares to find the best-fitting line. Linear regression is widely used due to its simplicity, ease of interpretation, and minimal computational requirements. It performs well for problems with linear relationships but struggles with nonlinearity. Additionally, it is highly sensitive to outliers and assumes that input variables are independent without interactions, which may not always be the case in real-world scenarios. The equation for a multiple linear regression model is expressed as follows:
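$$ y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \epsilon \tag{3} $$
where $y$ is the dependent variable, $X_1$, $X_2$, and $X_3$ are the independent variables (water–cement ratio, temperature, and SR), $\beta_0$ to $\beta_3$ are the coefficients, and $\epsilon$ is the error term.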
2.3.4. Performance Indicators for Model Evaluation
Evaluating the performance of predictive models requires multiple statistical metrics to quantify error, correlation, and accuracy. The commonly used performance indicators include the Root Mean Square Error (RMSE), Coefficient of Determination (R2), Mean Squared Error (MSE), and Mean Absolute Error (MAE). Each metric provides insight into different aspects of model performance.
Root mean square error (RMSE) measures the average magnitude of prediction errors, giving higher weight to large errors. It is calculated as follows:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} $$
where $y_i$ represents the actual values, $\hat{y}_i$ represents the predicted values, and $n$ is the number of observations. Lower RMSE values indicate better model accuracy.
The coefficient of determination ($R^2$) represents the proportion of variance in the dependent variable explained by the independent variables. It is expressed as follows:
$$ R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} $$
where $\bar{y}$ is the mean of the actual values. A value close to 1 indicates a strong predictive relationship.
Mean squared error (MSE) quantifies the average squared difference between predicted and actual values, penalizing large errors more heavily. It is defined as follows:
$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 $$
Lower MSE values indicate higher predictive accuracy.
Mean absolute error (MAE) provides an average of absolute differences between predicted and actual values, making it less sensitive to outliers than RMSE. It is computed as follows:
$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| $$
MAE provides an easily interpretable measure of prediction error, with lower values signifying better performance. These performance indicators collectively help assess the reliability and efficiency of different predictive models in capturing complex relationships within the dataset.
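The four indicators can be computed together from paired actual and predicted values, as in the sketch below. It uses the standard definitions of MSE, RMSE, MAE, and R²; the function name `regression_metrics` and the sample values are hypothetical and do not come from the study's dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    y_mean = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined if all y_true are identical
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical resistivity values in ohm*m (not measured data).
y_true = [20.0, 50.0, 100.0, 200.0]
y_pred = [25.0, 45.0, 110.0, 190.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE can rank them differently when a few large errors dominate.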
3. Results
3.1. Effect of Water-to-Cement Ratio on Concrete Porosity and Density
The water-to-cement (w/c) ratio significantly influences both the porosity and density of concrete, as shown in Figure 1. As the w/c ratio increases, porosity increases while density decreases, indicating a direct link between water content and void formation in the concrete matrix. This result is consistent with previously published findings [24,25]. At w/c = 0.4, the porosity is 14.83% and the density is 2372.00 kg/m³. When the w/c ratio increases to 0.5, porosity rises to 23.33% and density drops to 2316.00 kg/m³. At w/c = 0.6, the highest porosity of 27.03% is observed, while the density further decreases to 2261.33 kg/m³. These trends show the influence of the w/c ratio on the formation of pore networks and the overall compactness of the hardened concrete matrix. The inverse correlation between the w/c ratio and density can be attributed to the excess water introduced into the mixture at higher ratios. Water that is not consumed by hydration eventually evaporates, leaving behind capillary pores and air voids that reduce the solid volume fraction of the concrete. For instance, a 25% increase in the w/c ratio (0.4 to 0.5) resulted in a 57% relative rise in porosity (14.83% to 23.33%) and a 2.4% reduction in density. A further increase to w/c = 0.6 raised porosity by an additional 16% (23.33% to 27.03%) and decreased density by another 2.4%, indicating that porosity continues to grow at higher w/c ratios, although at a slower relative rate, while density falls by a similar proportion at each step.
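The relative changes quoted in this paragraph can be reproduced directly from the reported porosity and density values. The short sketch below does this arithmetic; the helper name `relative_change` is introduced here for illustration only.

```python
# Porosity (%) and density (kg/m^3) reported in the text for each w/c ratio.
data = {
    0.4: {"porosity": 14.83, "density": 2372.00},
    0.5: {"porosity": 23.33, "density": 2316.00},
    0.6: {"porosity": 27.03, "density": 2261.33},
}


def relative_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


changes = {}
for lo, hi in [(0.4, 0.5), (0.5, 0.6)]:
    changes[(lo, hi)] = {
        key: relative_change(data[lo][key], data[hi][key])
        for key in ("porosity", "density")
    }
    rounded = {k: round(v, 1) for k, v in changes[(lo, hi)].items()}
    print(f"w/c {lo} -> {hi}: {rounded}")
```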
This trend is explained by the role of water in concrete hydration and pore structure development. At low w/c ratios, there is just enough water for hydration, leading to a compact microstructure with minimal voids, resulting in lower porosity and higher density. However, as the w/c ratio increases, excess water remains unbound after hydration, creating more capillary pores. These additional voids increase porosity, reducing the compactness of the concrete and leading to a lower density. The increase in porosity negatively impacts the mechanical properties and durability of the concrete, as a more porous structure allows for greater penetration of aggressive agents such as water and chemicals, potentially accelerating degradation. The reduction in density with higher w/c ratios is directly linked to increased porosity. A denser concrete matrix ensures better mechanical performance and durability, while highly porous concrete is more prone to cracking, permeability issues, and lower overall strength. Therefore, optimizing the w/c ratio is crucial in mix design to achieve the desired balance between workability, durability, and mechanical properties.
3.2. Effect of Water-to-Cement Ratio on Concrete Strength
The water-to-cement (w/c) ratio has a significant influence on the compressive strength of concrete, as demonstrated in
Figure 2. At w/c = 0.4, the compressive strength (fc) at 28 days is 43.10 MPa, while at w/c = 0.5, the strength decreases to 32.77 MPa and at w/c = 0.6, it further reduces to 21.30 MPa, highlighting the critical role of the w/c ratio in the mechanical performance of cementitious materials. This trend is expected and well-documented, as a lower w/c ratio results in a denser and stronger concrete matrix, whereas a higher w/c ratio increases porosity, reducing strength. The reduction in strength with higher w/c ratios can be attributed to the microstructural evolution of the cement paste [
26,
27]. At lower w/c ratios (e.g., 0.4), the limited water content results in a denser matrix with reduced porosity, as the cement particles are closely packed and hydration products fill the interstitial spaces effectively. Conversely, higher w/c ratios introduce excess water that evaporates during curing, leaving behind capillary pores and voids.
3.3. Effect of Saturation and Temperature on Concrete Resistivity
For a water–cement (w/c) ratio of 0.4, the resistivity behavior is influenced by the dense microstructure of the concrete, which limits pore connectivity and moisture retention, as shown in Figure 3. At low temperatures, particularly 20 °C, high saturation ratios (SRs) around 89.7% correspond to minimal resistivity values (0–50 Ω·m), as the tightly packed matrix retains sufficient moisture to facilitate ionic conduction. However, as the temperature increases to 60 °C, the resistivity sharply rises to 300–350 Ω·m due to moisture evaporation and microcracking from thermal expansion. Despite its low permeability, this mix exhibits significant resistivity spikes when subjected to thermal stress, necessitating proper thermal management in high-temperature applications such as industrial or solar concrete structures.
Additionally, at low saturation ratio (SR) values, the resistivity of concrete remains high due to reduced ionic conductivity within the pore network. This occurs because the limited availability of free water in the partially saturated pores significantly restricts the formation of continuous conductive pathways, thereby limiting overall ionic transport. In this state, both the lower ion concentration and the increased tortuosity of the pore structure contribute to higher resistivity. However, an interesting trend emerges as the temperature increases. For example, at an SR of 61.1%, the resistivity decreases from 308.97 Ω·m at 20 °C to 135.50 Ω·m at 60 °C. This significant reduction suggests that higher temperatures enhance ion mobility in the pore solution, partially offsetting the resistivity increase typically associated with lower saturation. This effect is primarily due to the thermal activation of ions, which increases their kinetic energy, reduces the viscosity of the pore fluid, and facilitates faster ionic transport. These combined effects allow for greater ionic conductivity, even at moderate saturation levels, highlighting the critical role of temperature in modulating the electrical properties of concrete.
At a w/c ratio of 0.5, the resistivity values are generally lower than those at 0.4, indicating a more porous microstructure that facilitates higher moisture retention and increased ionic conductivity, as shown in Figure 4. At 20 °C, when the SR is high (90.2%), resistivity remains low (0–20 Ω·m). However, as the temperature increases to 60 °C and the SR drops to 50.1%, resistivity rises significantly, reaching 120–160 Ω·m. This mix presents a balance between porosity and resistivity sensitivity, where interconnected capillary pores accelerate moisture loss, but retained water in smaller pores helps sustain conductivity. While concrete with a w/c ratio of 0.5 is suitable for environments with moderate temperature fluctuations, its sensitivity to SR reductions could pose durability risks under uncontrolled conditions. Notably, resistivity at 20 °C and 99.6% SR is 18.33 Ω·m, which is significantly lower than the 32.67 Ω·m observed in concrete with a w/c ratio of 0.4, confirming that increased w/c ratios lead to lower resistivity due to enhanced moisture retention. The overall resistivity trends indicate that moisture content and temperature are dominant factors in determining concrete’s electrical properties.
At a w/c ratio of 0.6, the concrete shows the most significant resistivity variations, attributed to its highly porous structure, as shown in Figure 5. At 20 °C, when the SR reaches 89.2%, resistivity remains low (0–20 Ω·m) because of extensive pore saturation, which facilitates ionic mobility. However, at 60 °C, the SR significantly declines to 61.4%, leading to a surge in resistivity to 80–100 Ω·m. The open capillary network in concrete with a high w/c ratio promotes rapid moisture evaporation, drastically reducing conductivity and magnifying the correlation between the SR and resistivity. This characteristic makes electrical resistivity measurements a reliable indicator of internal moisture levels in such mixes. However, the high porosity also raises durability concerns, particularly in humid or chloride-rich environments where low resistivity at a high SR (e.g., 20 °C) signals a greater risk of corrosion.
Overall, the data highlights the inverse relationship between the SR and resistivity, with resistivity increasing as the SR decreases due to reduced pore water content and limited ionic mobility [
1,
28,
29,
30]. This trend underscores the crucial role of moisture availability in governing concrete resistivity, as the presence of water facilitates ion transport, directly impacting conductivity. When the SR is low, the interconnected pore network within the concrete contains less water, reducing the number of charge carriers and thereby increasing resistivity.
The influence of temperature is also evident, as higher temperatures lower resistivity by enhancing ion movement [
31,
32]. At elevated temperatures, the kinetic energy of ions increases, reducing the viscosity of the pore solution and improving ionic conductivity. This effect is particularly important in field applications, where seasonal variations and environmental conditions can significantly alter resistivity measurements, potentially affecting the reliability of NDT assessments for concrete structures. According to the Arrhenius law, ionic conductivity increases exponentially with temperature, assuming sufficient pore water remains to enable transport. Therefore, at temperatures above 60 °C, further enhancement in ionic mobility is theoretically expected. However, this may be counteracted by severe moisture loss and thermal damage, which can lead to increased resistivity. Conversely, at temperatures below 20 °C, reduced thermal energy and higher pore fluid viscosity may suppress ionic movement, thereby increasing resistivity. These trends extend the applicability of our findings beyond the tested laboratory range and provide useful insights for evaluating concrete durability in diverse climate conditions.
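The Arrhenius-type temperature dependence can be made concrete with a small calculation. The sketch below assumes the commonly used form ρ(T) = ρ_ref · exp[(Eₐ/R)(1/T − 1/T_ref)] with T in kelvin and a constant saturation ratio, and back-calculates an apparent activation energy from the two resistivity values quoted earlier for w/c = 0.4 at SR = 61.1%. The function names are hypothetical, and the value at 40 °C is an interpolation under this assumption, not a measurement.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol*K)


def apparent_activation_energy(rho_1, t1_c, rho_2, t2_c):
    """Apparent activation energy (J/mol) from two resistivity readings.

    Assumes rho(T) = rho_ref * exp[(Ea / R) * (1/T - 1/T_ref)], T in kelvin.
    """
    t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
    return R_GAS * math.log(rho_1 / rho_2) / (1.0 / t1_k - 1.0 / t2_k)


def resistivity_at(rho_ref, t_ref_c, t_c, ea):
    """Resistivity at t_c predicted from a reference reading at t_ref_c."""
    t_ref_k, t_k = t_ref_c + 273.15, t_c + 273.15
    return rho_ref * math.exp(ea / R_GAS * (1.0 / t_k - 1.0 / t_ref_k))


# Values quoted in the text for w/c = 0.4 at SR = 61.1%.
ea = apparent_activation_energy(308.97, 20.0, 135.50, 60.0)
rho_40 = resistivity_at(308.97, 20.0, 40.0, ea)  # interpolated, not measured
print(f"Ea ~ {ea / 1000:.1f} kJ/mol, rho(40 C) ~ {rho_40:.1f} ohm*m")
```

A fitted activation energy of this kind is only meaningful while the saturation ratio stays fixed; once drying or thermal damage sets in, the single-exponential form no longer applies.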
Additionally, the w/c ratio plays a key role in determining concrete resistivity by affecting the microstructural density. A lower w/c ratio results in a denser concrete matrix with reduced porosity, which restricts the movement of ions and leads to higher resistivity [
33,
34]. This is because a tightly packed microstructure reduces the connectivity of the pore network, making it more difficult for electrical charges to travel through the material. Conversely, a higher w/c ratio increases porosity, allowing for greater ion mobility and thereby lowering resistivity. In particular, higher w/c ratios lead to a broader and coarser pore size distribution, increasing the proportion of large, interconnected capillary pores that facilitate ionic transport. However, the effect of a high w/c ratio is not solely limited to porosity; it also influences the composition of the pore solution. A more porous matrix retains a higher volume of pore water, which can dissolve various ions, further enhancing electrical conductivity.
Therefore, the relationship between the w/c ratio and resistivity is governed by both the physical characteristics of the concrete matrix and the properties of the pore solution, illustrating the complex interplay between microstructure, moisture content, and electrical behavior. This understanding is essential for assessing the durability of concrete structures under different environmental conditions and is particularly valuable for NDT applications, where resistivity measurements are often used to evaluate material properties, predict long-term performance, and detect potential deterioration.
3.4. Performance Evaluation of Machine Learning Models for Concrete Resistivity Prediction
3.4.1. Mathematical Formulation of Prediction Models
The Gaussian process regression (GPR) model using the Matérn 5/2 kernel was developed in MATLAB (version R2021a) to predict concrete resistivity based on three independent variables: the water–cement ratio, temperature, and SR. This model consists of a mean function, a kernel (covariance) function, and a noise-handling component, which together define the relationship between inputs and the predicted output. Given the trained GPR model, the predicted mean $\mu(x_*)$ and variance $\sigma^2(x_*)$ for a new test point $x_*$ are computed as follows:
$$ \mu(x_*) = m(x_*) + K_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\left(y - m(X)\right) $$
$$ \sigma^2(x_*) = k(x_*, x_*) - K_*^{\top}\left(K + \sigma_n^2 I\right)^{-1} K_* $$
where $K$ is the covariance matrix of the training points, $K_*$ represents the covariance between the training and test points, $y$ and $m(X)$ are the training targets and the mean function evaluated at the training inputs, and $\sigma_n^2 I$ accounts for observation noise.
A constant mean function for this model is assumed:
The Matérn 5/2 kernel function is used, which is defined as follows:
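$$ k(d) = \sigma_f^2\left(1 + \frac{\sqrt{5}\,d}{\ell} + \frac{5d^2}{3\ell^2}\right)\exp\left(-\frac{\sqrt{5}\,d}{\ell}\right) $$
where $d$ is the Euclidean distance between input points, $\sigma_f$ is the signal standard deviation, and $\ell$ is the characteristic length scale. This function determines the smoothness of the predictions and the relationship between data points.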
To account for noise in the data, the covariance function incorporates a noise term, resulting in the following modified covariance matrix:
$$ K_y = K + \sigma_n^2 I $$
where $\sigma_n^2$ = 10.7792 (noise variance) and $I$ is the identity matrix. This noise term helps improve the model’s robustness by preventing overfitting and handling uncertainties in measurements.
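To show how a constant mean, a Matérn 5/2 kernel, and a noise term combine into a prediction, a minimal pure-Python sketch is given below. It is not the MATLAB model used in this study: the training points, the constant mean, the signal standard deviation `sigma_f`, and the length scale `length` are hypothetical, inputs are assumed to be pre-scaled to comparable ranges, and only the noise variance of 10.7792 is taken from the text.

```python
import math


def matern52(x, z, sigma_f, length):
    """Matern 5/2 covariance between two input vectors."""
    s = math.sqrt(5.0) * math.dist(x, z) / length
    return sigma_f ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gpr_predict(x_train, y_train, x_star, mean, sigma_f, length, noise_var):
    """Predictive mean and variance of the latent function at x_star."""
    n = len(x_train)
    # K_y = K + noise_var * I
    k_y = [[matern52(x_train[i], x_train[j], sigma_f, length)
            + (noise_var if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    k_star = [matern52(xi, x_star, sigma_f, length) for xi in x_train]
    alpha = solve(k_y, [y - mean for y in y_train])
    v = solve(k_y, k_star)
    mu = mean + sum(ks * a for ks, a in zip(k_star, alpha))
    prior_var = matern52(x_star, x_star, sigma_f, length)
    var = prior_var - sum(ks * vi for ks, vi in zip(k_star, v))
    return mu, var


# Hypothetical scaled inputs (w/c, temperature, SR) and resistivities (ohm*m).
x_train = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
y_train = [30.0, 135.0, 18.0, 90.0]
mu, var = gpr_predict(x_train, y_train, (0.5, 0.5, 0.5),
                      mean=70.0, sigma_f=80.0, length=1.0, noise_var=10.7792)
print(f"predicted mean: {mu:.2f}, predictive variance: {var:.2f}")
```

With a nonzero noise variance the predicted mean does not pass exactly through the training targets, which is the smoothing behaviour the noise term is meant to provide.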
3.4.2. Support Vector Machine (SVM) Prediction Formula
For this model, a cubic polynomial kernel of degree 3 is used. The kernel function is expressed as follows:
where:
The kernel scale value is substituted as follows:
3.4.3. SVM Regression Function
The mathematical prediction model utilizing a support vector machine (SVM) for prediction of concrete resistivity based on the water–cement ratio, temperature, and SR as independent variables is expressed as follows:
This equation demonstrates the application of the cubic kernel function in SVM regression to predict the target variable using the learned support vectors and coefficients. In this model, a cubic polynomial kernel of degree 3 is utilized. The kernel function is defined as follows:
3.4.4. Linear Regression Prediction Formula
In this model, a multiple linear regression approach is used to predict the target variable y based on three independent variables: the water–cement ratio (X1), temperature (X2), and saturation ratio (X3). The general form of the linear regression equation is shown in Equation (3). The following coefficients are determined for this model:
β0= 479.92;
β1= −295.08;
β2= −0.88;
β3= −3.03.
Thus, the final regression equation is
$$ y = 479.92 - 295.08\,X_1 - 0.88\,X_2 - 3.03\,X_3 $$
This equation provides a simple yet effective method to estimate y based on the given input values.
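The fitted linear model can be evaluated with a few lines of code. The sketch below uses the coefficients reported above and assumes that temperature is entered in °C and SR in percent, since the units of the regressors are not restated here; the function name `predict_resistivity` and the example inputs are illustrative.

```python
# Coefficients reported in the text: intercept, w/c, temperature, SR.
BETA = (479.92, -295.08, -0.88, -3.03)


def predict_resistivity(wc, temperature_c, sr_percent):
    """Evaluate the fitted multiple linear regression.

    Assumes temperature in degrees Celsius and SR in percent.
    """
    b0, b1, b2, b3 = BETA
    return b0 + b1 * wc + b2 * temperature_c + b3 * sr_percent


example = predict_resistivity(0.4, 20.0, 90.0)
extreme = predict_resistivity(0.6, 60.0, 100.0)
print(f"w/c 0.4, 20 C, SR 90%:  {example:.2f}")
print(f"w/c 0.6, 60 C, SR 100%: {extreme:.2f}")
```

Under the same unit assumptions, the second input combination yields a negative value, which illustrates why a purely additive linear form can break down near the edges of the tested range.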
3.5. Performance Comparison of Prediction Models
Three models were evaluated: Matern 5/2 Gaussian process regression (GPR), linear regression, and a cubic support vector machine (SVM). Their performance was assessed using the root mean square error (RMSE), R-squared (R2), mean squared error (MSE), mean absolute error (MAE), maximum error, median absolute error (MedAE), and interquartile range (IQR), as presented in Table 2. These models were trained to predict concrete resistivity based on three critical variables: the saturation ratio (SR), temperature, and water–cement ratio. The validation results for each model are discussed below.
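For reference, the listed indicators can be computed from paired true and predicted values as in the sketch below. It uses the standard definitions; taking the IQR over the absolute errors is an assumption, since the quantity on which the IQR is computed is not stated here, and the sample values are hypothetical.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return the error indicators used to compare the models."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    abs_err = [abs(r) for r in residuals]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / len(residuals)
    # Quartiles of the absolute errors (assumed basis for the IQR)
    q1, _, q3 = statistics.quantiles(abs_err, n=4)
    return {
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "MAE": sum(abs_err) / len(abs_err),
        "MaxError": max(abs_err),
        "MedAE": statistics.median(abs_err),
        "IQR": q3 - q1,
    }


# Hypothetical values, not data from the study
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

Because the MSE is the square of the RMSE, the two indicators always rank models identically; the MAE, MedAE, and IQR add information about how the errors are distributed.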
3.5.1. Matern 5/2 Gaussian Process Regression (GPR)
The Matern 5/2 GPR model demonstrated the best overall performance, achieving an RMSE of 5.21, an R2 value of 0.99, an MSE of 27.19, an MAE of 3.40, a maximum error of 23.16, a MedAE of 20.94, and an IQR of 10.31. These results confirm that the model achieves an exceptionally high level of predictive accuracy, effectively capturing the nonlinear relationships between input variables. The narrow IQR and relatively low MedAE indicate that the model maintains a compact error distribution, reflecting robust performance, even in the presence of extreme values. The high R2 value further suggests that almost all variations in concrete resistivity are well-explained by the model, making it the most reliable among the three.
3.5.2. Results for Linear Regression
The linear regression model exhibited the weakest performance among the three, with an RMSE of 36.26, an R2 value of 0.66, an MSE of 1,315.00, an MAE of 6.65, a maximum error of 139.01, a MedAE of 125.67, and an IQR of 59.67. These metrics indicate that the linear model struggles to capture the complex, nonlinear interactions between the input variables, resulting in a broader error distribution and higher overall error. The high maximum error and wide IQR highlight the model’s sensitivity to extreme outliers, reinforcing that a simple linear approach is insufficient for accurate prediction of concrete resistivity.
3.5.3. Results for Support Vector Machine (SVM)
The SVM model performed moderately well, achieving an RMSE of 15.92, an R2 value of 0.93, an MSE of 253.51, an MAE of 7.51, a maximum error of 74.13, a MedAE of 67.02, and an IQR of 34.49. While this model outperforms linear regression in capturing nonlinear relationships, it still falls short of the GPR model’s accuracy. The relatively high R2 value of 0.93 suggests that the SVM model effectively captures most of the variability in concrete resistivity. However, the higher MedAE and broader IQR indicate that this model introduces a more scattered error distribution, reflecting occasional large deviations from true values. These deviations likely reflect the model’s limited capacity to capture certain intricate patterns in the data rather than the presence of outliers in the input.
The results clearly indicate that the Matern 5/2 GPR model is the most effective in predicting concrete resistivity, achieving the lowest error rates and the highest R2 value. The narrow IQR and relatively low MedAE further confirm its robustness against extreme errors, making it the most reliable choice for this application. In contrast, the linear regression model performed poorly due to its inability to model complex interactions, while the SVM model provided a reasonable alternative but could not match the overall accuracy and stability of the GPR model.
3.6. Gaussian Process Regression (GPR) as the Best Model
Figure 6 presents a comparison between the actual measured values of concrete resistivity (blue points) and the predicted values obtained using the Matern 5/2 Gaussian process regression model (yellow points). The x-axis represents the record number, which corresponds to individual data points, while the y-axis represents the resistivity values.
The primary observation from this figure is the close alignment between the true and predicted values, indicating that the model is capable of capturing the underlying relationship between the input variables—saturation ratio, temperature, and water–cement ratio—and the output, i.e., resistivity. The consistency of the two curves suggests that the model effectively generalizes across different data points. However, in certain regions, particularly where the resistivity values are higher, small discrepancies between the predicted and true values are observed. This could be due to the model slightly underestimating or overestimating certain extreme values.
The high coefficient of determination (R2 = 0.99) confirms that the model explains almost all the variability in the resistivity values. The minimal deviations observed in specific regions do not significantly impact the overall model performance. These minor errors could stem from inherent variability in the data, potential measurement noise, or the limitations of the GPR model in capturing highly nonlinear behaviors beyond a certain range.
Overall, this figure strongly supports the claim that the GPR model is highly effective in predicting concrete resistivity, with only slight deviations in the case of extreme values.
Figure 7 provides a scatter plot comparing the predicted resistivity values (y-axis) against the actual measured resistivity values (x-axis). The solid diagonal line represents the ideal case where predicted values match the true values exactly, meaning every data point should ideally lie along this line if the model were perfect.
The strong clustering of points around this ideal 1:1 line indicates that the model achieves a high degree of accuracy in its predictions. A small number of data points deviate slightly, particularly in the upper range of resistivity values, suggesting that the model has minor difficulty predicting extreme cases. However, the overall distribution confirms that the GPR model does not exhibit major systematic bias, as both lower and higher values are well-represented, without a noticeable trend of underestimation or overestimation.
In terms of error metrics, the root mean square error (RMSE), which weights large errors more heavily than small ones, is 5.21, and the mean absolute error (MAE) is 3.40, indicating that the typical absolute error is relatively low. These values demonstrate that while the model is not perfect, the errors are small enough that they do not significantly impact its reliability. The validation scatter plot, combined with the R2 value, confirms that the model generalizes well to unseen data and maintains high predictive performance.
This figure serves as further validation that the Matern 5/2 Gaussian process regression model is well-suited for this problem, with only minor limitations in handling extreme values.
Figure 8 presents a residual plot, which is a critical diagnostic tool for evaluating the performance of predictive models. The residuals, calculated as the difference between true and predicted resistivity values, are plotted against the true resistivity values. The ideal residual plot should exhibit a random scatter of points around zero, indicating that the model does not systematically overpredict or underpredict values in any specific range.
In this case, the majority of the residuals are distributed around zero, suggesting that the model does not have a significant bias. However, there are a few points where the residuals deviate more noticeably, particularly at higher resistivity values. This pattern suggests that while the model is highly accurate overall, it struggles slightly with extreme values, leading to small systematic errors in these cases.
The mean squared error (MSE) of 27.19 further quantifies the error magnitude, reinforcing that while small deviations exist, they do not substantially impact overall model accuracy. The residual analysis also confirms that the error distribution does not follow a clear trend, meaning there is no severe underfitting or overfitting occurring within a specific range of resistivity values.
This figure provides an additional perspective on the model’s performance, demonstrating that, while its predictive ability is strong, small residual deviations at high resistivity values should be acknowledged when interpreting the results.
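One simple way to check whether residuals drift away from zero at high resistivity is to compare the mean residual below and above a chosen threshold. The sketch below does this with residuals defined as true minus predicted values, as in the residual plot; the threshold and sample values are hypothetical and serve only to show the calculation.

```python
def mean_residual_by_range(y_true, y_pred, threshold):
    """Mean residual (true - predicted) below and at/above a threshold."""
    low = [t - p for t, p in zip(y_true, y_pred) if t < threshold]
    high = [t - p for t, p in zip(y_true, y_pred) if t >= threshold]

    def mean(values):
        # An empty range has no defined mean
        return sum(values) / len(values) if values else float("nan")

    return mean(low), mean(high)


# Hypothetical values, not data from the study
low_mean, high_mean = mean_residual_by_range(
    [10.0, 20.0, 200.0, 300.0],
    [11.0, 19.0, 190.0, 290.0],
    threshold=100.0,
)
```

A mean residual near zero in both ranges is consistent with random scatter, whereas a clearly positive mean in the upper range would indicate underprediction of high resistivity values.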
3.7. Permutation Importance Analysis of the GPR Matérn 5/2 Model for Concrete Resistivity Prediction
A permutation importance analysis of the GPR Matérn 5/2 model was conducted to evaluate the contribution of different variable combinations—water–cement ratio (X1), temperature (X2), and SR (X3)—in predicting concrete resistivity (y). The results indicate that the model performs best when all three variables are included, achieving the lowest RMSE (5.2142), lowest MSE (27.188), and highest R2 (0.99), signifying near-perfect predictive capability. This suggests that the combined influence of X1, X2, and X3 is crucial for accurate resistivity predictions.
Among the tested combinations, the X1X2 (water–cement ratio + temperature) model exhibited the poorest performance, with the highest RMSE (55.455), highest MSE (3075.3), and lowest R2 (0.19). This implies that the water–cement ratio and temperature together, without the SR, are insufficient for accurate resistivity estimation. On the other hand, the X1X3 (water–cement ratio + SR) model demonstrated significantly better accuracy (R2 = 0.81 and RMSE = 26.709), indicating that the SR has a stronger influence on resistivity when combined with the water–cement ratio. The X2X3 (temperature + SR) model performed moderately (R2 = 0.58 and RMSE = 39.908), suggesting that the SR carries much of the predictive information but that replacing the water–cement ratio with temperature reduces accuracy.
Based on these findings, the SR (X3) appears to be the most critical variable, followed by the water–cement ratio (X1), while temperature (X2) has the least impact. The presence of the SR significantly enhances model performance, as demonstrated by the poor results when it is excluded (X1X2 model). Consequently, future predictive models for concrete resistivity should prioritize the SR and the water–cement ratio, while the influence of temperature remains secondary. The GPR Matérn 5/2 model has been proven to be a highly effective approach when incorporating all three factors, emphasizing the need for a holistic approach in resistivity prediction.
Table 3 provides the performance indicator comparison for each combination of X variables.
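For comparison with the subset-based results above, permutation importance in its usual model-agnostic form keeps a trained model fixed, shuffles one input column at a time, and records the resulting increase in RMSE. The sketch below illustrates the procedure under loud assumptions: the linear equation from Section 3.4.4 stands in for the trained model, and the input grid is hypothetical, so the numbers it produces are not results of this study.

```python
import math
import random


def linear_model(row):
    """Stand-in predictor: the linear regression equation of Section 3.4.4."""
    wc, temp, sr = row
    return 479.92 - 295.08 * wc - 0.88 * temp - 3.03 * sr


def rmse(y_true, y_pred):
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean RMSE increase when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced
            permuted = [
                r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)
            ]
            score = rmse(targets, [predict(r) for r in permuted])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical input grid: (w/c ratio, temperature, saturation ratio)
rows = [
    (wc, t, sr)
    for wc in (0.4, 0.5, 0.6)
    for t in (5.0, 20.0, 35.0)
    for sr in (40.0, 70.0, 100.0)
]
targets = [linear_model(r) for r in rows]
imp_wc, imp_temp, imp_sr = permutation_importance(linear_model, rows, targets)
```

A larger RMSE increase indicates a more influential input. The ranking obtained on this grid depends on the chosen value ranges and on the stand-in model, so it should be read only as a demonstration of the procedure.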
4. Conclusions
In conclusion, the water-to-cement (w/c) ratio plays a crucial role in determining the porosity and density of concrete, directly influencing its mechanical performance and durability. As the w/c ratio increases, porosity rises while density declines, primarily due to the formation of additional capillary pores from unbound water. This leads to a more porous, less durable matrix with lower compressive strength and increased susceptibility to chemical attack. In contrast, a lower w/c ratio results in a denser, more durable concrete matrix, which exhibits higher resistivity due to reduced porosity and improved microstructure.
This study also highlights the critical role of machine learning in accurately predicting concrete resistivity, addressing the nonlinear and multivariate nature of the relationships among key factors like the w/c ratio, temperature, and saturation ratio (SR). The following conclusions are drawn about the three examined predictive models:
The Gaussian process regression (GPR) model demonstrated the highest predictive accuracy, achieving the lowest RMSE, MSE, and MAE and the highest R2 value. This model effectively captures the complex, nonlinear relationships inherent in the data, making it particularly suitable for detailed resistivity predictions in high-performance concrete applications. Its ability to model uncertainty and nonlinear trends makes it a valuable tool for critical structural assessments and quality control.
The support vector machine (SVM) model with a cubic kernel provided a significant improvement over linear regression by mapping inputs to a higher-dimensional feature space. While it showed better generalization than linear models, it still fell short of the predictive power offered by GPR, particularly in capturing subtle interactions between the input variables. However, SVM remains a viable option for applications requiring faster computation and simpler parameter tuning, making it suitable for real-time assessments or large-scale infrastructure projects.
Linear regression, while computationally efficient and straightforward, struggled to account for the complex, nonlinear dependencies in the data, resulting in lower overall predictive accuracy. However, it may still be useful for preliminary assessments or cases where simplicity and interpretability are prioritized.
In addition to model evaluation, this study included a permutation importance analysis to identify the most influential factors affecting concrete resistivity. The results indicate that the saturation ratio (SR) is the most critical variable, followed by the water–cement ratio, while temperature has the least impact. This finding reinforces the dominant role of SR in determining resistivity, as it significantly affects the moisture content and pore structure of concrete. Understanding this hierarchy of influence is crucial for optimizing concrete design and ensuring long-term durability in challenging environmental conditions.
From an engineering perspective, these insights provide valuable guidance for selecting appropriate predictive models for concrete resistivity. The superior performance of the GPR model, combined with the critical importance of SR, suggests that this approach can be a powerful tool for quality control, performance assessment, and durability prediction in concrete structures. However, for applications requiring faster, less computationally intensive models, SVM remains a viable alternative, especially when combined with optimized hyperparameters.
While this study focused on linear regression, SVM, and GPR, chosen for their balance between interpretability, predictive power, and prior use in the literature, other models, such as decision trees or artificial neural networks, were not explored due to scope and computational constraints. Future research could expand the model comparison to include these techniques, which may better capture more complex nonlinear patterns and potentially improve prediction accuracy.