Process-Integrated Optimization and Symbolic Regression for Direct Prediction of CFRP Area in Masonry Wall Strengthening

Bekdaş, Gebrail; Khalbous, Ammar; Nigdeli, Sinan Melih; Işıkdağ, Ümit

doi:10.3390/pr14071163



¹ Department of Civil Engineering, Istanbul University-Cerrahpasa, Istanbul 34320, Türkiye

² Department of Architecture, Mimar Sinan Fine Arts University, Istanbul 34427, Türkiye

* Author to whom correspondence should be addressed.

Processes 2026, 14(7), 1163; https://doi.org/10.3390/pr14071163

Submission received: 11 March 2026 / Revised: 27 March 2026 / Accepted: 2 April 2026 / Published: 3 April 2026

(This article belongs to the Special Issue Advanced Functional Materials Design and Computation)


Abstract

Unreinforced masonry walls exhibit limited resistance to lateral loads and, therefore, frequently require strengthening interventions. Carbon fiber reinforced polymer (CFRP) systems provide an efficient retrofit solution; however, current design procedures defined in structural guidelines require repetitive trial calculations to determine the necessary reinforcement amount. This study introduces a hybrid computational process that integrates metaheuristic optimization with symbolic regression to generate direct analytical equations for the estimation of the required CFRP area. First, a comprehensive database containing 1300 optimal strengthening scenarios was generated using the Jaya optimization algorithm under the constraints specified in ACI 440.7R and ACI 530. The resulting dataset was subsequently processed through symbolic regression using the PySR platform to identify explicit mathematical relationships between structural parameters and the optimum CFRP area. Most traditional machine learning approaches operate as black-box predictors. In contrast, the proposed approach generates interpretable closed-form expressions that can be used directly in engineering calculations. Two models were derived from the Pareto-optimal solution set. The first model is a simplified equation emphasizing algebraic simplicity. The second model prioritizes prediction accuracy. The simplified formulation achieved a coefficient of determination of approximately 0.992. The accuracy-focused model achieved a value above 0.997 with very low prediction errors. Validation studies with independent test samples showed that the obtained equations are reliable. The average error for the simplified model is below 4%, and for the high-accuracy model, it is approximately 2%. The results demonstrate that combining the optimization-generated datasets with symbolic regression makes it possible to obtain transparent design equations. 
These equations eliminate iterative design processes and provide a fast and reliable estimation tool for CFRP strengthening of masonry walls.

Keywords:

carbon fiber reinforced polymer (CFRP); masonry walls; Jaya algorithm; symbolic regression; PySR

1. Introduction

Unreinforced masonry (URM) walls remain ubiquitous in the global building stock, constituting a significant portion of existing structures worldwide, yet their substantial mass and inherently low tensile strength render them highly susceptible to lateral collapse under seismic events, wind, or blast loads. This brittleness and lack of reinforcement result in URM walls often failing catastrophically unless they are retrofitted, so strengthening of existing masonry is critical for safety and structural integrity [1,2].

Traditional retrofit schemes, including steel or concrete jacketing and shotcrete, can increase capacity, but these methods add significant dead weight and stiffness and are labor-intensive and invasive to apply. They can even exacerbate seismic forces due to the increased mass [3,4].

In recent decades, externally bonded fiber-reinforced polymer (FRP) composites have emerged as a superior alternative. Among FRP options, carbon-fiber FRP (CFRP) is particularly preferred for rigid-wall flexural and shear strengthening due to its exceptionally high elastic modulus, excellent fatigue resistance, and durability, outperforming glass (GFRP) and basalt (BFRP) systems in these applications [4,5,6,7].

Carbon-fiber composites offer unmatched strength-to-weight performance, but their premium cost makes optimization essential to balance safety and economy [8]. Unfortunately, standard design codes such as ACI 440 (e.g., 440.2R and 440.7R) rely on iterative “trial-and-error” procedures. Engineers must initially assume a reinforcement ratio, calculate the structural strength, and repeat the process until code compliance is achieved. ACI 440.1R explicitly directs engineers to use this time-consuming approach [9], which Bekdaş et al. describe as overly “cumbersome,” rendering manual optimization nearly impossible [10]. Consequently, the lack of a direct design formula makes finding an economical, code-compliant CFRP solution highly tedious without computational assistance.

Researchers have long employed optimization algorithms to address this design challenge. For instance, Rahman et al. developed a Genetic Algorithm-based optimization framework for reinforced concrete beams, aiming to minimize the combined cost of CFRP plates and adhesive while satisfying both serviceability and ultimate limit-state requirements [11].

Broader population-based algorithms have also been widely adopted to explore the highly nonlinear CFRP design space. Kayabekir et al. proposed an optimization approach based on the Jaya algorithm to determine the optimum placement of CFRP strips to increase the shear capacity of reinforced concrete beams. Their findings showed that the Jaya algorithm offers a competitive and computationally efficient alternative to previously used optimization techniques [12]. Yücel conducted a comparative optimization study showing that advanced methods such as the Flower Pollination Algorithm and Particle Swarm Optimization can effectively determine CFRP design parameters, with significant reductions in structural weight achieved while maintaining compliance with design constraints [13]. Expanding these optimization strategies to wall structures, Bekdaş et al. applied the Jaya algorithm to generate large datasets of optimum CFRP configurations for cantilever walls. However, while these metaheuristic methods successfully identify code-compliant, near-optimal layouts, their standalone computational cost remains relatively high [10].

In recent years, a shift from this approach to data-driven models has been observed. This shift aims to overcome the computational burden and limited generalization capacity of metaheuristic optimization methods used alone. Therefore, artificial neural networks (ANNs) and ensemble learning models have been developed to quickly and reliably predict CFRP design requirements for different structural scenarios.

For example, Kayabekir et al. trained an ANN on metaheuristic-generated data to predict the optimum CFRP amount and orientation for the shear strengthening of RC beams [14]. Zhang et al. utilized interpretable ensemble learning methods—such as gradient-boosted trees and random forests—to estimate the flexural capacity of FRP-strengthened beams and identify key variables through feature importance analysis [15]. Furthermore, Bekdaş et al. proposed a hybrid ANN-Jaya framework for cantilever walls, training an MLP model on 500 optimized examples to rapidly predict the CFRP area with approximately 3.7% error while maintaining full ACI code compliance [10].

However, metaheuristic algorithms have some practical limitations. In particular, high computational cost and the need for numerous repetitions to obtain reliable solutions are significant disadvantages. This can be time-consuming in large-scale engineering problems [16].

In contrast, artificial intelligence models such as ANN and ensemble learning can make rapid predictions after training. However, these models generally operate as “black box” systems. The internal structure and weight distributions of the model cannot be easily interpreted. Therefore, validating the results and using them directly in design processes can be difficult. Furthermore, this lack of transparency in black-box models is considered a significant problem in terms of reliability in engineering applications [17,18].

Therefore, there is a need for an approach that can both make rapid predictions and provide clear analytical expressions. Symbolic regression offers a bridge between these two approaches. This method aims to find human-readable mathematical expressions that explain the relationships within the data. Thus, instead of complex computational models, clear “white-box” equations are obtained [19].

For example, PySR is a tool developed to discover interpretable symbolic models from data [20]. When optimum CFRP design datasets are analyzed using PySR, closed-form equations can be obtained for key design variables such as the required fiber area or reinforcement ratio. These expressions remain transparent and applicable while maintaining the predictive power of data-driven methods.

In short, symbolic regression combines the speed of learning-based models with the interpretability of traditional design equations. Thus, it overcomes the limitations of both metaheuristic methods and black-box ANN models [19,20].

Recently, symbolic regression has demonstrated strong potential across various structural applications by providing compact, interpretable, and highly accurate formulas. In structural engineering, Sorour et al. utilized PySR to model damage initiation in hybrid FRP-steel joints, achieving higher accuracy than traditional regression [21]. Similarly, Megahed developed explicit prediction equations for the shear strength of reinforced concrete (RC) deep beams, offering superior transparency compared to opaque black-box models [22]. SR has also been effectively combined with existing design rules to predict the capacity of concrete-filled steel tube (CFST) columns, outperforming standard European EC4 and AISC code estimates [23]. Beyond structural components, SR is increasingly adopted in geotechnical and earthquake engineering to capture complex nonlinear behaviors. Pham et al. proposed a general SR-based formula for the soil compression index (Cc) [24], while Almasoudi et al. developed a highly accurate (R² ≈ 0.99), physically consistent model for soil-structure interface shear strength [25]. Furthermore, Diaz et al. and Ghosh and Debbarma successfully applied PySR to predict clay swelling pressure [26] and seismic amplification factors in open ground-story buildings [27], respectively, demonstrating that PySR consistently yields both exceptional predictive performance and fully interpretable mathematical expressions.

Previous studies have used metaheuristic optimization or machine learning models to estimate CFRP strengthening requirements. However, these methods either require iterative optimization processes or rely on prediction models that are difficult to interpret. Symbolic regression applications for direct CFRP design formulation are still limited. The originality of this work lies in the development of a hybrid process combining optimization and symbolic regression. This process transforms a large number of optimum design solutions into explicit engineering equations.

The main goal of the proposed hybrid process is to directly address this gap. In this context, clear and interpretable white-box design equations are proposed for flexural strengthening of URM walls using CFRP. For this purpose, a comprehensive dataset consisting of 1300 optimum design scenarios, fully compliant with ACI 440.7R and ACI 530 guidelines, was generated using the Jaya algorithm. Two transparent mathematical models that directly predict the required CFRP area (A_f) were then obtained using symbolic regression implemented via PySR. The first model offers a balanced and simplified structure designed for practical engineering applications. The second model focuses on high accuracy. The obtained models were validated with independent test cases. This study provides direct analytical equations by eliminating complex iterative design processes. Thus, it offers structural engineers a practical design tool that ensures both safety and economic efficiency.

2. Methodology

The main steps of this study can be summarized as follows:

A large dataset containing optimum CFRP strengthening solutions for masonry walls was generated using a metaheuristic optimization framework based on the Jaya algorithm.
Symbolic regression was used to determine the relationship between the structural parameters and the required CFRP area, and explicit analytical expressions were obtained.
Two interpretable design equations were developed.
The proposed formulas were validated with previously unseen design cases.
The developed equations offer a fast alternative to the traditional iterative calculation procedures recommended in design guidelines.

2.1. Jaya Algorithm

The Jaya algorithm is a powerful, population-based metaheuristic optimization method first proposed by Rao [28]. Its name is derived from the Sanskrit word Jaya, meaning “victory”. The algorithm is based on the principle that the best solution should be obtained by approaching the best candidate solution in the population while simultaneously moving away from the worst solution. Unlike other evolutionary algorithms such as Genetic Algorithms (GA) or Particle Swarm Optimization (PSO), this method does not require the adjustment of algorithm-specific control parameters such as crossover probability, mutation rate, or inertia weights. Therefore, the Jaya algorithm is described as a “parameter-free” method. The algorithm only needs two basic inputs: population size and number of iterations. This feature makes the method easy to use and computationally efficient for complex engineering problems [29].

The fundamental concept of Jaya combines the “survival of the fittest” principle with the collective intelligence of swarm-based methods. In every iteration, the algorithm identifies the “best” solution (global optimum for that iteration) and the “worst” solution to guide the search process, thereby maintaining a balance between the exploration of the search space and the exploitation of promising regions [30].

Mathematically, let f(x) be the objective function to be minimized. The algorithm considers a population size of n candidates (indexed k = 1, 2, …, n) and a set of m design variables (indexed j = 1, 2, …, m). At any iteration i, the value of the jth variable for the kth candidate solution, denoted as X_{j,k,i}, is updated according to Equation (1):

X'_{j,k,i} = X_{j,k,i} + r_{1} \left( X_{j,\mathrm{best},i} - \left| X_{j,k,i} \right| \right) - r_{2} \left( X_{j,\mathrm{worst},i} - \left| X_{j,k,i} \right| \right)

(1)

where

X_{j,k,i} is the current value of the jth variable for the kth candidate.

X_{j,best,i} is the value of the jth variable for the best solution obtained among the n candidates in the current iteration.

X_{j,worst,i} is the value of the jth variable for the worst solution obtained among the n candidates in the current iteration.

r₁ and r₂ are random numbers uniformly distributed in the range [0, 1].

During the optimization process, each newly generated vector constitutes a potential update to the population. A “greedy selection” strategy is then employed: the new solution is compared with the corresponding current solution. If the new vector yields a better (lower) objective function value, it replaces the current solution in the population; otherwise, the old solution is retained. This generate-and-test cycle repeats until the pre-specified stopping criterion (maximum number of iterations) is reached [30,31]. The flowchart of the algorithm is illustrated in Figure 1.
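The update rule in Equation (1) and the greedy selection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `jaya_minimize`, the sphere test function, the clipping of trial values to the bounds, and the small iteration budget are all choices made for the example.

```python
import random

def jaya_minimize(objective, bounds, pop_size=20, iterations=200, seed=0):
    """Minimal Jaya loop: move toward the best and away from the worst candidate."""
    rng = random.Random(seed)
    # Random initial population inside the variable bounds
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(iterations):
        best = pop[cost.index(min(cost))]
        worst = pop[cost.index(max(cost))]
        for k in range(pop_size):
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                x = pop[k][j]
                # Equation (1)
                new = x + r1 * (best[j] - abs(x)) - r2 * (worst[j] - abs(x))
                # Assumption: out-of-range values are clipped to the bounds
                trial.append(min(max(new, lo), hi))
            trial_cost = objective(trial)
            # Greedy selection: keep the trial only if it improves candidate k
            if trial_cost < cost[k]:
                pop[k], cost[k] = trial, trial_cost

    i_best = cost.index(min(cost))
    return pop[i_best], cost[i_best]

def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_x, best_f = jaya_minimize(sphere, bounds=[(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

Only the population size and the iteration count need to be chosen, which reflects the “parameter-free” character described above.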

In this study, the optimization process was executed with a population size of 20 and a maximum of 30,000 iterations to ensure convergence to the global optimum. The algorithm aimed to determine the optimal set of design variables, namely the CFRP strip width (W_f), strip spacing (S_f), and thickness (t_f), that minimizes the total CFRP area (A_f) while satisfying the structural requirements of ACI 440.7R-10 [32].

The physical model, including the wall cross-section, height, and the schematic layout of the externally bonded CFRP plates, is illustrated in Figure 2. The input parameters and their corresponding ranges used in the optimization process, which correspond to the structural dimensions and loads depicted in Figure 2, are detailed in Table 1.

It should be noted that the CFRP layout considered in this study consists of vertically aligned strips rather than diagonal configurations. Although diagonal CFRP applications are known to be more effective for enhancing shear resistance, the primary objective of this study is flexural strengthening of masonry walls.

In flexural behavior, the dominant tensile stresses develop along the vertical direction due to bending moments. Therefore, vertical CFRP strips are more effective in resisting these tensile forces and improving the flexural capacity of the wall. This configuration is also consistent with common design practices and ACI 440 recommendations for flexural strengthening using externally bonded FRP systems.

Accordingly, the selected vertical layout is intentionally adopted to represent the governing structural behavior and ensure compatibility with the underlying design assumptions.

During the optimization process, the objective function was defined to minimize the area of CFRP material used per unit length of the wall, calculated as Equation (2):

A_{f} = n \times t_{f} \times W_{f} \times \frac{1000}{S_{f}}

(2)

where n denotes the number of CFRP layers, which is fixed at one in the present study. All design variables are constrained within practical bounds to ensure feasible and constructible CFRP configurations. These side constraints, defining the admissible search space of the optimization algorithm, are presented in Equations (3)–(5). The lower and upper bounds for these design variables (W_f, S_f, and t_f) were strictly selected based on standard commercial availability and practical application constraints.

100 \leq W_{f} \leq 500

(3)

100 \leq S_{f} \leq 500

(4)

0.1 \leq t_{f} \leq 1.5

(5)
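As a short worked illustration, Equation (2) and the side constraints in Equations (3)–(5) can be written as two small Python functions. The function names are hypothetical, and the interpretation of the dimensions as millimetres is an assumption based on the factor 1000 in Equation (2).

```python
def cfrp_area_per_metre(t_f, w_f, s_f, n_layers=1):
    """Equation (2): CFRP area per metre of wall length."""
    return n_layers * t_f * w_f * 1000.0 / s_f

# Side constraints from Equations (3)-(5); units assumed to be mm
BOUNDS = {"w_f": (100.0, 500.0), "s_f": (100.0, 500.0), "t_f": (0.1, 1.5)}

def within_bounds(t_f, w_f, s_f):
    """Check that a candidate lies inside the admissible search space."""
    values = {"w_f": w_f, "s_f": s_f, "t_f": t_f}
    return all(lo <= values[name] <= hi for name, (lo, hi) in BOUNDS.items())

# Hypothetical candidate: 1.0 mm thick, 100 mm wide strips at 500 mm spacing
print(cfrp_area_per_metre(t_f=1.0, w_f=100.0, s_f=500.0))
print(within_bounds(t_f=1.0, w_f=100.0, s_f=500.0))
```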

The optimization was subjected to seven critical constraints (g1 to g7) derived from ACI 440.7R-10 and ACI 530 to ensure safety and serviceability [32,33]. Any candidate solution violating these constraints was penalized with a high objective function value. The constraints are given as follows:

Tensile Stress Constraint (g1): This constraint evaluates the flexural tensile stress (F_b) calculated under the applied moment. According to ACI 530, this stress is compared with the design modulus of rupture (Φ F_r), where a strength reduction factor of Φ = 0.6 is applied. The modulus of rupture (F_r) is determined through in situ testing or estimated from standards such as ASCE 41-06 or ACI 530. Usually, if $F_{b} > Φ F_{r}$, the unreinforced wall is deemed insufficient, necessitating the CFRP retrofit.

$g_{1} = F_{b} - Φ \times F_{r} > 0$

(6)
FRP Strength Constraint (g2): The effective tensile stress in the CFRP reinforcement (F_fe) must remain below its design tensile strength (F_fu) to avoid rupture failure.

$g_{2} = F_{fe} - F_{fu} < 0$

(7)
FRP Strain Constraint (g3): The effective strain in the CFRP (ε_fe) is limited to the design rupture strain (ε_fu) to ensure strain compatibility and prevent debonding.

$g_{3} = ε_{fe} - ε_{fu} < 0$

(8)
Masonry Strain Constraint (g4): To prevent crushing failure of the masonry in the compression zone, the maximum compressive strain (ε_m) must not exceed the ultimate usable strain (ε_mu), typically taken as 0.0025.

$g_{4} = ε_{m} - ε_{mu} < 0$

(9)
FRP Spacing Constraint (g5): This constraint limits the clear spacing between CFRP strips to ensure effective stress distribution and prevent local failure mechanisms.

$g_{5} = 3 \times t_{f} + W_{f} - S_{f} > 0$

(10)
FRP Stress Upper Limit Constraint (g6): In the FRP system, the maximum tensile force developed for each strip width must not exceed specific threshold values. This verification guarantees that the FRP material operates within its design strength limits. Therefore, the maximum force corresponding to each width in the FRP system must satisfy the required condition.

$g_{6} = t_{f} \times F_{fe} - 260 < 0$

(11)
Optimum FRP Area Constraint (g7): This final constraint ensures that the provided CFRP area (A_f,prov.) in the optimized solution is equal to or greater than the theoretically required area (A_f,req.).

$g_{7} = A_{f,\mathrm{prov.}} - A_{f,\mathrm{req.}} > 0$

(12)
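The penalty treatment of the constraints can be illustrated with a small Python sketch. It is a simplified sketch rather than the authors' code: the penalty magnitude `PENALTY` is an assumed value (the text specifies only “a high objective function value”), the function names are hypothetical, and only g5 and g6 are evaluated explicitly because the remaining constraints require the ACI section analysis, which is not reproduced here.

```python
PENALTY = 1.0e6  # assumed magnitude of the penalty value

# Required sign of each constraint as written in Equations (6)-(12)
REQUIRED_SIGN = {"g1": ">", "g2": "<", "g3": "<", "g4": "<",
                 "g5": ">", "g6": "<", "g7": ">"}

def explicit_constraints(t_f, w_f, s_f, f_fe):
    """Evaluate the two constraints that depend only on the design variables and F_fe."""
    return {
        "g5": 3.0 * t_f + w_f - s_f,  # Equation (10)
        "g6": t_f * f_fe - 260.0,     # Equation (11)
    }

def is_satisfied(name, value):
    """Compare a constraint value with the sign required in the text."""
    return value > 0.0 if REQUIRED_SIGN[name] == ">" else value < 0.0

def penalized_objective(area, constraint_values):
    """Return the CFRP area if all supplied constraints hold, else the penalty."""
    feasible = all(is_satisfied(name, g) for name, g in constraint_values.items())
    return area if feasible else PENALTY

# Hypothetical candidate; F_fe would normally come from the section analysis
g = explicit_constraints(t_f=0.5, w_f=300.0, s_f=250.0, f_fe=400.0)
print(g, penalized_objective(area=600.0, constraint_values=g))
```

Returning a large constant for infeasible candidates means that the greedy selection of the Jaya algorithm never prefers them over a feasible solution.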

After evaluating the objective function and penalizing constraint violations, the comprehensive dataset of 1300 optimized design scenarios was generated. To illustrate the range and nature of the generated variables, a representative sample consisting of seven randomly selected cases from the dataset is presented in Table 2. This dataset subsequently served as the foundational input for applying Symbolic Regression via the PySR package. By processing the data, PySR discovered explicit, closed-form mathematical equations to predict the required optimum CFRP area (A_f) based directly on the structural parameters of the masonry walls. To validate the generalization capability of the proposed analytical equations, ten independent test cases were generated using the same optimization framework. These specific cases were strictly excluded from the regression process and were utilized solely as unseen benchmarks to evaluate the models’ accuracy as detailed in Table 3.

2.2. Symbolic Regression and PySR

Symbolic regression (SR), a data-driven method, is employed in this study. It aims to find compact mathematical expressions that reveal relationships in datasets, and it differs from other regression and machine learning methods in its focus on interpretability and structural exploration. Traditional machine learning models (including deep neural networks and ensemble methods such as random forests) often operate as “black boxes” [34,35].

Black-box models provide high predictive power but are not transparent. This can make scientific interpretation and model validation difficult. In contrast, standard parametric regression methods require the researcher to specify the functional form, for example a linear relationship or a polynomial of a certain degree. If the actual data structure differs from these assumptions, bias is introduced into the model. Symbolic regression overcomes these limitations by using evolutionary algorithms inspired by the principles of natural selection. In this process, a diverse population of candidate equations is created. These equations are usually represented by tree structures and are iteratively evolved using genetic operations. Mutation makes random changes to the equations to create new variations. Crossover allows the transfer of beneficial traits by combining the sub-expressions of successful individuals. Selection mechanisms eliminate low-performing candidates and allow the advancement of better solutions [36]. The result of this process is clear and interpretable “white-box” equations. Because these equations are directly understandable, they can be easily integrated into theoretical frameworks and engineering designs [37].

PySR is a high-performance, open-source implementation of symbolic regression. It is specifically designed for scientific and engineering applications requiring speed and ease of use. The tool is built on the Julia programming language due to its powerful features in numerical computation. Thanks to Julia’s features such as just-in-time compilation and parallel processing support, PySR can perform very fast evolutionary search processes [20].

PySR’s Python 3.12 interface offers a user-friendly structure. It is also compatible with popular data science libraries such as scikit-learn and pandas. Thus, it can be easily integrated into existing data analysis workflows [38]. The evolutionary core of the tool uses a multi-population strategy. Candidate equations are distributed across populations to increase diversity and to improve convergence performance. The basic mechanisms include mutation, crossover, and tournament selection [20]. This structure also allows for the definition of custom operators and constraints. Thus, domain-specific restrictions can be imposed, such as enforcing physical units or prohibiting certain mathematical operations [39].

PySR uses the Pareto front to select the best models. This approach addresses the natural trade-off between accuracy and complexity in symbolic expressions. Accuracy is evaluated using loss functions, which measure how well the equation fits the data. These metrics may include root mean squared error (RMSE) or custom domain-specific loss functions [20]. Complexity is evaluated using different criteria, including the number of operators, tree depth, and overall expression length. This approach aims to reduce overfitting and increase the generalizability of the model. This is particularly important in fields such as structural engineering, where overly complex models may be physically difficult to interpret [40]. As a result of Pareto optimization, PySR produces a set of ordered models consisting of non-dominated equations. Users can choose from these models according to their needs. For example, simpler models may be preferred to facilitate analytical processes in design [41]. This multi-objective optimization approach is based on the evolutionary computation literature. Pareto dominance ensures the preservation of different solutions. Furthermore, various test studies have shown that PySR is successful in rediscovering known physical laws from noisy data [20,34].
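The notion of non-dominated models can be made concrete with a short Python sketch. It illustrates the dominance rule only and is not PySR's internal procedure; the function name `pareto_front` and the (complexity, loss) pairs are hypothetical.

```python
def pareto_front(models):
    """Keep (complexity, loss) pairs that no other model beats on both criteria."""
    front = []
    for c, loss in models:
        # A model is dominated if another is no worse in both criteria
        # and strictly better in at least one of them
        dominated = any(
            (c2 <= c and l2 <= loss) and (c2 < c or l2 < loss)
            for c2, l2 in models
        )
        if not dominated:
            front.append((c, loss))
    return sorted(front)

# Hypothetical candidates given as (complexity, loss)
candidates = [(3, 9.0), (5, 4.0), (7, 4.5), (9, 1.2), (11, 1.2), (13, 0.8)]
print(pareto_front(candidates))
```

In this example, the candidate with complexity 7 is discarded because a simpler model already achieves a lower loss, and the candidate with complexity 11 is discarded because it adds nodes without reducing the loss.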

In this study, a dataset consisting of 1300 design scenarios optimized with the Jaya algorithm was randomly divided into two subsets. 80% of the dataset was used for training and 20% for testing. PySR models were trained using only the training data. The test dataset was kept completely separate to evaluate the generalization ability of the models.
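A random 80/20 partition of this kind can be sketched as follows. The helper name `train_test_split_indices` and the seed value are assumptions for illustration; the text does not state which routine or seed was used for the split.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = train_test_split_indices(1300)
print(len(train_idx), len(test_idx))
```

Because the test indices are set aside before any model fitting, the test samples play no role in the symbolic regression search.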

The main objective of this study is to develop a closed-form equation that directly predicts the required CFRP area (A_f) based on the following structural parameters: wall thickness (t), masonry compressive strength (F_m), applied axial load (P_u), ultimate moment demand (M_u), CFRP rupture strength (F_fu′), and CFRP modulus of elasticity (E_f).

To ensure the reproducibility of the study and to obtain physically interpretable equations, the PySR algorithm was implemented in the Python environment with specific hyperparameters. The mathematical search space was constrained to fundamental binary operators and specific unary operators. Furthermore, the maximum structural complexity of the generated equations was limited to 30 nodes (maxsize = 30). This limitation was implemented to prevent overfitting. The evolutionary search process was run over 40 iterations (niterations = 40). The algorithm used 15 subpopulations (populations = 15). Each subpopulation contained 33 individuals (population_size = 33). To ensure the stochastic evolutionary process is fully reproducible, a fixed random state value was used (random_state = 42). During the study, PySR generated several candidate models on the Pareto front.
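The stated hyperparameters could be collected as in the following sketch. The operator lists are assumptions, since the text does not enumerate them (the logarithm and square root are included because they appear in the accuracy-focused equation), and the helper `build_regressor` is hypothetical. Depending on the PySR version, a fixed `random_state` may give reproducible runs only when the search is also run in a deterministic, serial mode; the PySR documentation should be consulted for the exact options.

```python
# Hyperparameters stated in the text
PYSR_SETTINGS = {
    "niterations": 40,
    "populations": 15,
    "population_size": 33,
    "maxsize": 30,
    "random_state": 42,
    # Assumed operator sets: the text does not list the operators explicitly
    "binary_operators": ["+", "-", "*", "/"],
    "unary_operators": ["log", "sqrt", "square"],
}

def build_regressor(model_selection="best"):
    """Create a PySRRegressor; requires the pysr package and its Julia backend."""
    # Imported lazily so that the settings can be inspected without pysr installed
    from pysr import PySRRegressor
    return PySRRegressor(model_selection=model_selection, **PYSR_SETTINGS)

# Typical use, with X holding the columns t, F_m, P_u, M_u, F_fu', E_f
# and y holding the optimum A_f from the Jaya runs:
#   model = build_regressor("best")
#   model.fit(X_train, y_train)
#   print(model.sympy())
print(PYSR_SETTINGS)
```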

Two different mathematical models were selected from this set of non-dominated models. The first model was obtained using a balanced selection strategy (model_selection = “best”). This approach optimizes the balance between prediction accuracy and algebraic complexity. The second model was selected using an accuracy-focused strategy (model_selection = “accuracy”). This approach minimizes only the loss function and does not consider equation complexity. The explicit mathematical expressions and statistical performance of the selected models are analyzed in detail in the next section.

3. Results and Discussion

This section presents the results of the symbolic regression analysis performed using the PySR framework on the dataset generated by the Jaya algorithm. The explicit closed-form design equations derived from the Pareto front are introduced, and these equations demonstrate the balance between algebraic simplicity and predictive accuracy. The statistical performance of the proposed models is evaluated in detail using both the training and testing datasets. The generalization capability of the obtained formulas is validated using independent and previously unseen test data. A global sensitivity analysis is performed to determine the relative influence of structural and material parameters on the required CFRP area.

3.1. Proposed Explicit Design Equations and Pareto Front Analysis

In symbolic regression, the optimization process does not produce a single formula; instead, it generates a diverse population of models consisting of different mathematical expressions. These candidate models are evaluated and distributed along the Pareto front. The Pareto front visualizes the natural trade-off between model accuracy and structural complexity. Model accuracy is measured by the value of the loss function.

Structural complexity is determined by the number of mathematical nodes, operators, and constants. In this study, two different closed-form equations were obtained to predict the optimum CFRP area (A_f) required for the flexural strengthening of masonry walls. Table 4 shows the candidate expressions evaluated using a balanced selection strategy. Table 5 presents the expressions obtained when only prediction accuracy is prioritized.

In the context of the PySR algorithm, model complexity is defined as the total number of nodes in the mathematical expression tree. It is calculated by summing the predefined weights of all constituent elements in the formula, where each variable, constant, and mathematical operator (e.g., +, −, ×, ÷, log, sqrt) is counted as a separate node with a baseline weight of 1. Therefore, the complexity values reported in Table 4 and Table 5 directly represent the structural size of each candidate expression. This metric is fundamentally used in the Pareto optimization process to balance predictive accuracy against equation simplicity; lower complexity values correspond to simpler, highly interpretable equations, while higher values indicate more sophisticated mathematical structures aimed at capturing deeper nonlinear mechanics.
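The node-counting definition can be illustrated with a small Python sketch in which an expression tree is stored as nested tuples. The representation, the function name `count_nodes`, and the example expression are hypothetical; the sketch does not reproduce the complexity scores reported in the tables, because the raw expression trees returned by PySR are not given in the text.

```python
def count_nodes(expr):
    """Count nodes in an expression tree given as nested tuples (op, child, ...)."""
    if isinstance(expr, tuple):
        # One node for the operator plus the nodes of each operand
        return 1 + sum(count_nodes(child) for child in expr[1:])
    # A variable name or a numeric constant counts as a single node
    return 1

# Hypothetical expression: 2.5 * sqrt(x) + y
expr = ("+", ("*", 2.5, ("sqrt", "x")), "y")
print(count_nodes(expr))
```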

The first proposed equation, chosen as Model 1, was extracted from the Pareto front utilizing a “balanced” selection criterion (model_selection = “best”). It evaluates candidate models based on a penalized score that strongly favors algebraic simplicity while maintaining high predictive precision. This model has a complexity score of 18. Despite its slightly higher node count, Model 1 is practically advantageous because its structure is composed strictly of fundamental arithmetic operations and a straightforward quadratic term, completely avoiding transcendental or logarithmic functions. The explicit formula for the balanced model is presented in Equation (13):

A_{f} = 11.1 \times \left( 1 + \frac{0.0968 \times M_{u}}{F_{fu}' \times t} \right)^{2} - 8.46

(13)

The second proposed equation, chosen as Model 2, was derived from the purely accuracy-driven Pareto front (Table 5) using an “accuracy-focused” selection strategy (model_selection = “accuracy”). It prioritizes the absolute minimization of the prediction error. Interestingly, this approach yields a highly accurate formulation with a slightly lower node complexity score of 17. However, to capture the nonlinear relationships within the dataset and maximize the coefficient of determination, this model incorporates more advanced unary operators, specifically the natural logarithm and square root functions. The explicit formula for the accuracy-focused model is expressed in Equation (14), where A_f is the required area of the CFRP reinforcement (mm²/m), M_u denotes the ultimate moment demand (N·mm/m), F_fu′ represents the CFRP rupture strength (MPa), t is the thickness of the unreinforced masonry wall (mm), and E_f corresponds to the CFRP modulus of elasticity (MPa).

A_{f} = M_{u} (1.16 \times 10^{- 7} + \frac{3.50}{{F_{f u}}^{'} \times (t + \log (\sqrt{E_{f}}) - 13.7)}) - 2.60

(14)
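A comparable sketch for Equation (14) is given below, again with illustrative function and argument names. The logarithm is taken as the natural logarithm, as stated in the text. Because the coefficients of Equation (14) are printed to three significant figures, values computed this way can differ slightly from the Model 2 predictions listed in Table 8.

```python
import math


def cfrp_area_model2(m_u, f_fu, t, e_f):
    """Equation (14): required CFRP area A_f in mm^2/m.

    m_u  : ultimate moment demand M_u, N*mm/m
    f_fu : CFRP rupture strength F_fu', MPa
    t    : wall thickness, mm
    e_f  : CFRP modulus of elasticity E_f, MPa
    """
    # math.log is the natural logarithm
    shifted_thickness = t + math.log(math.sqrt(e_f)) - 13.7
    return m_u * (1.16e-7 + 3.50 / (f_fu * shifted_thickness)) - 2.60


# Independent test Case 1 (Table 3)
a_f = cfrp_area_model2(m_u=5_440_000, f_fu=3300, t=295, e_f=155_000)
print(f"{a_f:.2f} mm^2/m")
```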

Both models autonomously eliminated the applied axial load (P_u) and the masonry compressive strength (F_m) from their final expressions. The evolutionary algorithm determined that these parameters possessed a negligible impact on the optimal CFRP flexural area relative to the dominant variables (M_u, F_fu′, and t).

3.2. Predictive Performance and Statistical Accuracy

In order to numerically evaluate the reliability and generalization ability of the proposed symbolic regression models, 1300 optimal design scenarios were used. The dataset was divided into 80% training (1040 samples) and 20% testing (260 samples). Prediction performance was evaluated using four standard statistics: Coefficient of Determination (R²), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The performance of the balanced model (Model 1) and the accuracy-focused model (Model 2) are summarized in Table 6.
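The four statistics can be written out explicitly. The sketch below uses only the Python standard library and, since the 1300-sample dataset is not reproduced here, applies the metrics to the ten independent cases of Table 7 as a stand-in. The resulting values therefore describe that small sample only and are not the Table 6 figures.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (%) for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((yt - mean_true) ** 2 for yt in y_true)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100.0 * sum(abs(r / yt) for r, yt in zip(residuals, y_true)) / n,
    }


# Jaya optimum and Model 1 prediction for the ten cases in Table 7 (mm^2/m)
optimum = [19.18, 20.06, 27.80, 25.37, 40.87, 13.44, 12.32, 30.04, 32.01, 19.14]
model1_pred = [17.90, 19.01, 27.42, 24.47, 41.50, 13.40, 12.38, 27.99, 30.66, 18.69]

metrics = regression_metrics(optimum, model1_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# The MAPE comes out close to the 3.26% average error reported in Table 7.
```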

As seen in Table 6, both models make very good predictions. Model 1 achieved R² = 0.9913 and R² = 0.9927 on the training and test sets, respectively. The small difference between training and testing indicates that the model captures real physical relationships without memorizing the training data. Model 2 provided higher accuracy; in both datasets, R² > 0.997 and the test MAPE was only 1.32%.

In order to further demonstrate the reliability of the predictions, residual analysis was performed on the entire dataset (1300 samples) using the accuracy-focused model (Model 2). The mean error was found to be almost zero (µ = −0.000163) and the standard deviation was low (σ = 0.3183). The lowest and highest residuals ranged from −0.9040 to 0.9951. The near-zero mean indicates that the equation is unbiased and does not systematically overestimate or underestimate the CFRP area. This unbiasedness ensures that the designs are both safe and economical.
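The same bias check can be sketched on a small scale. The example below summarizes the residuals of the ten independent cases in Table 8 rather than the full 1300-sample dataset, and it assumes the sign convention residual = optimum − prediction, which the text does not state explicitly.

```python
import statistics

# Jaya optimum and Model 2 prediction for the ten cases in Table 8 (mm^2/m)
optimum = [19.18, 20.06, 27.80, 25.37, 40.87, 13.44, 12.32, 30.04, 32.01, 19.14]
model2_pred = [18.41, 19.64, 28.51, 25.40, 40.53, 13.29, 12.76, 31.19, 31.39, 19.33]

# Assumed convention: positive residual means the equation under-predicts
residuals = [o - p for o, p in zip(optimum, model2_pred)]

summary = {
    "mean": statistics.fmean(residuals),
    "std": statistics.pstdev(residuals),  # population standard deviation
    "min": min(residuals),
    "max": max(residuals),
}
for name, value in summary.items():
    print(f"{name}: {value:+.4f}")
```

A mean close to zero with residuals of both signs is the pattern expected from an unbiased predictor.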

3.3. Validation on Unseen Data (Independent Test Cases)

The statistics obtained on the 20% test split show that the models are consistent. However, assessing their true generalization ability requires testing with completely independent data. For this purpose, ten new design scenarios were created using the Jaya algorithm (Table 3). These examples were strictly isolated from the original 1300 examples and were not seen by the PySR algorithm during training or model selection.

The optimum CFRP areas (A_f) found with the Jaya algorithm were compared with predictions made using two proposed equations. Table 7 shows the results for the algebraically simple balanced model (Model 1).

According to Table 7, the balanced equation successfully predicted the required CFRP area. The average error was 3.26%. Even the highest error (6.82%, Case 8) is within structural tolerances. Since Model 1 only uses basic mathematical operations, this accuracy is very useful for fast and manual calculations.

The same ten examples were also tested with the accuracy-focused equation (Model 2). The results are shown in Table 8.

The results in Table 8 clearly illustrate the superior predictive precision of Model 2. By incorporating logarithmic and square root operators to better capture the deep non-linear mechanics of the masonry walls, the average prediction error was substantially reduced to 2.10%. Remarkably, for specific scenarios such as Case 4, Case 5 and Case 10, the percentage error plummeted to less than 1%.

These validation outcomes indicate that both derived models are robust within the investigated design domain and suitable for practical application. Structural engineers can employ Model 1 for straightforward preliminary design calculations, or use Model 2 within computational spreadsheets when higher precision and material optimization are priorities.

It is important to explicitly position the proposed framework as a data-driven engineering tool. The primary objective of the derived PySR equations is to serve as explicit surrogate models for the established ACI 440.7R and ACI 530 design guidelines. Consequently, the validation process in this study evaluates the models’ ability to replicate the exact iterative solutions dictated by these codes, rather than directly predicting standalone experimental test results. Since the ACI code provisions are inherently derived from, and rigorously calibrated against, comprehensive experimental databases of masonry structures, the proposed equations implicitly inherit this robust empirical validity. This ensures that the derived formulas remain structurally safe, reliable, and practically applicable for real-world design scenarios without the need for additional independent experimental calibration.

3.4. Feature Sensitivity Analysis

To thoroughly understand the governing mechanics behind the CFRP strengthening design and to quantify the global importance of each input parameter on the predicted CFRP area (A_f), three complementary sensitivity analysis approaches were employed: (i) global gradient-based sensitivity using mean absolute partial derivatives, (ii) local One-At-a-Time (OAT) elasticity analysis, and (iii) variance-based Sobol global sensitivity analysis (Figure 3, Figure 4 and Figure 5).

The use of multiple independent methods ensures robustness and avoids methodological bias, as each technique quantifies a different aspect of input–output dependency. The gradient-based global sensitivity analysis (Figure 3) revealed that the variable t exhibited the highest absolute sensitivity, accounting for approximately 92.8% and 93.0% of the total sensitivity in Model 1 (balanced) and Model 2 (accuracy-focused), respectively, while F_fu′ contributed approximately 7%, and all other variables showed negligible gradients. This result is mathematically consistent with the symbolic model equations, where t appears in the denominator of nonlinear terms, causing relatively large absolute output changes per unit variation. However, it is important to emphasize that gradient-based sensitivity reflects absolute changes and is therefore influenced by the scale and units of the variables.
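The scale dependence of this measure can be seen in a minimal sketch that estimates mean absolute partial derivatives of Equation (13) by central finite differences. Uniform sampling over the Table 1 ranges, the sample size, the seed, and the step size are assumptions made for illustration, so the resulting shares are indicative only.

```python
import random

# Input ranges from Table 1 (only the variables appearing in Equation (13))
RANGES = {"M_u": (2.0e6, 1.0e7), "F_fu": (2500.0, 4500.0), "t": (150.0, 350.0)}


def model1(x):
    """Equation (13) evaluated on a dict of inputs."""
    ratio = 0.0968 * x["M_u"] / (x["F_fu"] * x["t"])
    return 11.1 * (1.0 + ratio) ** 2 - 8.46


def partial(f, x, name, rel_step=1e-4):
    """Central finite-difference estimate of df/d(name) at point x."""
    h = rel_step * x[name]
    up = dict(x, **{name: x[name] + h})
    down = dict(x, **{name: x[name] - h})
    return (f(up) - f(down)) / (2.0 * h)


def gradient_shares(f, ranges, n_samples=2000, seed=0):
    """Share of each input in the summed mean absolute partial derivatives."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in ranges}
    for _ in range(n_samples):
        x = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for name in ranges:
            totals[name] += abs(partial(f, x, name))
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}


shares = gradient_shares(model1, RANGES)
for name, share in shares.items():
    print(f"{name}: {100 * share:.2f}%")
```

Because t is numerically the smallest input, a unit change in t moves the output far more than a unit change in F_fu′ or M_u, which is why t dominates this ranking.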

To overcome this limitation, scale-independent elasticity analysis was performed. The elasticity results demonstrated that t, F_fu′, and M_u all exhibited similar relative influence, with elasticity values exceeding unity in both models (e.g., Model 2: t = 1.211, F_fu′ = 1.171, M_u = 1.145), indicating that proportional changes in these variables produce comparable proportional changes in the output. This finding confirms that the system behavior is governed collectively by these three variables rather than by a single dominant parameter.
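A local elasticity, defined as (∂A_f/∂x)·(x/A_f), can be sketched for Equation (13) as follows. The evaluation point (the midpoints of the Table 1 ranges) and the finite-difference step are assumptions, and the sketch reports signed values, whereas the figures quoted above are magnitudes for Model 2.

```python
def model1(m_u, f_fu, t):
    """Equation (13): required CFRP area A_f in mm^2/m."""
    ratio = 0.0968 * m_u / (f_fu * t)
    return 11.1 * (1.0 + ratio) ** 2 - 8.46


def elasticity(f, point, name, rel_step=1e-4):
    """Local elasticity (dA/dx) * (x / A) by central finite differences."""
    x0 = point[name]
    h = rel_step * x0
    up = f(**dict(point, **{name: x0 + h}))
    down = f(**dict(point, **{name: x0 - h}))
    return (up - down) / (2.0 * h) * x0 / f(**point)


# Assumed evaluation point: midpoints of the Table 1 ranges
point = {"m_u": 6.0e6, "f_fu": 3500.0, "t": 250.0}

elasticities = {name: elasticity(model1, point, name) for name in point}
for name, value in elasticities.items():
    print(f"{name}: {value:+.3f}")
```

In Equation (13) the three inputs enter only through the ratio M_u/(F_fu′·t), so at any point the elasticities of t and F_fu′ are equal and are the negative of the elasticity of M_u.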

In particular, the variance-based Sobol global sensitivity analysis showed that the variable contributing most to the output variance was M_u. The total-order Sobol indices of M_u for Model 1 and Model 2 were approximately 0.609 and 0.633, respectively, followed by t and F_fu′. Sobol analysis provides the most complete measure of global importance because it measures the contribution of each variable to the total output variance across the entire input range. The dominance of M_u indicates that variations in this parameter explain the largest portion of the output variance; its small gradient-based sensitivity is therefore misleading and reflects scale effects rather than low importance.
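Total-order indices of this kind can be approximated with a short Monte Carlo sketch. The example below applies the Jansen estimator to Equation (13), assuming independent uniform inputs over the Table 1 ranges. The sampling scheme, sample size, and seed are illustrative choices and not necessarily those used in the study, so the estimates will only approximate the reported values.

```python
import random

# Input ranges from Table 1 (only the variables appearing in Equation (13))
RANGES = {"M_u": (2.0e6, 1.0e7), "F_fu": (2500.0, 4500.0), "t": (150.0, 350.0)}


def model1(x):
    """Equation (13) evaluated on a dict of inputs."""
    ratio = 0.0968 * x["M_u"] / (x["F_fu"] * x["t"])
    return 11.1 * (1.0 + ratio) ** 2 - 8.46


def total_order_sobol(f, ranges, n=4000, seed=0):
    """Jansen estimator: ST_i = E[(f(A) - f(A_B^i))^2] / (2 Var(Y))."""
    rng = random.Random(seed)
    names = list(ranges)

    def draw():
        return {name: rng.uniform(*ranges[name]) for name in names}

    sample_a = [draw() for _ in range(n)]
    sample_b = [draw() for _ in range(n)]
    f_a = [f(x) for x in sample_a]
    mean = sum(f_a) / n
    variance = sum((y - mean) ** 2 for y in f_a) / n

    indices = {}
    for name in names:
        squared_diff = 0.0
        for x_a, x_b, y_a in zip(sample_a, sample_b, f_a):
            # Copy of the A-sample with only this input taken from the B-sample
            x_mixed = dict(x_a, **{name: x_b[name]})
            squared_diff += (y_a - f(x_mixed)) ** 2
        indices[name] = squared_diff / (2.0 * n * variance)
    return indices


sobol_total = total_order_sobol(model1, RANGES)
for name, value in sobol_total.items():
    print(f"ST[{name}] = {value:.3f}")
```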

The consistency between the Sobol indices, the elasticity analysis, and the mathematical structure of the symbolic equations strongly supports the idea that the models are governed by physically meaningful relationships rather than numerical artifacts or overfitting. Furthermore, the remaining variables (F_m, P_u, and E_f) showed negligible sensitivity in all three independent methods, confirming that these variables do not significantly affect model estimates within the studied domain.

Overall, the agreement of results from different independent sensitivity analysis techniques indicates that the predictive behavior of the proposed models is primarily controlled by M_u, t, and F_fu′. This confirms the internal consistency, interpretability, and physical meaning of the derived symbolic relationships.

4. Discussion

In current applications, determining the required CFRP area (A_f) largely relies on the ACI 440.7R-10 and ACI 530 design regulations. However, these methods naturally require a rather laborious trial-and-error process. This iterative process is not only time-consuming but also makes it practically difficult to effectively optimize expensive CFRP materials. The explicit mathematical equations obtained in this study directly eliminate this iterative burden. Thus, structural engineers can calculate the optimum reinforcement area in a single, direct mathematical step. This approach ensures both structural safety and material economy.

To clearly demonstrate the novelty of this research, the proposed method is presented comparatively with existing studies on structural strengthening in Table 9. Early research mostly relied on metaheuristic algorithms to overcome the manual use of design regulations. Later, Artificial Neural Networks (ANNs) and ensemble models were widely proposed in the civil engineering literature to make rapid predictions and reduce design iterations. However, as seen in Table 9, these machine learning models are mostly opaque “black box” systems. These models require the use of complex weighting matrices in computational environments. This makes their validation and reliability quite difficult, especially in structural engineering applications where life safety is critical.

Unlike previous black-box methods, the proposed PySR method transforms highly nonlinear structural mechanical relationships into transparent, analytical “white-box” formulas. Once these formulas are obtained, they do not depend on any software environment. From a practical standpoint, the two proposed models offer engineers a flexible design tool. Model 1 (the balanced equation) consists only of basic arithmetic operations. Therefore, it is suitable for rapid preliminary sizing, field applicability checks, and manual calculations. In contrast, Model 2 (the accuracy-focused equation) includes logarithmic and square root functions and achieves an R² above 0.997. This model is particularly suitable for the final design phase. Although it involves more advanced functions, it can be easily integrated into standard engineering spreadsheets (e.g., Microsoft Excel) or automated structural design software. This supports high accuracy and material economy.

The physical validity of the proposed transparent models is strongly supported by the study’s findings, demonstrating clear consistency with actual structural performance behavior. As shown by the perturbation-based sensitivity analysis (Section 3.4), the PySR algorithm automatically excluded the axial load (P_u) and masonry compressive strength (F_m) from the final explicit equations. This outcome is physically meaningful and consistent with the mechanics of CFRP flexural strengthening. The required CFRP area primarily represents a tensile demand associated with bending behavior.

While the axial load (P_u) introduces a pre-compression effect that can delay tensile cracking, its influence within the investigated design domain is relatively limited compared to the dominant effect of the ultimate moment (M_u). Furthermore, due to ACI-based design constraints that limit the maximum compressive strain to prevent brittle masonry crushing, the feasible solutions are inherently restricted to tension-controlled behavior. Consequently, the magnitude of F_m does not govern the required CFRP area within this framework.

Accordingly, the models correctly assign the highest sensitivity to the ultimate moment demand (M_u), CFRP rupture strength (F_fu′), and wall thickness (t). In such systems, the tensile demand is primarily resisted by the CFRP reinforcement, rendering P_u and F_m secondary. As expected from established structural mechanics principles, the required CFRP area increases with increasing moment demand and decreases with higher CFRP strength, which directly aligns with ACI-based design provisions. These results confirm that the proposed equations are not merely data-driven curve fits, but physically interpretable and reliable representations of real structural behavior.

However, an inherent limitation of this data-driven approach is that the applicability of the derived equations is strictly bounded by the parameter ranges of the generated dataset (which are comprehensively detailed in Table 1) and the assumptions of the underlying ACI codes. Therefore, extrapolating these equations beyond this specific design domain—such as applying them to vastly different material strengths, wall dimensions, or loading conditions—should be performed with caution.
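One simple safeguard against unintended extrapolation is to check the inputs against the Table 1 ranges before evaluating either equation. The helper below is a hypothetical sketch, and its name and dictionary keys are chosen here for illustration.

```python
# Parameter ranges of the generated dataset (Table 1)
TABLE1_RANGES = {
    "t": (150.0, 350.0),            # wall thickness, mm
    "F_m": (5.0, 25.0),             # masonry compressive strength, MPa
    "P_u": (5000.0, 10_000.0),      # axial load, N/m
    "M_u": (2.0e6, 1.0e7),          # ultimate moment demand, N*mm/m
    "F_fu": (2500.0, 4500.0),       # CFRP rupture strength, MPa
    "E_f": (140_000.0, 170_000.0),  # CFRP modulus of elasticity, MPa
}


def out_of_range(inputs, ranges=TABLE1_RANGES):
    """Return the names of inputs that fall outside the dataset ranges."""
    return [
        name
        for name, value in inputs.items()
        if not (ranges[name][0] <= value <= ranges[name][1])
    ]


# Independent test Case 1 (Table 3) lies inside the domain
case1 = {"t": 295, "F_m": 11, "P_u": 5000, "M_u": 5_440_000,
         "F_fu": 3300, "E_f": 155_000}
print(out_of_range(case1))               # []
print(out_of_range(dict(case1, t=400)))  # ['t']
```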

Beyond the physical interpretation, the resulting equations demonstrate strong generalizability. In a validation study on independent and previously unseen data (Section 3.3), the accuracy-focused equation (Model 2) showed a very low deviation, with an average error of 2.10%. The balanced equation (Model 1) provided an average error of 3.26%, which is acceptable for engineering applications. These small deviations indicate that the models do not merely memorize the training data but learn real structural mechanical relationships. Consequently, this method provides structural engineers with a reliable tool that consists of clear mathematical expressions and is computationally very fast. This tool contributes to the development of safe, economical, and regulatory-compliant solutions in the strengthening of unreinforced masonry structures.

5. Conclusions

This study presents a novel “white-box” artificial intelligence framework that combines the Jaya optimization algorithm with symbolic regression (PySR) to derive explicit closed-form design equations for the flexural strengthening of unreinforced masonry walls using CFRP. By generating a comprehensive dataset consisting of 1300 optimized design scenarios, the proposed approach eliminates the iterative trial-and-error procedures required by ACI guidelines while overcoming the lack of transparency associated with conventional black-box machine learning models.

The developed explicit equations demonstrated very high predictive accuracy across all statistical measures. The accuracy-focused model (Model 2) achieved a testing R² of 0.9978, RMSE of 0.3186, MAE of 0.2248, and MAPE of only 1.32%. The balanced model (Model 1) also exhibited strong predictive performance, with a testing R² of 0.9927 and a MAPE of 2.85%. These results confirm the ability of symbolic regression to accurately capture complex nonlinear relationships governing CFRP strengthening behavior.

Validation using ten completely independent structural design scenarios further confirmed the robustness and generalization capability of the proposed models. Model 2 predicted the optimum CFRP area with an average error of 2.10%, while Model 1 maintained a practical error level of 3.26%. The maximum observed deviation was 6.82% (Model 1, Case 8), which falls within acceptable engineering tolerances. These findings indicate that the proposed equations successfully capture the underlying structural mechanics rather than merely fitting the training data.

Perturbation-based sensitivity analysis revealed that the PySR algorithm effectively identifies the governing structural parameters. The resulting models consistently showed that the ultimate moment demand (M_u), wall thickness (t), and CFRP rupture strength (F_fu′) are the dominant variables controlling the required CFRP area. In contrast, axial load (P_u) and masonry compressive strength (F_m) were automatically excluded due to their negligible influence within the investigated design domain. This behavior is consistent with the tension-controlled flexural response of CFRP-strengthened masonry walls.

In addition to their predictive accuracy, the proposed models offer significant practical advantages. The derived equations exhibit relatively low mathematical complexity (18 for Model 1 and 17 for Model 2), enabling straightforward implementation in both manual calculations and standard engineering software environments. Model 1 is particularly suitable for rapid preliminary design and field applications due to its simplicity, while Model 2 provides high-precision predictions for final design and optimization purposes.

Overall, the proposed approach transforms a large set of optimization-based design results into direct analytical expressions, achieving high predictive performance (R² > 0.99) while significantly reducing computational effort and design time compared to traditional code-based iterative procedures. This contributes to the development of fast, reliable, and interpretable design tools for structural engineers.

Future studies will focus on extending the proposed framework to shear strengthening applications, investigating different composite materials such as GFRP and BFRP, and analyzing the behavior of masonry walls under out-of-plane loading conditions. Furthermore, validation using independent experimental and literature-based datasets is recommended to further enhance the robustness and general applicability of the proposed models.

Author Contributions

Conceptualization, G.B., A.K. and S.M.N.; methodology, G.B., A.K. and S.M.N.; software, G.B. and A.K.; validation, A.K. and S.M.N.; formal analysis, G.B., A.K. and S.M.N.; investigation, A.K.; resources, A.K.; data curation, A.K.; writing—original draft preparation, A.K.; writing—review and editing, G.B., S.M.N. and Ü.I.; visualization, A.K.; supervision, G.B., Ü.I. and S.M.N.; project administration, G.B. and Ü.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Babatunde, S.A. Review of strengthening techniques for masonry using fiber reinforced polymers. Compos. Struct. 2017, 161, 246–255.
Celik, T. Strengthening strategies for unreinforced stone masonry walls using FRP and CFM composites. Sci. Rep. 2025, 15, 28167.
Ferretti, E.; Pascale, G. Some of the Latest Active Strengthening Techniques for Masonry Buildings: A Critical Analysis. Materials 2019, 12, 1151.
Mercimek, Ö.; Yılmaz, M.C.; Yasin, M.; Akkaya, S.T.; Çelik, A.; Bıçakçıoğlu, K.; Erbaş, Y.; Anıl, Ö. An experimental study about unreinforced masonry panels strengthening with CFRP: Out-of-plane failure. Eng. Fail. Anal. 2025, 181, 109932.
Rajak, D.K.; Wagh, P.H.; Linul, E. Manufacturing Technologies of Carbon/Glass Fiber-Reinforced Polymer Composites and Their Properties: A Review. Polymers 2021, 13, 3721.
Duan, S.-J.; Feng, R.-M.; Yuan, X.-Y.; Song, L.-T.; Tong, G.-S.; Tong, J.-Z. A Review on Research Advances and Applications of Basalt Fiber-Reinforced Polymer in the Construction Industry. Buildings 2025, 15, 181.
Torunbalcı, N.; Onar, E.; Günay, H. Structural evaluation of masonry walls with double-sided CFRP reinforcement through diagonal compression tests. J. Build. Eng. 2025, 111, 113142.
Monti, G.; Napoli, A.; Realfonzo, R. Reliable design and optimization of SRP-, CFRP-, and GFRP-confined concrete: Experimental validation and cost-effectiveness analysis. Mater. Struct. 2025, 58, 350.
ACI Committee 440. Guide for the Design and Construction of Structural Concrete Reinforced with FRP Bars (ACI 440.1R-06); American Concrete Institute: Farmington Hills, MI, USA, 2006; Available online: https://www.concrete.org/publications/internationalconcreteabstractsportal/m/details/id/15613 (accessed on 16 February 2026).
Bekdaş, G.; Khalbous, A.; Nigdeli, S.M.; Işıkdağ, Ü. Optimum Carbon Fiber Reinforced Polymer (CFRP) Design for Flexural Strengthening of Cantilever Concrete Walls Using Artificial Neural Networks. Polymers 2025, 17, 3300.
Rahman, M.M.; Jumaat, M.Z.; Hosen, M.A. Genetic Algorithm for Material Cost Minimization of External Strengthening System with Fiber Reinforced Polymer. Adv. Mater. Res. 2012, 468–471, 1817–1822.
Sayın, B.; Kayabekir, A.E.; Nigdeli, S.M.; Bekdaş, G. Jaya algorithm based optimum carbon fiber reinforced polymer design for reinforced concrete beams. In Proceedings of the 15th International Conference of Numerical Analysis and Applied Mathematics (ICNAAM 2017), Thessaloniki, Greece, 25–30 September 2017; pp. 1–4. Available online: https://www.researchgate.net/publication/326358881_Jaya_algorithm_based_optimum_carbon_fiber_reinforced_polymer_design_for_reinforced_concrete_beams/citations (accessed on 15 February 2026).
Yücel, M. Comparison of Flower Pollination Algorithm and Particle Swarm Optimization for Structural Weight Minimization of RC Beams with Carbon Fiber Reinforced Polymer (CFRP). Afyon Kocatepe Üniv. Fen ve Mühendis. Bilim. Derg. 2025, 25, 381–387.
Kayabekir, A.E.; Yücel, M.; Bekdaş, G.; Nigdeli, S.M. An Artificial Neural Network Model for Prediction of Optimum amount of Carbon Fiber Reinforced Polymer for Shear Capacity Improvement of Beams. In Proceedings of the 12th International Congress on Mechanics (HSTAM2019), Thessaloniki, Greece, 22–25 September 2019; pp. 1–8. Available online: https://www.researchgate.net/publication/336047843_AN_ARTIFICIAL_NEURAL_NETWORK_MODEL_FOR_PREDICTION_OF_OPTIMUM_AMOUNT_OF_CARBON_FIBER_REINFORCED_POLYMER_FOR_SHEAR_CAPACITY_IMPROVEMENT_OF_BEAMS (accessed on 15 February 2026).
Zhang, S.-Y.; Chen, S.-Z.; Jiang, X.; Han, W.-S. Data-Driven Prediction of FRP Strengthened Reinforced Concrete Beam Capacity Based on Interpretable Ensemble Learning Algorithms. Structures 2022, 43, 860–877.
Tomar, V.; Bansal, M.; Singh, P. Metaheuristic Algorithms for Optimization: A Brief Review. Eng. Proc. 2023, 59, 238.
Rudin, C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat. Mach. Intell. 2019, 1, 206–215.
Adadi, A.; Berrada, M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
Aldeia, G.S.I.; Zhang, H.; Bomarito, G.; Cranmer, M.; Fonseca, A.; Burlacu, B.; La Cava, W.G.; de França, F.O. Call for Action: Towards the next generation of symbolic regression benchmark. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO’25 Companion), New York, NY, USA, 14–18 July 2025; pp. 2529–2538.
Cranmer, M. Interpretable Machine Learning for Science with PySR and Symbolic Regression.jl. arXiv 2023, arXiv:2305.01582.
Sorour, S.S.; Saleh, C.A.; Shazly, M. Integrating machine learning and symbolic regression for predicting damage initiation in hybrid FRP bolted connections. Sci. Rep. 2025, 15, 18564.
Megahed, K. Prediction and reliability analysis of shear strength of RC deep beams. Sci. Rep. 2024, 14, 14590.
Megahed, K. Symbolic regression for strength prediction of eccentrically loaded concrete-filled steel tubular columns. Sci. Rep. 2025, 15, 3085.
Pham, K.; Nguyen, K.; Lim, K.; Kim, Y.; Choi, H. A generalized formula for predicting soil compression index using multi-evolutionary algorithm. Eng. Geol. 2024, 343, 107789.
Almasoudi, R.; Baghbani, A.; Abuel-Naga, H. Interpretable AI-Driven Modelling of Soil–Structure Interface Shear Strength Using Genetic Programming with SHAP and Fourier Feature Augmentation. Geotechnics 2025, 5, 69.
Díaz, E.; Tomás, R. Predicting Clay Swelling Pressure: A Comparative Analysis of Advanced Symbolic Regression Techniques. Appl. Sci. 2025, 15, 5603.
Ghosh, R.; Debbarma, R. Interpretable ML Model for Predicting Magnification Factors in Open Ground-Storey Columns to Prevent Soft-Storey Collapse. Buildings 2025, 15, 3383.
Rao, R.V. Jaya: A Simple and New Optimization Algorithm for Solving Constrained and Unconstrained Optimization Problems. Int. J. Ind. Eng. Comput. 2016, 7, 19–34.
Bekdaş, G.; Nigdeli, S.M.; Yücel, M.; Kayabekir, A.E. Artificial Intelligence Optimization Algorithms and Engineering Applications; Seçkin Publishing: Istanbul, Türkiye, 2021; pp. 67–70. Available online: https://www.seckin.com.tr/kitap/yapay-zeka-optimizasyonalgoritmalari-ve-muhendislik-uygulamalari-kavram-uygulama-kodla-ma-gebrail-bekdas-sinan-melih-nigdeli-melda-yucelaylin-ece-kayabekir-s-p-822965274?srsltid=AfmBOooS7GzBr9KKHMmwUDTW5XrIesyfGTNERUW8ZBUR_I6DTND68Aho (accessed on 10 February 2026).
Nigdeli, S.M.; Bekdaş, G. A New Modified Jaya Algorithm for Optimum Design of Tuned Mass Dampers. In Proceedings of the 14th International Conference on Evolutionary and Deterministic Methods for Design, Optimization and Control (EUROGEN-2021), Athens, Greece, 28–30 June 2021; pp. 1–10.
Duysak, Y.; Nigdeli, S.M.; Bekdaş, G. Optimum Design of Reinforced Concrete Beam Sections with JAYA Algorithm. Chall. J. Concr. Res. Lett. 2024, 2, 4.
ACI Committee 440. Guide for the Design and Construction of Externally Bonded FRP Systems for Strengthening Masonry Structures (ACI 440.7R-10); American Concrete Institute: Farmington Hills, MI, USA, 2010; Available online: https://www.scribd.com/document/688499816/Kupdf-net-Aci-4407r-10-Guide-for-the-Design-and-Construction-of-Externally-Bonded-Fiber-Reinforced-Polymer-Systems-for-Strengthening-Unreinforced-Maso (accessed on 1 February 2026).
ACI Committee. Building Code Requirements and Specification for Masonry Structures (ACI 530-13) and Companion Commentaries; American Concrete Institute: Farmington Hills, MI, USA, 2013; Available online: https://www.researchgate.net/publication/283033528_Building_Code_Requirements_and_Specification_for_Masonry_Structures (accessed on 1 February 2026).
La Cava, W.; Burlacu, B.; Virgolin, M.; Kommenda, M.; Orzechowski, P.; de França, F.O.; Jin, Y.; Moore, J.H. Contemporary Symbolic Regression Methods and their Relative Performance. Adv. Neural Inf. Process. Syst. 2021, 2021, 1–16.
Udrescu, S.-M.; Tegmark, M. AI Feynman: A physics-inspired method for symbolic regression. Sci. Adv. 2020, 6, eaay2631.
Radwan, Y.A.; Kronberger, G.; Winkler, S.M. A Comparison of Recent Algorithms for Symbolic Regression to Genetic Programming. In Proceedings of the International Conference/Workshop on Computer Aided Systems Theory; Springer Nature: Cham, Switzerland, 2024.
Märtens, M.; Kuipers, F.; Van Mieghem, P. Symbolic Regression on Network Properties. In Genetic Programming. EuroGP 2017. Lecture Notes in Computer Science; McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P., Eds.; Springer: Cham, Switzerland, 2017; Volume 10196.
Tonda, A. Review of PySR: High-performance symbolic regression in Python and Julia. Genet. Program. Evolvable Mach. 2025, 26, 7.
de França, F.O.; Virgolin, M.; Kommenda, M.; Majumder, M.; Cranmer, M.; Espada, G.; Ingelse, L.; Fonseca, A.; Landajuela, M.; Petersen, B.K.; et al. Interpretable Symbolic Regression for Data Science: Analysis of the 2022 Competition. arXiv 2023, arXiv:2304.01117.
Makke, N.; Chawla, S. Interpretable scientific discovery with symbolic regression: A review. Artif. Intell. Rev. 2024, 57, 2.
Keren, L.S.; Liberzon, A.; Lazebnik, T. A computational framework for physics-informed symbolic regression with straightforward integration of domain knowledge. Sci. Rep. 2023, 13, 1249.

Figure 1. Flowchart of the optimization phase of Jaya.

Figure 2. Masonry Wall Cross-Section, Height, and CFRP Plate Placement.

Figure 3. Global Sensitivity Plot.

Figure 4. OAT Sensitivity Plot.

Figure 5. Sobol Global Sensitivity Indices.

Table 1. Input parameters and ranges for the optimization process.

Parameter	Symbol	Range
Wall thickness	t	150–350 mm
Masonry compressive strength	F_m	5–25 MPa
Axial load	P_u	5000–10,000 N/m
Ultimate moment demand	M_u	2,000,000–10,000,000 N·mm/m
CFRP rupture strength	F_fu′	2500–4500 MPa
CFRP modulus of elasticity	E_f	140,000–170,000 MPa

Table 2. A representative sample of the optimum CFRP design dataset generated by the Jaya algorithm.

Case	t (mm)	F_m (MPa)	P_u (N/m)	M_u (N·mm/m)	F_fu′ (MPa)	E_f (MPa)	A_f (mm²/m)
1	151	10	7500	4,000,000	3100	160,000	29.78
2	300	5.1	7500	4,000,000	3100	160,000	13.47
3	200	12	7500	4,000,000	3100	160,000	21.51
4	250	12	9500	4,000,000	3100	160,000	15.88
5	250	12	9000	2,200,000	3100	160,000	7.54
6	250	12	9000	5,000,000	2500	160,000	25.94
7	250	12	9000	5,000,000	3500	143,150	18.53

Table 3. Input parameters and optimum results for the independent test cases used to validate the PySR models (unseen data).

Case	t (mm)	F_m (MPa)	P_u (N/m)	M_u (N·mm/m)	F_fu′ (MPa)	E_f (MPa)	A_f (mm²/m)
1	295	11	5000	5,440,000	3300	155,000	19.18
2	345	10	6500	7,150,000	3500	160,000	20.06
3	195	12	8000	4,500,000	2800	165,000	27.80
4	255	9	7500	5,900,000	3100	146,000	25.37
5	320	18	5500	9,640,000	2600	142,000	40.87
6	180	15	7700	3,000,000	4000	150,000	13.44
7	240	7	7900	3,900,000	4250	152,000	12.32
8	200	20	5750	7,550,000	4500	165,000	30.04
9	275	10	6000	8,100,000	3250	157,000	32.01
10	300	16	6800	4,850,000	2750	148,000	19.14

Table 4. Pareto Front Equations for Model 1 (Balanced Accuracy and Complexity).

Eq.	Complexity	Formula	Train R²	Train RMSE	Test R²	Test RMSE
0	1	$A_{f} = 18.4$	−0.000000	6.204503	−0.001401	6.778847
1	3	$A_{f} = \log ({P_{u}}^{2})$	0.015983	6.154719	0.011842	6.733874
2	4	$A_{f} = 1.20 \times \log (M_{u})$	0.065568	5.997646	0.071399	6.527794
3	5	$A_{f} = \frac{0.639}{E_{f} \times \frac{1}{M_{u}}}$	0.704872	3.370639	0.804105	2.998223
4	6	$A_{f} = 0.0225 \times \sqrt{M_{u}} - 29.7$	0.711341	3.333496	0.815639	2.908618
5	7	$A_{f} = 0.00102 \times \sqrt{\frac{{M_{u}}^{2}}{t^{2}}}$	0.904825	1.914119	0.927989	1.817825
6	8	$A_{f} = \frac{0.00561 \times M_{u}}{t \times \log (t)}$	0.910819	1.852858	0.930564	1.785022
7	10	$A_{f} = \frac{0.00609 \times M_{u}}{t \times \log (t)} - 1.73$	0.917050	1.786961	0.939581	1.665095
8	14	$A_{f} = 4.54 \times {(1 + \frac{0.175 \times M_{u}}{{F_{f u}}^{'} \times t})}^{2}$	0.986341	0.725122	0.988502	0.726389
9	16	$A_{f} = 5.32 \times {(1 + \frac{0.157 \times M_{u}}{{F_{f u}}^{'} \times t})}^{2} - 1.0$	0.986864	0.711125	0.989262	0.701972
10	18	$A_{f} = 11.1 \times {(1 + \frac{0.0968 \times M_{u}}{{F_{f u}}^{'} \times t})}^{2} - 8.46$	0.991305	0.578552	0.992677	0.579700

(Note: The 10th equation was selected as the final Model 1).

Table 5. Pareto Front Equations for Model 2 (Accuracy-Focused).

Eq.	Complexity	Formula	Train R²	Train RMSE	Test R²	Test RMSE
0	1	$A_{f} = 18.4$	−0.000000	6.204503	−0.001401	6.778847
1	3	$A_{f} = M_{u} \times 4.03 \times 10^{- 6}$	0.722402	3.269002	0.814054	2.921090
2	5	$A_{f} = M_{u} \times 0.00102 \times \frac{1}{t}$	0.904825	1.914119	0.927992	1.817790
3	7	$A_{f} = M_{u} \times 3.28 \times \frac{1}{{F_{f u}}^{'} \times t}$	0.982912	0.811047	0.983861	0.860590
4	9	$A_{f} = M_{u} \times 3.70 \times \frac{1}{{F_{f u}}^{'} \times t} - 2.59$	0.996895	0.345750	0.997515	0.337678
5	11	$A_{f} = M_{u} \times 3.54 \times \frac{1}{{F_{f u}}^{'} \times (t - 8.86)} - 2.43$	0.997233	0.326382	0.997624	0.330200
6	13	$A_{f} = M_{u} \times 3.55 \times \frac{1}{({F_{f u}}^{'} + 3.55) (t - 8.73)} - 2.47$	0.997238	0.326089	0.997642	0.328935
7	15	$A_{f} = - 2.44 + \frac{3.55 \times (M_{u} + 0.839)}{({F_{f u}}^{'} + F_{m}) \times (t - 9.43)}$	0.997272	0.324055	0.997658	0.327838
8	17	$A_{f} = M_{u} \times (1.16 \times 10^{- 7} + \frac{3.50}{{F_{f u}}^{'} \times (t + \log (\sqrt{E_{f}}) - 13.7)}) - 2.60$	0.997370	0.318218	0.997788	0.318606

(Note: Eq. 8, the highest-complexity entry in the table, was selected as the final Model 2.)
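The trade-off that Table 5 describes can be checked numerically. The sketch below takes the (complexity, training RMSE) pairs from the table, confirms that no listed equation is dominated by another, and reports how much RMSE is gained per added unit of complexity. The helper names `is_pareto_front` and `marginal_gains` are hypothetical, and this is an illustration of the Pareto idea rather than the selection rule used internally by PySR.

```python
# (complexity, training RMSE) pairs copied from Table 5.
front = [
    (1, 6.204503), (3, 3.269002), (5, 1.914119), (7, 0.811047),
    (9, 0.345750), (11, 0.326382), (13, 0.326089), (15, 0.324055),
    (17, 0.318218),
]


def is_pareto_front(points):
    """Return True if no point is dominated when both objectives are minimized."""
    for i, (c_i, e_i) in enumerate(points):
        for j, (c_j, e_j) in enumerate(points):
            dominates = c_j <= c_i and e_j <= e_i and (c_j < c_i or e_j < e_i)
            if i != j and dominates:
                return False
    return True


def marginal_gains(points):
    """RMSE reduction per added unit of complexity between consecutive equations."""
    ordered = sorted(points)
    return [
        (c1, (e0 - e1) / (c1 - c0))
        for (c0, e0), (c1, e1) in zip(ordered, ordered[1:])
    ]


print(is_pareto_front(front))
for complexity, gain in marginal_gains(front):
    print(f"complexity {complexity:2d}: RMSE drop per unit = {gain:.4f}")
```

The gains shrink sharply beyond complexity 9, which is the diminishing-return pattern visible in the RMSE column of the table.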

Table 6. Statistical performance metrics of the proposed PySR models on the training and testing datasets.

Model	Dataset	R²	RMSE	MAE	MAPE (%)
Model 1 (Balanced)	Training	0.9913	0.5786	0.4515	2.76
Model 1 (Balanced)	Testing	0.9927	0.5797	0.4560	2.85
Model 2 (Accuracy)	Training	0.9974	0.3182	0.2265	1.30
Model 2 (Accuracy)	Testing	0.9978	0.3186	0.2248	1.32

Table 7. Predictions on independent test cases using the balanced explicit equation (Model 1).

Case	Optimum Result (A_f in mm²/m)	Model 1 Prediction (A_f in mm²/m)	Percentage Error (%)
1	19.18	17.90	6.67
2	20.06	19.01	5.23
3	27.80	27.42	1.37
4	25.37	24.47	3.55
5	40.87	41.50	1.54
6	13.44	13.40	0.30
7	12.32	12.38	0.49
8	30.04	27.99	6.82
9	32.01	30.66	4.22
10	19.14	18.69	2.36
Average			3.26

Table 8. Predictions on independent test cases using the accuracy-focused explicit equation (Model 2).

Case	Optimum Result (A_f in mm²/m)	Model 2 Prediction (A_f in mm²/m)	Percentage Error (%)
1	19.18	18.41	4.01
2	20.06	19.64	2.09
3	27.80	28.51	2.55
4	25.37	25.40	0.11
5	40.87	40.53	0.83
6	13.44	13.29	1.12
7	12.32	12.76	3.57
8	30.04	31.19	3.83
9	32.01	31.39	1.94
10	19.14	19.33	0.99
Average			2.10
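The percentage errors in Tables 7 and 8 can be recomputed from the tabulated optimum and predicted areas. The sketch below assumes the error is defined as the absolute difference divided by the optimum value, which is consistent with the listed figures; the helper names are hypothetical. Because the inputs are rounded to two decimals, the recomputed averages can differ from the reported ones in the last digit.

```python
# Optimum and predicted CFRP areas (mm^2/m) copied from Tables 7 and 8.
optimum = [19.18, 20.06, 27.80, 25.37, 40.87, 13.44, 12.32, 30.04, 32.01, 19.14]
model1 = [17.90, 19.01, 27.42, 24.47, 41.50, 13.40, 12.38, 27.99, 30.66, 18.69]
model2 = [18.41, 19.64, 28.51, 25.40, 40.53, 13.29, 12.76, 31.19, 31.39, 19.33]


def percentage_errors(reference, predicted):
    """Absolute error relative to the reference value, in percent."""
    return [abs(p - r) / r * 100.0 for r, p in zip(reference, predicted)]


def mean(values):
    return sum(values) / len(values)


for name, predicted in (("Model 1", model1), ("Model 2", model2)):
    errors = percentage_errors(optimum, predicted)
    print(f"{name}: mean error = {mean(errors):.2f}%, max error = {max(errors):.2f}%")
```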

Table 9. Comparison between the current study and existing literature.

Reference	Target Structure	Approach/Method	AI/ML Integration	Model Transparency
Rahman et al. [11]	RC Beams	Optimization (GA)	-	N/A
Yücel [13]	RC Beams	Optimization (FPA, PSO)	-	N/A
Kayabekir et al. [14]	RC Beams	Optimization + Data-driven	ANN	Black-box
Zhang et al. [15]	RC Beams	Data-driven	Ensemble Learning	Black-box
Bekdaş et al. [10]	Cantilever Walls	Optimization (Jaya)	ANN	Black-box
Megahed [22]	RC Deep Beams	Data-driven	Symbolic Regression	White-box
Current Study	URM Walls	Optimization (Jaya)	Symbolic Regression (PySR)	White-box (Explicit)
