# Tree Based Approaches for Predicting Concrete Carbonation Coefficient

## Abstract

## 1. Introduction

… (CO_{2}) and cement hydration products such as Ca(OH)_{2} and calcium–silicate–hydrate (C-S-H). The process of concrete carbonation begins at exposed surfaces immediately upon exposure to CO_{2}, and carbonation rates increase for poor-quality, porous concretes and grouts. The consumption of Ca(OH)_{2} reduces the concrete pH to levels at which the thin oxide film around the steel surface, which protects it from corrosion, becomes unstable, allowing the onset of steel corrosion [2,3]. Conventionally, the concrete carbonation depth at a given time under steady-state conditions can be predicted fairly well by Fick’s second law of diffusion, in which the significant parameters are claimed to be the water/cement (w/c) ratio, the binder constituents and content, and the exposure conditions (e.g., relative humidity) [1,2,3,4,5]. These factors can cause the concrete carbonation coefficient to vary a great deal, as they influence the pore system of hardened concrete (affected by the w/c ratio), the amount of Ca(OH)_{2} available to react with CO_{2} (affected by binder type and content), the dissolution of Ca(OH)_{2}, which is required for it to react with CO_{2} (affected by relative humidity), and the concentration of CO_{2} at the concrete surface [5,6]. Developing a holistic and accurate carbonation prediction model is a challenging task, as it is difficult to describe mathematically the several phenomena that occur and, most of all, their interactions. Various theoretical and experimental studies have been carried out to estimate the concrete carbonation depth [7,8,9], and most of them focus on estimating the carbonation rate through a carbonation coefficient based on Fick’s second law of diffusion [5]. Even so, it is still difficult to build a prediction model for carbonation depth that can describe all conditions of concrete carbonation.
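
As a minimal illustration of what the carbonation coefficient represents, the sketch below assumes the widely used square-root-of-time simplification of Fick's law, x = k·√t, which is consistent with the units of k used in this study (mm/year^0.5). The function names and the example values are hypothetical and are not taken from the study.

```python
import math

def carbonation_depth(k, years):
    """Carbonation depth in mm, assuming x = k * sqrt(t) with k in mm/year^0.5."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return k * math.sqrt(years)

def years_to_reach(cover_mm, k):
    """Time in years for the carbonation front to reach a given cover depth."""
    if k <= 0:
        raise ValueError("k must be positive")
    return (cover_mm / k) ** 2

# Hypothetical example: k = 4 mm/year^0.5 and a 30 mm concrete cover.
depth_25 = carbonation_depth(4.0, 25.0)   # 4 * sqrt(25) = 20 mm
t_cover = years_to_reach(30.0, 4.0)       # (30 / 4)^2 = 56.25 years
```

Under this simplification, a model that predicts k is enough to estimate both the depth at a given age and the time needed for carbonation to reach the reinforcement.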

… CO_{2} were estimated through a neural network algorithm, confirming the decrease in the diffusion coefficient with the increase of relative humidity (RH) and with the decrease of the w/c ratio [10]. Neural networks and genetic programming were used to predict the carbonation coefficient and strength of concrete with a correlation coefficient of 0.90 and higher [11]. ANNs have been used to analyse the corrosion of steel in concrete and to quantify chloride diffusion in concrete, and the predictions given by the NNs were found to be adequate [12,13]. Kewalramani and Gupta [14] conducted a study on the prediction of the compressive strength of concrete, using multiple regression analysis and ANNs. Deep neural network architectures can learn hierarchical features from the dataset while providing a more efficient representation than more classic models, improving the generalisation capability of the models [15]. A study by Brusaferri et al. [16] presented a novel probabilistic method for forecasting day-ahead electricity prices based on Bayesian deep learning, deploying a Bayesian inference framework that introduces probability distributions over the neural network weights and a Gaussian likelihood function. Bayesian linear regression is seen as a suitable tool to revise and update design codes; e.g., the creep correction coefficients are estimated using Bayesian linear regression [17]. Tesfamariam and Martín-Pérez [18] proposed the implementation of a Bayesian Belief Network model for carbonation-induced corrosion of reinforced concrete, which highlighted the impact of various exposure conditions on the rate of carbonation ingress and the potential for reinforcing steel.

## 2. Tree Based Modelling Techniques

#### 2.1. Model Tree (MT)

#### 2.2. Random Forest (RF)

… Θ_{i}), i = 1, …}, where {Θ_{i}} are independent and identically distributed random vectors and each tree casts a unit vote for the most popular class at input x. The splitting continues until the error goal is met. Bootstrap re-sampling is applied to the initial data to create a set of training samples. For each training sample, a random subset of the attributes is selected (the random subspace technique) to build the decision tree. The final result is obtained by voting (classification) or averaging (regression). RF also makes it feasible to analyse the importance of each variable in modelling the behaviour of the response variable, using variable importance metrics in two stages [27,28]. Figure 2 shows a Random Forest regression for a typical model.
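
The three ingredients described above (bootstrap re-sampling, a random subset of attributes per tree, and averaging of the tree outputs) can be sketched in a few lines. This is a deliberately simplified illustration and not the implementation used in the study: each "tree" is a one-split regression stump, the data are synthetic, and all function names are hypothetical.

```python
import random
import statistics

def fit_stump(rows, targets, features):
    """Fit a one-split regression tree using only the given feature indices."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            ml, mr = statistics.fmean(left), statistics.fmean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, ml, mr)
    if best is None:  # every candidate feature is constant: predict the mean
        m = statistics.fmean(targets)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, thr, ml, mr = stump
    return ml if row[f] <= thr else mr

def fit_forest(rows, targets, n_trees=50, n_sub=2, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        feats = rng.sample(range(n_features), n_sub)    # random subspace
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    # Regression: average the predictions of all trees.
    return statistics.fmean([predict_stump(s, row) for s in forest])

# Synthetic data (not from the study): columns are w/b, CO2 content and RH.
rows = [(wb, co, rh) for wb in (0.3, 0.4, 0.5, 0.6, 0.7)
        for co in (1.0, 5.0, 10.0, 20.0) for rh in (55.0, 65.0, 75.0)]
targets = [20 * wb + 0.5 * co for wb, co, rh in rows]

forest = fit_forest(rows, targets)
k_high = predict_forest(forest, (0.7, 20.0, 65.0))
k_low = predict_forest(forest, (0.3, 1.0, 65.0))
```

A production model would grow full trees rather than stumps; the sketch only shows where the randomness enters and how the individual predictions are combined.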

… R^{2} coefficient (explanatory variable) by evaluating the validation set or the out-of-bag (OOB) samples through the RF. The importance of a variable is the difference between the baseline value and the overall accuracy or R^{2} obtained after permuting that variable's column, and the results are more reliable [27,28]. The larger the resulting out-of-bag decrease, the more important the input [29]. The relative position of a variable used as a decision node in a tree establishes the comparative significance of that variable for the prediction of the response variable.
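
The permutation idea described above can be sketched as follows: compute a baseline R^{2} on held-out (or out-of-bag) rows, shuffle one input column, and record how much R^{2} drops. The code is a generic illustration that works with any prediction function; the stand-in model, the synthetic data and the function names are assumptions made for the example.

```python
import random

def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, rows, targets, col, n_repeats=10, seed=0):
    """Mean drop in R^2 when column `col` is randomly shuffled."""
    rng = random.Random(seed)
    baseline = r2_score(targets, [predict(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(baseline - r2_score(targets, [predict(r) for r in permuted]))
    return sum(drops) / n_repeats

# Synthetic held-out data (not from the study): columns are w/b, CO2 and RH.
rows = [(wb, co, rh) for wb in (0.3, 0.4, 0.5, 0.6, 0.7)
        for co in (1.0, 5.0, 10.0, 20.0) for rh in (55.0, 65.0, 75.0)]
targets = [20 * wb + 0.5 * co for wb, co, rh in rows]

# Stand-in for a fitted model; it ignores RH by construction.
model = lambda r: 20 * r[0] + 0.5 * r[1]

imp_wb = permutation_importance(model, rows, targets, col=0)
imp_rh = permutation_importance(model, rows, targets, col=2)
```

Because the stand-in model does not use RH, shuffling that column leaves R^{2} unchanged and its importance is zero, whereas shuffling w/b produces a clear drop.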

#### 2.3. Multi-Gene Genetic Programming (MGGP)

… x_{1}, x_{2} and x_{3}. This model structure contains non-linear terms (e.g., the hyperbolic tangent) but is linear in the parameters with respect to the coefficients a_{0}, a_{1} and a_{2}. In practice, the user specifies the maximum number of genes, G_{max}, that a model is allowed to have and the maximum tree depth, D_{max}, that any gene may have, and can therefore exert control over the maximum complexity of the evolved models. In particular, it was found that enforcing stringent tree depth restrictions (i.e., maximum depths of four or five nodes) often allows the evolution of relatively compact models that are linear combinations of low-order non-linear transformations of the input variables.
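
Because a multi-gene model is linear in its coefficients, the gene weights can be obtained by ordinary least squares once the gene expressions are fixed. The sketch below illustrates this with two made-up genes (one of them using the hyperbolic tangent); the genes, the data and the helper names are hypothetical, and the evolutionary search that produces the genes is not shown.

```python
import math

def gene1(x1, x2, x3):          # hypothetical gene
    return math.tanh(x1 + x2)

def gene2(x1, x2, x3):          # hypothetical gene
    return x2 * x3

def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    a = [0.0] * n
    for r in reversed(range(n)):
        a[r] = (M[r][n] - sum(M[r][k] * a[k] for k in range(r + 1, n))) / M[r][r]
    return a

def fit_gene_weights(X, y, genes):
    """Least-squares estimate of [a0, a1, ...] for y = a0 + a1*g1(x) + ..."""
    # Design matrix: a column of ones (bias a0) followed by one column per gene.
    G = [[1.0] + [g(*x) for g in genes] for x in X]
    m = len(G[0])
    GtG = [[sum(row[i] * row[j] for row in G) for j in range(m)] for i in range(m)]
    Gty = [sum(row[i] * t for row, t in zip(G, y)) for i in range(m)]
    return solve(GtG, Gty)

# Synthetic data generated from known weights a0 = 2, a1 = 3, a2 = -0.5.
X = [(i * 0.3, j * 0.5, float(k)) for i in range(4) for j in range(3) for k in (1, 2, 3)]
y = [2.0 + 3.0 * gene1(*x) - 0.5 * gene2(*x) for x in X]

a0, a1, a2 = fit_gene_weights(X, y, [gene1, gene2])
```

On this noise-free example the fit recovers the generating weights, which is the sense in which the model is "linear in the parameters" even though each gene is non-linear in the inputs.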

## 3. Materials and Methods

#### 3.1. Data Used in the Study

… CO_{2} enriched environment). To ensure that complete data were available for all variables, missing values in any variable were replaced by the mean value of that variable [23]. The data were then checked, and outliers (points more than three standard deviations away from the mean value of the respective variable) were removed. Further details of the data can be found in [23], while a summary using variable abbreviations is shown in Table 1. These input parameters are the same as those considered in [23], which were selected according to the analysis of variance technique.
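
The two preprocessing steps described above, mean imputation followed by removal of points more than three standard deviations from the mean, can be sketched for a single variable as follows. The use of the population standard deviation, the sample values and the function names are assumptions made for illustration.

```python
import statistics

def impute_mean(values):
    """Replace missing entries (None) by the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]

def three_sigma_mask(values):
    """True for points within three standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [abs(v - mean) <= 3 * sd for v in values]

# Hypothetical w/b column with one missing value and one implausible entry.
wb = [0.40, 0.45, 0.50, 0.55, 0.60] * 4 + [None, 5.0]

wb_filled = impute_mean(wb)
mask = three_sigma_mask(wb_filled)
wb_clean = [v for v, keep in zip(wb_filled, mask) if keep]
```

In a multi-variable dataset the mask would be computed per variable and a record dropped if any of its variables is flagged.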

#### 3.2. Methodology Adopted

## 4. Results and Discussion

… CO_{2} content, i.e., to check whether CO is ≤3 or >3. If CO is >3, the next branch is selected, and it is again determined whether the CO value is ≤15 or >15. If it is ≤15, the LM 4 equation can be used for the prediction of k. Each variable at each node is analysed by estimating the expected decrease in error, and the variable selected for splitting is the one that maximises the expected error reduction at that node. The decision criteria rely on CO_{2} levels, w/b ratios and compressive strength, reflecting their higher significance. Furthermore, despite the empirical/statistical nature of the model, it is interesting to notice a positive association of k with the w/b ratio and with the CO_{2} content, while the association between k and RH varies, which is in line with the fundamental knowledge on concrete carbonation. In general, the negative influence of X and fc is observed through their negative coefficients.
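
The split selection described above can be illustrated with the standard deviation reduction (SDR) criterion of the M5 model tree algorithm [24], in which the candidate split that most reduces the spread of the target values is chosen. The sketch below is a simplified, assumed version of that step: the data are synthetic, the thresholds it finds are unrelated to those of the fitted tree, and the function names are hypothetical.

```python
import statistics

def sdr(targets, left, right):
    """Standard deviation reduction obtained by splitting targets into left/right."""
    n = len(targets)
    return (statistics.pstdev(targets)
            - len(left) / n * statistics.pstdev(left)
            - len(right) / n * statistics.pstdev(right))

def best_split(rows, targets, names):
    """Return (gain, variable name, threshold) of the split with the largest SDR."""
    best = (0.0, None, None)
    for j, name in enumerate(names):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2          # candidate threshold between two values
            left = [t for r, t in zip(rows, targets) if r[j] <= thr]
            right = [t for r, t in zip(rows, targets) if r[j] > thr]
            gain = sdr(targets, left, right)
            if gain > best[0]:
                best = (gain, name, thr)
    return best

# Synthetic records (not from the study): (CO2 content, w/b) and a k value.
rows = [(0.04, 0.4), (0.04, 0.6), (5.0, 0.4), (5.0, 0.6), (50.0, 0.4), (50.0, 0.6)]
k_values = [2.0, 3.0, 20.0, 22.0, 40.0, 43.0]

gain, variable, threshold = best_split(rows, k_values, ["CO", "w/b"])
```

In this toy dataset k changes far more with the CO_{2} content than with w/b, so the root split falls on CO, mirroring the role of CO_{2} as the first decision variable in the tree discussed above.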

… CO_{2} diffusivity, which, besides depending on the concrete’s porous structure and on its subsequent filling by other substances, particularly water, depends on the CO_{2} gradient from the concrete surface to the carbonation front [23].

## 5. Conclusions

## References

1. Ciampoli, M. Time dependent reliability of structural systems subject to deterioration. Comput. Struct. **1998**, 67, 29–35.
2. Ann, K.Y.; Pack, S.W.; Hwang, J.P.; Song, H.W.; Kim, S.H. Service life prediction of a concrete bridge structure subjected to carbonation. Constr. Build. Mater. **2010**, 24, 1494–1501.
3. Huang, Q.; Jiang, Z.; Zhang, W.; Gu, X.; Dou, X. Numerical analysis of the effect of coarse aggregate distribution on concrete carbonation. Constr. Build. Mater. **2012**, 37, 27–35.
4. Taffese, W.Z.; Al-Neshawy, F.; Sistonen, E.; Ferreira, M. Optimized neural network-based carbonation prediction model. In Proceedings of the International Symposium Non-Destructive Testing in Civil Engineering (NDT-CE) 2015, Berlin, Germany, 15–17 September 2015.
5. Neville, A.M. Properties of Concrete, 4th ed.; Wiley: New York, NY, USA, 1996.
6. Neves, R. The Air Permeability and Concrete Carbonation of Concrete in Structures. Ph.D. Thesis, Instituto Superior Técnico, Technical University of Lisbon, Lisbon, Portugal, 2012.
7. Chang, C.F.; Chen, J.W. The experimental investigation of concrete carbonation depth. Cem. Concr. Res. **2006**, 36, 1760–1767.
8. Monteiro, I.; Branco, F.A.; de Brito, J.; Neves, R. Statistical analysis of the carbonation coefficient in open air concrete structures. Constr. Build. Mater. **2012**, 29, 263–269.
9. Papadakis, V.G.; Vayenas, C.G.; Fardis, M.N. Fundamental modelling and experimental investigation of concrete carbonation. ACI Mater. J. **1991**, 88, 363–373.
10. Kwon, S.J.; Song, H.W. Analysis of carbonation behavior in concrete using neural network algorithm and carbonation modeling. Cem. Concr. Res. **2010**, 40, 119–127.
11. Londhe, S.N.; Kulkarni, P.S.; Dixit, P.R.; Silva, A.; Neves, R.; de Brito, J. Predicting carbonation coefficient using Artificial neural networks and genetic programming. J. Build. Eng. **2021**, 39, 1022–1058.
12. Parthiban, T.; Ravi, R.; Parthiban, G.T.; Srinivasan, S.; Ramakrishnan, K.R.; Raghavan, M. Neural network analysis for corrosion of steel in concrete. Corros. Sci. **2005**, 47, 625–1642.
13. Peng, J.; Li, Z.; Ma, B. Neural network analysis of chloride diffusion in concrete. J. Mater. Civ. Eng. **2002**, 14, 327–333.
14. Kewalramani, M.A.; Gupta, R. Concrete compressive strength prediction using ultrasonic pulse velocity through artificial neural networks. Autom. Constr. **2006**, 15, 374–379.
15. Bengio, Y. Learning deep architecture for AI. Found. Trends Mach. Learn. **2009**, 2, 1–127.
16. Brusaferri, A.; Matteucci, M.; Portolani, P.; Vitali, A. Bayesian deep learning based method for probabilistic forecast of day-ahead electricity prices. Appl. Energy **2019**, 250, 1158–1175.
17. Daou, H.; Raphael, W. A Bayesian regression framework for concrete creep prediction improvement: Application to Eurocode 2 model. Res. Eng. Struct. Mater. **2021**, 7, 393–411.
18. Tesfamariam, S.; Martín-Pérez, B. Bayesian Belief Network to Assess Carbonation-Induced Corrosion in Reinforced Concrete. J. Mater. Civ. Eng. **2008**, 20, 707–717.
19. Zewdu, W.T.; Sistonen, E.; Puttonen, J. Prediction of Concrete Carbonation Depth using Decision Trees. In Proceedings of the ESANN 2015 Proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 23–23 April 2015.
20. Murad, Y.Z.; Tarawneh, B.K.; Ashteyat, A.M. Prediction model for concrete carbonation depth using gene expression programming. Comput. Concr. **2020**, 26, 497–504.
21. Londhe, S.N.; Kulkarni, P.S.; Dixit, P.R. A comparative study of concrete strength prediction using artificial neural network, multigene programming and model tree. Chall. J. Struct. Mech. **2019**, 5, 1–42.
22. Liu, P.; Wu, X.; Cheng, H.; Zheng, T. Prediction of compressive strength of High-Performance Concrete by Random Forest algorithm. IOP Conf. Ser. Earth Environ. Sci. **2020**, 552, 012020.
23. Silva, A.; Neves, R.; de Brito, J. Statistical modeling of carbonation in reinforced concrete. Cem. Concr. Compos. **2014**, 50, 73–81.
24. Quinlan, J.R. Learning with Continuous Classes. In Proceedings AI'92; Adams, A., Sterling, L., Eds.; World Scientific: Singapore, 1992; pp. 343–348.
25. Witten, I.H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations; Morgan Kaufmann: Los Altos, CA, USA, 2000.
26. Granada, F.; Saroli, M.; de Marinis, G.; Gargano, R. Machine Learning Models for Spring Discharge Forecasting. Geofluids **2017**, 2018, 8328167.
27. Breiman, L. Random Forests. Mach. Learn. **2001**, 45, 5–32.
28. Tyralis, H.; Georgia, P.; Langousis, A. A brief review of random forests for water scientists and practitioners and their recent history in water resources. Water **2019**, 11, 910.
29. Hai-Bang, L.; Tran, V.Q. Estimation of compressive strength of concrete containing manufactured sand by Random Forest. Int. J. Sci. Technol. Res. **2020**, 9, 564–567.
30. Londhe, S.N.; Dixit, P.R. Genetic programming: A novel computing approach in modeling water flows. In Genetic Programming - New Approaches and Successful Applications; IntechOpen: London, UK, 2012; Chapter 9.
31. Searson, D.P.; Leahy, D.E.; Willis, M.J. GPTIPS: An Open-Source Genetic Programming Toolbox for Multigene Symbolic Regression. In Proceedings of the International Multi Conference of Engineers and Computer Scientists, Hong Kong, China, 17–19 March 2010.
32. Searson, D.P.; Willis, M.J.; Montague, G.A. Co-evolution of non-linear PLS model components. J. Chemom. **2007**, 2, 592–603.
33. Pandey, D.S.; Pan, I.; Das, S.; Leahy, J.J.; Kwapinski, W. Multi-gene genetic programming based predictive models for municipal solid waste gasification in a fluidized bed gasifier. Bioresour. Technol. **2015**, 179, 524–533.
34. Hii, C.; Searson, D.P.; Willis, M.J. Evolving toxicity models using multigene symbolic regression and multiple objectives. Int. J. Mach. Learn. Comput. **2011**, 1, 30–35.
35. Hair, J.F.; Black, W.C.; Babin, B.; Anderson, R.E.; Tatham, R.L. Multivariate Data Analysis, 6th ed.; Prentice-Hall Publishers: Englewood Cliffs, NJ, USA, 2007.
36. Helene, P.R.L. Contribution to the Study of Corrosion of Concrete Reinforcement. Ph.D. Thesis, Polytechnic School, University of São Paulo, São Paulo, Brazil, 1993. (In Portuguese).
37. Available online: https://waikato.github.io/weka-wiki/downloading_weka/ (accessed on 24 November 2021).
38. Jain, A.; KumarJha, S.; Misra, S. Modeling and analysis of concrete slump using Artificial Neural Networks. J. Mater. Civ. Eng. **2008**, 20, 628–633.
39. Legates, D.R.; McCabe, G.J., Jr. Evaluating the use of “goodness of fit” measures in hydrological and hydro climatic model validation. Water Resour. Res. **1991**, 35, 233–241.
40. Londhe, S.N. Soft computing approach for real-time estimation of missing wave heights. Ocean Eng. **2008**, 35, 1080–1089.
41. Gandomi, A.H.; Mohammadzadeh, D.; Juan Luis Pérez-Ordóñez, S.B.; Alavi, A.H. Linear genetic programming for shear strength prediction of reinforced concrete beams without stirrups. Appl. Soft Comput. **2014**, 19, 112–120.
42. Gandomi, A.H.; Atefi, E. Software review: The GPTIPS platform. Genet. Program. Evolvable Mach. **2020**, 21, 273–280.
43. Dang, S.; Peng, L.; Zhao, J.; Li, J.; Kong, Z. A Quantile Regression Random Forest-Based Short-Term Load Probabilistic Forecasting Method. Energies **2022**, 15, 663.
44. Meinshausen, N. Quantile Regression Forests. J. Mach. Learn. Res. **2006**, 7, 983–999.
45. Wang, H.; Zhang, Y.-M.; Mao, J.-X. Sparse Gaussian process regression for multi-step ahead forecasting of wind gusts combining numerical weather predictions and on-site measurements. J. Wind Eng. Ind. Aerodyn. **2022**, 220, 104873.

**Figure 1.** An example of a typical Model Tree (adapted from [26]).

**Figure 2.** Typical Random Forest tree (adapted from [26]).

**Figure 3.** Illustration of a Multi-Gene model (adapted from [26]).

Parameters | Min | Max | Mean | Mode |
---|---|---|---|---|
Clinker content, CC (kg/m^{3}) | 66.000 | 529.150 | 292.196 | 362.990 |
Clinker/binder ratio, CR (%) | 20.000 | 100.000 | 80.228 | 95 |
28-day compressive strength, fc (MPa) | 8.800 | 127.500 | 48.823 | 37.000 |
CO_{2} content, CO | 0.020 | 50.000 | 15.490 | 0.040 |
Number of curing days, d | 7 | 91 | - | 28 |
Water/binder ratio, w/b | 0.240 | 1.000 | 0.501 | 0.370 |
Relative humidity, RH (%) | 50 | 90 | - | 65 |
Exposure class, X | 1 | 3 | - | 1 |
Carbonation coefficient, k (mm/year^{0.5}) | 0.180 | 60.420 | 14.585 | 1.730 |

MGGP Parameters | Parameter Settings |
---|---|
Population size | 500–900 |
Number of generations | 200–500 |
Selection method | Tournament |
Tournament size | 13–15 |
Cross-over rate | 0.78–0.84 |
Mutation rate | 0.14–0.20 |
Termination criteria | 500 generations or a fitness value less than 0.00, whichever is earlier |
Maximum number of genes and tree depth | 4–5 |
Mathematical operations | +, −, ×, /, sin, cos, exp, √, {} |

Term | Value | Weight |
---|---|---|
Bias | 9.83 | 9.83 |
Gene 1 | $-0.138w/b-0.138fc-0.415CO-0.692X$ | −0.138 |
Gene 2 | $451\frac{f{c}^{2}\times w/b\sqrt{CO}}{R{H}^{3}}$ | 451 |
Gene 3 | $\frac{83.2\times {10}^{3}w/b}{RH\left(fc-CO\right)\left(fc-RH+\frac{CO}{w/b}\right)}$ | 83,300 |
Gene 4 | $4.39\sqrt{\frac{RH\times CO}{fc}}$ | 4.39 |
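
For readers who want to evaluate the evolved MGGP expression numerically, the sketch below assumes that the predicted k is the bias plus the sum of the four gene expressions exactly as printed in the Value column, with fc in MPa, CO and RH in the units of the data summary, and X as the exposure class. That reading of the table, the function name and the example inputs are assumptions, not statements from the study.

```python
import math

def predict_k_mggp(wb, fc, co, rh, x):
    """Carbonation coefficient k from the tabulated genes (assumed additive)."""
    bias = 9.83
    gene1 = -0.138 * wb - 0.138 * fc - 0.415 * co - 0.692 * x
    gene2 = 451 * fc ** 2 * wb * math.sqrt(co) / rh ** 3
    # Gene 3 is undefined when fc == co or when fc - rh + co / wb == 0.
    gene3 = 83.2e3 * wb / (rh * (fc - co) * (fc - rh + co / wb))
    gene4 = 4.39 * math.sqrt(rh * co / fc)
    return bias + gene1 + gene2 + gene3 + gene4

# Hypothetical mix: w/b = 0.5, fc = 40 MPa, CO = 5, RH = 65, exposure class 1.
k_example = predict_k_mggp(wb=0.5, fc=40.0, co=5.0, rh=65.0, x=1)
```

Gene 3 contains differences in its denominator, so the expression should only be applied to input combinations that keep that denominator away from zero and that lie within the ranges of the training data.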

Measure | MT | RF | MGGP |
---|---|---|---|
Time required for modelling | Building the model: 0.08 s; testing the model: 0.01 s | 40.5104 s | 14 min 88 s |
r | 0.953 | 0.955 | 0.936 |
RMSE | 3.871 | 3.584 | 4.453 |
MAE | 2.341 | 2.032 | 2.546 |
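
The three error measures reported in the comparison above follow their usual definitions: the Pearson correlation coefficient (r), the root mean square error (RMSE) and the mean absolute error (MAE). A minimal sketch is given below; the function names and the sample values are illustrative only and do not reproduce the study's data.

```python
import math

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# Illustrative observed and predicted carbonation coefficients.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]

r_value = pearson_r(observed, predicted)
rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
```

RMSE penalises large individual errors more heavily than MAE, which is why the two are usually reported together alongside r.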

Measure | Artificial Neural Networks (ANNs) | Genetic Programming (GP) | Multiple Linear Regression (MLR) |
---|---|---|---|
Correlation coefficient, r | 0.940 | 0.937 | 0.917 |
Root mean square error, RMSE | 4.554 | 4.510 | 5.019 |
Mean absolute error, MAE | 2.991 | 2.598 | 3.371 |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Londhe, S.; Kulkarni, P.; Dixit, P.; Silva, A.; Neves, R.; de Brito, J.
Tree Based Approaches for Predicting Concrete Carbonation Coefficient. *Appl. Sci.* **2022**, *12*, 3874.
https://doi.org/10.3390/app12083874
