Article

Machine Learning Prediction of the Compressive Bearing Capacity of Concrete-Filled Steel Tubes Using Random Forest

1
Guangxi Pinglu Canal Construction Co., Ltd., Nanning 530004, China
2
Guangxi Road Construction Engineering Group Co., Ltd., Nanning 530004, China
3
Guangxi Laboratory of Modern Canal, Nanning 530004, China
4
College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China
*
Author to whom correspondence should be addressed.
Materials 2026, 19(12), 2511; https://doi.org/10.3390/ma19122511
Submission received: 21 April 2026 / Revised: 15 May 2026 / Accepted: 22 May 2026 / Published: 10 June 2026
(This article belongs to the Special Issue Advanced Concrete and Cementitious Composite Materials)

Highlights

  • High-precision RF model for CFST capacity using 154 tests and 24 inputs.
  • Steel tube inertia (Is) dominates prediction (35.56% contribution).
  • Validated by nine CFST tests with prediction errors within 5%.

Abstract

Concrete-filled steel tube (CFST) members are widely used in long-span and high-rise structures due to their high load-bearing capacity and structural efficiency. Accurate prediction of their compressive bearing capacity is essential for reliable design. In this study, a data-driven prediction model based on the Random Forest (RF) algorithm was developed using a database of 154 axial compression tests. A total of 24 parameters, including geometric dimensions, material properties, and sectional characteristics, were considered as input variables, and the model was optimized through five-fold cross-validation and hyperparameter tuning. The results indicate that the proposed model achieves high accuracy and stability, with mean predicted-to-experimental ratios of 1.002 and 0.989 for the training and testing sets, respectively, and maximum deviations within 15%. Compared with existing design codes and alternative machine learning methods, the RF model improves prediction accuracy by approximately 9% and exhibits strong generalization capability. Furthermore, independent experimental validation using nine CFST column tests confirms its reliability, with prediction errors within 5%. These findings demonstrate that the proposed model provides an effective and practical tool for predicting the compressive bearing capacity of CFST members in engineering applications.

1. Introduction

Steel-concrete composite structures have been extensively applied in bridges, high-rise buildings, and large-span spatial structures, owing to their high bearing capacity, large stiffness, excellent seismic performance, and significant economic benefits [1]. Figure 1 and Figure 2 present typical engineering applications in which CFST members serve as critical load-bearing components: Figure 1 shows the application of CFST in practical engineering, while Figure 2 illustrates concrete-filled double skin steel tubular (CFDST) structures in power transmission engineering. Representative examples also include super high-rise buildings such as the SEG Plaza and CITIC Plaza [2,3,4]. These structures are also widely used in large-scale infrastructure projects such as the Pinglu Canal housing construction, which adopts a “permanent–temporary integrated” construction mode in which CFST-based composite members are extensively employed as critical load-bearing components. This practice provides a replicable engineering paradigm for the broader application of steel-concrete composite structures in major infrastructure projects [5]. However, as complex hybrid systems, steel-concrete composite structures such as the CFDST structures shown in Figure 2 exhibit mechanical behaviors governed by the coupling of many factors, including material nonlinearity, geometric nonlinearity, interface slip, construction errors, and initial imperfections [2,6]. Accurately predicting the ultimate bearing capacity of members under various loading conditions has therefore become a core challenge in structural design and safety assessment [7,8,9].
Traditional calculation methods based on superposition theory ignore the complex interactions between steel and concrete, resulting in limited computational accuracy. In particular, when dealing with high-strength materials, complex stress states, or novel section types, these methods face severe challenges in terms of applicability and accuracy [10].
Researchers studying the ultimate bearing capacity of traditional steel-concrete composite structures typically employ unified theory and superposition theory as theoretical models for calculating ultimate bearing capacity [11]. Unified theory treats the CFST components in steel-concrete composite structures as an integrated unit, solving for ultimate load through equilibrium differential equations that consider material nonlinearity and geometric imperfections. While it can better reflect the interactions between materials, its mathematical derivation is complex and requires numerous simplifying assumptions, limiting its application under complex loading conditions [12,13]. In contrast, superposition theory is simpler and more practical, treating bearing capacity as the simple superposition of the bearing capacities of steel and concrete, and roughly considering confinement effects through confinement coefficients [14]. While this theory facilitates engineering design, it cannot be applied to torsional members and fails to account for the complex interactions between steel and concrete, presenting deficiencies in calculating ultimate axial compressive strength [15].
To meet the demands of modern engineering, researchers have begun exploring the engineering applications of machine learning algorithms. Classical algorithms such as support vector machines excel at handling small-sample and nonlinear problems but are sensitive to hyperparameters and exhibit lower training efficiency [16,17]. Deep Learning (DL) models and artificial neural networks possess powerful function approximation capabilities, but they rely heavily on massive datasets and suffer from poor interpretability, making them less suitable for structural engineering, where experimental data are often limited and physical interpretability is mandatory [18]. For the complex regression problem of CFST ultimate bearing capacity, which involves high nonlinearity and parameter coupling, ensemble learning methods demonstrate significant advantages. Although gradient boosting decision trees can slightly outperform Random Forests in accuracy, they require longer training times and are more sensitive to outliers [19,20]. Conversely, the Random Forest (RF) algorithm reduces the risk of overfitting through ensemble learning, provides robust generalization on small-to-medium datasets, and offers feature importance evaluation, which is crucial for engineering validation and physical interpretability. In addition, owing to their parallel training and robustness to noise, Random Forests perform well in terms of computational efficiency and model stability.
As an ensemble learning algorithm, Random Forest reduces the risk of overfitting by constructing multiple decision trees, and it features high robustness and generalization capability, insensitivity to noise, and the ability to evaluate feature importance [21,22]. It has been widely applied in fields such as ground surface settlement prediction and slope stability analysis [23,24,25,26]. For the regression problem of the ultimate bearing capacity of steel-concrete composite structural members, which is highly nonlinear and involves parameter coupling [27], Random Forest is naturally applicable and enables high-precision prediction for axially and eccentrically compressed members [28,29]. Meanwhile, the field of steel-concrete composite structures has accumulated abundant experimental and numerical simulation data, providing a solid foundation for training high-precision models [21,30,31,32]. In summary, with high precision, high efficiency, strong robustness, a degree of interpretability, and extensive data support, Random Forest is a well-suited tool for the nonlinear regression problem of ultimate bearing capacity prediction for CFST members.
A Random Forest-based machine learning predictive model is proposed in this study to estimate the ultimate bearing capacity of CFST members. To validate the model, a series of axial compression tests was performed on CFST specimens, involving comprehensive characterization of the material properties of steel tubes and core concrete, as well as measurement of the ultimate compressive capacities. The results indicate that the RF model yields predictions in close agreement with the experimental data, thereby verifying both its computational accuracy and practical applicability. Ultimately, the novelty of this study lies in explicitly quantifying the structural mechanics hierarchy through data-driven feature importance, revealing that the steel tube inertia dominates the prediction with a 35.56% contribution. Unlike existing black-box ML models that focus solely on prediction accuracy, this quantification provides new physical insights that complement classical CFST theory, demonstrating that the top seven parameters collectively account for over 80% of the variance, which aligns with the confinement mechanism. Furthermore, the proposed model achieves a mean predicted-to-experimental ratio of 0.989 for the testing set with maximum deviations within 15%, improving prediction accuracy by approximately 9% compared with existing design codes, and is rigorously validated by nine independent CFST column tests with prediction errors within 5%.

2. Experimental Program

2.1. Experimental Database

To accurately predict the ultimate bearing capacity of steel-concrete components using a Random Forest (RF) model, a comprehensive dataset of axially compressed specimens is essential. This study compiled 154 experimental data points from the literature [33,34,35,36,37,38,39,40,41,42,43], encompassing a wide range of material strengths and slenderness ratios: 18.0 ≤ D/t ≤ 165, 8.4 ≤ λ ≤ 168, 216 MPa ≤ fy ≤ 617.8 MPa, and 16 MPa ≤ fck ≤ 121.1 MPa, ensuring good sampling coverage. The first 24 parameters, including structural dimensions such as the diameter (D) and wall thickness (t) of the concrete-filled steel tube and the column length (L), serve as the input data. The 25th parameter, bearing capacity (Nt), is the sole output for prediction. Descriptive statistics were computed for the entire dataset, including count, mean, standard deviation (std), minimum (min), 25th percentile (25%), median (50%), 75th percentile (75%), and maximum (max), as shown in Table 1.
The dataset, comprising 154 samples, provides a sufficient basis for model training and validation, as it comprehensively encompasses the critical parameter space, and data characteristics were further analyzed using variable boxplots and violin plots.
Based on the component model variables, we conducted a detailed analysis of the overall distribution of the data. After incorporating the feature data, we plotted a boxplot, as illustrated in Figure 3.
The boxplot indicates that features D, t, L, fy, fcu, fc and fck are symmetrically distributed around the mean, with upper and lower quartiles within [−2.5, 2.5]. Features λ, ξ, and C1 show pronounced outliers, with λ mainly below the mean and ξ and C1 primarily above it. Features lc, ls, Es, B and fym cluster around the mean, exhibiting minimal variance. Due to the large number of variables, violin plots were employed to present the distributions more clearly and concisely, as shown in Figure 4.
The violin plot indicates that features L, fcu, fc, fck, Ec and C exhibit bimodal distributions, while t, fy, λ, Ac, As, and fmy display unimodal distributions. Features Ac, As, lc, ls, and ξ show long upper tails, whereas λ and Es have long lower tails. These long-tailed patterns represent valid outliers arising from natural variability in data generation rather than measurement errors, and therefore do not compromise the integrity or suitability of the dataset for analysis.

2.2. Data Quality and Heterogeneity

The 154 groups of experimental data compiled in this paper come from multiple sources and provide rich sample diversity for model training. However, multi-source data fusion inevitably introduces heterogeneity, so the data quality must be evaluated carefully. In terms of data consistency, different studies may differ in loading system, end constraint conditions, and the methods used to measure specimen geometry. The descriptive statistics of some variables in Table 1 and the outliers of λ, ξ and C1 in Figure 3 and Figure 4 reflect not only the natural variability of the parameters themselves but possibly also systematic differences in test conditions between studies. This study retains these outliers for two reasons: they conform to the real physical mechanism of data generation, and eliminating them may distort the model's predictions in the extreme parameter range.
In terms of measurement uncertainty, the determination of the steel yield strength fy depends on the tensile test equipment and sample processing accuracy in different laboratories. The concrete cube compressive strength fcu is affected by factors such as curing conditions, loading rate, and the flatness of the end faces of the test block, and the point selection method for Ec may also be inconsistent among sources. Because most of the source documents do not fully report these measurement details, this paper uniformly adopted the measured mean value reported in each document during data collection and assumed that the measurement errors of each source are randomly distributed within acceptable engineering tolerance.

3. Research Methodology

3.1. Basic Principles of Random Forest Method

RF is an ensemble learning method built from multiple classification and regression trees (CARTs). A decision tree is a statistical model that outputs different classes or values based on input features [44]. Each tree is grown on a random sample of the data with randomly selected features, and the individual outputs are combined by majority voting (classification) or averaging (regression), which yields a stable and accurate final prediction, as shown in Equation (1). Given the set of trees {H (x, θi), i = 1, 2, …, k}, the Random Forest prediction is the average of all individual decision tree predictions:
$$\bar{H}(x) = \frac{1}{k}\sum_{i=1}^{k} H(x, \theta_i) \tag{1}$$
In the formula, H̄ (x) represents the predicted value of the Random Forest model, θi denotes the random variable of a single decision tree, x is the feature variable, and k is the number of decision trees. The Random Forest algorithm uses bootstrapping: k training sample sets are drawn at random, with replacement, from the original data, each of the same size as the original, and one decision tree is trained on each. The samples not drawn into a given training set form the out-of-bag (OOB) data for that tree. The input sample set is given in Equation (2):
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\} \tag{2}$$
Here, the weak learner is trained for a fixed number of iterations, and the final strong learner H̄ (x) is the output.
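The bootstrap sampling and averaging described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names and the three stand-in "trees" are hypothetical, and the sample size of 123 is borrowed from the training set used in Section 3.2.1. The expected number of distinct samples in one bootstrap draw, n·(1 − (1 − 1/n)^n), is roughly 78 for n = 123, which is in line with the 79 unique samples reported there for the 19th tree.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]

def out_of_bag(n, drawn):
    """Indices that were never drawn into the bootstrap set (OOB data)."""
    seen = set(drawn)
    return [i for i in range(n) if i not in seen]

def forest_predict(trees, x):
    """Equation (1): average the predictions of the k trees."""
    return sum(tree(x) for tree in trees) / len(trees)

rng = random.Random(0)          # fixed seed, chosen arbitrarily
n = 123                         # training-set size used in Section 3.2.1
drawn = bootstrap_indices(n, rng)
oob = out_of_bag(n, drawn)

# Expected number of distinct samples in one bootstrap set
expected_unique = n * (1 - (1 - 1 / n) ** n)

# Hypothetical stand-ins for trained trees, only to exercise the averaging
trees = [lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0, lambda x: 2.0 * x - 1.0]

print(len(set(drawn)), len(oob), round(expected_unique, 1))
print(forest_predict(trees, 3.0))
```

The unique and OOB counts always sum to n, so every tree has a held-out subset available for error estimation without a separate validation split.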

3.2. Model Development

Random Forest (RF) comprises multiple independent decision trees, each structured with nodes, branches, and leaf nodes, where nodes test specific attributes to split the tree. Individual decision trees are highly sensitive to training data and prone to overfitting [44]. The RF algorithm mitigates this issue through bagging [45], enhancing robustness. In this study, the number of trees was varied between 10 and 200, with 5-fold cross-validation applied. The key parameters (maximum number of trees, maximum depth, minimum samples per leaf, minimum samples per split, and maximum features) were optimized based on their effects on mean squared error (MSE). Additional validation methods could be considered in future work, but the cross-validation and held-out test evaluation adopted here provide a relatively thorough assessment of predictive accuracy.
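As a rough illustration of the 5-fold procedure, the sketch below splits sample indices into five folds and averages the validation MSE over them. It is a minimal standard-library version, not the authors' implementation: `k_fold_indices`, `cross_val_mse`, the seed, and the mean-value stand-in model are all hypothetical names and choices introduced here.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, val

def cross_val_mse(fit, predict, X, y, k=5):
    """Average validation MSE over k folds for any fit/predict pair."""
    scores = []
    for train, val in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        errs = [(predict(model, X[i]) - y[i]) ** 2 for i in val]
        scores.append(sum(errs) / len(errs))
    return sum(scores) / k

# Stand-in model: always predicts the mean of the training targets
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, x):
    return model

X = [[float(i)] for i in range(123)]   # 123 training samples, one dummy feature
y = [2.0] * 123                        # constant target, for illustration only
print(cross_val_mse(fit_mean, predict_mean, X, y))
```

Any regressor exposing a fit and a predict step can be plugged in, so the same loop can score each candidate hyperparameter setting on identical folds.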

3.2.1. 5-Fold Cross-Validation Results

Following K-fold cross-validation principles for Random Forest [46], modeling was conducted using 5-fold cross-validation. The dataset was split into 80% for training and 20% for testing. The component database underwent preprocessing prior to modeling and analysis, with the procedure outlined as follows:
(1)
Input the following sample set:
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, \quad N = 123$$
The number of iterations of the weak learner is T = 25 (one decision tree per iteration), and the final strong learner H (x) is the output.
(2)
For each iteration t = 1, 2, …, 25, the training set is bootstrap-sampled 123 times with replacement to generate a sampling set Dt of 123 samples. The model comprises 25 decision trees. For instance, in constructing the 19th tree, 123 bootstrap draws from the 123 training samples yield 79 unique samples, while m features are randomly selected for splitting. These 79 samples serve as the training set for model computation and output generation:
$$x_j = A_s, \quad s = -0.223$$
Feature As is chosen as the root node, splitting the binary tree into left and right subtrees at −0.223. These values define the optimal split feature j and split point s, producing the current output.
value = 7.272, squared_error = 0.427
The 79 sample points are evaluated against the condition As ≤ −0.223: 38 points satisfy the condition, while 41 do not. The 38 satisfying points are used in the calculation to obtain xj = Ac and s = −0.756, which are then used to compute the current output value.
value = 6.752, squared_error = 0.231
The 38 sample points are evaluated against Ac ≤ −0.756, with 6 samples satisfying the condition and 32 not. The process is repeated for the 6 satisfying points to determine xj = fscg and s = −0.882, which are then used to compute the current output value.
value = 5.894, squared_error = 0.003
The 6 sample points are evaluated against fscg ≤ −0.882, with 4 satisfying the condition and 2 not. The procedure is repeated for the 4 satisfying points to determine xj = ξ and s = −0.255, which are then applied to compute the current output value.
value = 5.858, squared_error = 0
The stopping condition is met when the four sample points are evaluated against ξ ≤ −0.255: two samples satisfy the condition, producing an output of 5.841 and forming a leaf node. Returning to the previous node, the remaining two samples do not satisfy the condition, yielding an output of 5.874 with zero squared error and forming another leaf node. Similarly, the two samples that do not satisfy fscg ≤ −0.882 yield an output of 5.968 with a squared error of 0, creating an additional leaf node. Continuing this recursive process constructs the complete 19th decision tree, 19D, as shown in Figure 5.
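The split search that produces each (xj, s) pair in the walkthrough can be sketched as an exhaustive scan over features and candidate thresholds that minimizes the summed squared error of the two child nodes. The code below is a simplified illustration, not the authors' code: the toy data are invented, midpoints between distinct values are one common choice of candidate thresholds, and reading "value" as the node mean and "squared_error" as the node's mean squared deviation is an assumption about how the reported node labels were produced.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def node_summary(values):
    """Return (value, squared_error): node mean and mean squared deviation."""
    return sum(values) / len(values), sse(values) / len(values)

def best_split(X, y):
    """Return (feature index j, threshold s, child SSE) with minimal child SSE."""
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values
        for lo, hi in zip(levels, levels[1:]):
            s = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (j, s, cost)
    return best

# Invented standardized features and targets, for illustration only
X = [[-1.0, 0.3], [-0.5, -0.2], [0.5, 0.1], [1.0, -0.4]]
y = [5.8, 5.9, 7.1, 7.3]

j, s, cost = best_split(X, y)
print(j, s, cost)
print(node_summary(y))
```

Applying `best_split` recursively to the left and right subsets until a stopping rule is met (for example, a minimum number of samples per leaf) grows one complete regression tree.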
The decision tree construction is repeated 25 times, generating 25 trees. Bagging is applied, and the regression outputs are averaged to form the final Random Forest prediction model. The predictions are evaluated on the 123 samples using the mean squared error, where L is the number of evaluated samples:
$$MSE = \frac{1}{L}\sum_{i=1}^{L}\left(y_{obs,i} - y_{pred,i}\right)^2$$
The calculation yields MSE = 0.0081. As further validation, the coefficient of determination R2 and the mean absolute error (MAE) are also calculated to provide more robust evaluation metrics.
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
The calculation result shows that R2 ≈ 0.9791.
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
The calculation result shows that MAE = 112.33. Relative to the range of true values (348~5927), the MAE is small, consistent with R2 = 0.979 and MSE = 0.0081, indicating that the model has high accuracy.
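The three metrics can be computed directly from paired observed and predicted values, as in the minimal sketch below. The sample numbers are invented for illustration. One point worth noting: for any set of errors on a single scale, MSE is never smaller than MAE squared, so the reported MSE of 0.0081 and MAE of 112.33 are presumably expressed on different scales (for example, a transformed target for MSE and original units for MAE); the text does not state this, so treat it as an inference.

```python
def mse(y_obs, y_pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs)

def mae(y_obs, y_pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(y_obs, y_pred)) / len(y_obs)

def r2(y_obs, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot

# Invented values, for illustration only
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

mse_val, mae_val, r2_val = mse(y_obs, y_pred), mae(y_obs, y_pred), r2(y_obs, y_pred)
print(mse_val, mae_val, r2_val)

# On a single scale, MSE >= MAE**2 always holds
print(mse_val >= mae_val ** 2)
```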

3.2.2. Parameter Combination Optimization

The decision tree parameters—max decision trees, max depth, min sample leaf, min samples split, and max features—along with the training set division ratio, are optimized to improve the accuracy of the RF prediction model. These parameters are combined and fine-tuned to achieve the best prediction performance.
The optimization results indicate that the number of decision trees stabilizes at an MSE of 120, with 25 trees selected for optimal accuracy and efficiency, as shown in Figure 6. The maximum subtree depth achieves stable MSE beyond a depth of 5, and a depth of 10 prevents over-parametrization while maintaining accuracy, as shown in Figure 7. The minimum leaf samples are optimal at 2, as shown in Figure 8, and the minimum split samples perform best at the default value of 2, as shown in Figure 9. The maximum number of features is optimal at 10, as shown in Figure 10, ensuring low MSE and model stability. An 80% training set proportion provides low MSE and reliable results, as shown in Figure 11.
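A combined search over these parameters can be sketched as a plain grid enumeration. The sketch below is hedged in several ways: the parameter names follow scikit-learn's RandomForestRegressor conventions, which is an assumption since the text does not name a library; the candidate values other than the selected ones are invented; and `toy_score` is a placeholder so the loop can run, not the paper's error surface. In practice the score would be the 5-fold cross-validated MSE of a model trained with each combination.

```python
from itertools import product

# Values reported as selected in the text
reported = {
    "n_estimators": 25,
    "max_depth": 10,
    "min_samples_leaf": 2,
    "min_samples_split": 2,
    "max_features": 10,
}

# Hypothetical candidate grid that contains the reported values
grid = {
    "n_estimators": [10, 25, 50, 100, 200],
    "max_depth": [5, 10, 15],
    "min_samples_leaf": [1, 2, 4],
    "min_samples_split": [2, 4],
    "max_features": [5, 10, 24],
}

def grid_search(grid, score):
    """Return (best parameters, best score, combinations tried); lower is better."""
    names = list(grid)
    best_params, best_score, tried = None, float("inf"), 0
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        current = score(params)
        tried += 1
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score, tried

def toy_score(params):
    """Placeholder score only: distance from the reported setting."""
    return sum(abs(params[name] - reported[name]) for name in reported)

best_params, best_score, tried = grid_search(grid, toy_score)
print(best_params, tried)
```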

4. Model Prediction Results

4.1. Result Coefficient

The importance coefficients from the optimized parameter model in Section 3.2.2 are shown in Figure 12, which ranks the independent variables by their contribution to the RF model. Among the 24 features, Is contributes the most, accounting for 35.56%. As, D, Ic, L, Ac, and t account for 17.25%, 8.74%, 8.33%, 5.01%, 4.78%, and 3.31%, respectively. Ec, C, fscg, fck, fyr, and λ contribute 2.44%, 2.38%, 1.42%, 1.31%, 1.28%, and 1.21%, respectively. Each of the other variables contributes less than 1%. This shows that the ultimate bearing capacity is primarily influenced by Is, with As, D, Ic, L, Ac, and t also playing important roles. In practical engineering, careful attention should be given to selecting values for these variables. Although the impact of Ec, C, fscg, fck, fyr, and λ is smaller, they should be adjusted with appropriate reduction factors when constructing the bearing capacity equation for comprehensive analysis.
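The cumulative shares implied by these percentages can be checked with a short calculation. The sketch uses only the thirteen values quoted in the text; the dictionary keys are plain-text spellings of the symbols, and the remaining eleven features are not listed individually in the source, so only their combined share can be derived.

```python
# Importance percentages as quoted in the text
importance = {
    "Is": 35.56, "As": 17.25, "D": 8.74, "Ic": 8.33, "L": 5.01,
    "Ac": 4.78, "t": 3.31, "Ec": 2.44, "C": 2.38, "fscg": 1.42,
    "fck": 1.31, "fyr": 1.28, "lambda": 1.21,
}

ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

running = 0.0
for name, pct in ranked:
    running += pct
    print(f"{name:>6s}  {pct:5.2f}%  cumulative {running:6.2f}%")

top7 = sum(pct for _, pct in ranked[:7])          # Is, As, D, Ic, L, Ac, t
listed = sum(importance.values())                 # all thirteen quoted features
remaining = 100.0 - listed                        # share left for the other eleven
print(round(top7, 2), round(listed, 2), round(remaining, 2))
```

The top seven features sum to 82.98%, which matches the statement in the Introduction that they account for over 80% of the total, and the eleven unlisted features share about 7%.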
While the Random Forest model assigns feature importance values (e.g., Is = 35.56%), it is essential to link these results to structural mechanics principles to provide engineering insight. For instance, the high contribution of the steel tube moment of inertia (Is), which increases with the tube diameter and wall thickness and hence with the amount of steel in the section, aligns with classical CFST column theory, where the steel content significantly affects both confinement and ductility. Higher steel ratios improve the column’s ability to resist axial loads and enhance post-yield behavior due to the interaction between steel and concrete.
Similarly, the concrete compressive strength, although it carries a smaller importance share (fck, 1.31%), directly influences the ultimate axial capacity. According to the standard design formula, the axial strength of CFST columns is a function of both steel and concrete contributions. The model’s feature importance reflects this underlying physical relationship, supporting the view that the ML predictions are not only statistically accurate but also physically meaningful.
This interpretability enables engineers to understand which parameters most strongly affect performance, supporting informed design and optimization of CFST structures, beyond the purely predictive capability of the model.
Optimized training samples from 154 datasets capture the macroscopic mechanical behavior of component ultimate bearing capacity under various conditions. Statistical results are presented in Table 2, enabling a clear comparison between predicted and experimental values. As shown in Figure 13a and Figure 14a, the RF model predictions for the training (80%) and test (20%) sets closely align with experimental results, with curves nearly overlapping, indicating minimal deviation and high accuracy. Figure 13b and Figure 14b show that predicted points are evenly distributed, with most coinciding with the isoline and maximum deviations within 15%, confirming strong generalization and avoidance of overfitting. These results validate the improved RF model’s predictive reliability, supporting rapid and accurate estimation of ultimate bearing capacity for concrete-filled steel tubes.
Table 3 presents the average value (AVG), standard deviation (σ), and coefficient of variation (COV) of the ratio between predicted and experimental values. The low σ and COV indicate minimal deviation between predicted and actual values.
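These three statistics follow directly from the list of predicted-to-experimental ratios, as the sketch below shows. The capacities used are invented for illustration, and the use of the population standard deviation is an assumption, since the text does not say whether the population or sample form was used.

```python
import statistics

def ratio_stats(predicted, experimental):
    """Return (AVG, sigma, COV, max deviation) of predicted/experimental ratios."""
    ratios = [p / e for p, e in zip(predicted, experimental)]
    avg = statistics.mean(ratios)
    sigma = statistics.pstdev(ratios)     # population form, assumed
    cov = sigma / avg
    max_dev = max(abs(r - 1.0) for r in ratios)
    return avg, sigma, cov, max_dev

# Invented capacities, for illustration only
predicted = [950.0, 1050.0, 1000.0, 1000.0]
experimental = [1000.0, 1000.0, 1000.0, 1000.0]

avg, sigma, cov, max_dev = ratio_stats(predicted, experimental)
print(avg, sigma, cov, max_dev)
```

An AVG close to 1 indicates low bias, while σ and COV describe scatter; the maximum deviation corresponds to the percentage bands drawn around the isoline in the scatter plots.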
Figure 15a,b show the histogram distributions of predicted-to-experimental ratios of ultimate bearing capacity for the training and testing datasets. For both datasets, the median and mean of the ratio distributions are close (1.0008 and 1.0020 for training, 0.9849 and 0.9890 for testing), indicating minimal bias and high accuracy of the improved RF prediction model.
For practical engineering applications, the proposed RF model can be deployed as an interactive computational tool or Application Programming Interface (API), enabling the direct input of the 24 parameters for rapid bearing capacity estimation without requiring machine learning expertise. Based on the feature importance analysis, priority should be given to the selection of Is, As and D during the preliminary design phase, as these parameters dominate the prediction. However, the practical application of this model has inherent limitations: Predictions are reliable only within the parameter ranges of the training dataset; extrapolation beyond these bounds may lead to inaccuracies. The current model does not explicitly account for long-term loading effects, local buckling, or initial geometric imperfections beyond their implicit capture in the experimental dataset. Future work should incorporate numerical simulation data to expand the applicable parameter space and physical constraints.
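Because predictions are reliable only inside the training parameter space, a deployed tool could check inputs against the database ranges before calling the model. The sketch below is a hypothetical helper, not part of the published model: the bounds are the four ranges quoted in Section 2.1, strengths are assumed to be in MPa, and the key names are invented. A full check would cover all 24 inputs using Table 1.

```python
# Ranges quoted in Section 2.1 (strengths assumed to be in MPa)
TRAINING_RANGES = {
    "D_over_t": (18.0, 165.0),
    "lambda": (8.4, 168.0),
    "fy": (216.0, 617.8),
    "fck": (16.0, 121.1),
}

def out_of_range(sample, ranges=TRAINING_RANGES):
    """Return the names of inputs that fall outside the training ranges."""
    flagged = []
    for name, (low, high) in ranges.items():
        if not low <= sample[name] <= high:
            flagged.append(name)
    return flagged

# Hypothetical design cases
inside = {"D_over_t": 40.0, "lambda": 30.0, "fy": 355.0, "fck": 40.0}
outside = {"D_over_t": 200.0, "lambda": 30.0, "fy": 690.0, "fck": 40.0}

print(out_of_range(inside))
print(out_of_range(outside))
```

Returning the offending parameter names, rather than a single pass/fail flag, lets the caller warn the user about exactly which inputs require extrapolation.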

4.2. Discussion on Physical Interpretability

The RF is a typical data-driven model, and the order of characteristic importance revealed by it is highly consistent with the classical mechanical behavior of CFST, which provides a strong physical basis for the prediction results of the model.
The results showed that Is (35.56%), D (8.74%), Ic (8.33%) and L (5.01%) were among the most important characteristics. These parameters jointly define the slenderness ratio and the section bending stiffness. In structural mechanics, slenderness ratio and flexural stiffness are the core parameters that control whether overall buckling instability of a member occurs under axial load. Especially for long- and medium-length columns, the ultimate bearing capacity is far lower than the section strength and is mainly determined by stability performance. This shows that the feature importance ranking of the Random Forest model is not a random statistical result but a reflection of the physical reality that, under axial compression, the contribution of geometric stiffness is greater than that of material strength, which conforms to the basic principles of structural mechanics.

5. Comparative Analysis of Model Effects

5.1. Comparison with Standard Procedures of Various Countries

Based on the experimental database, the results of the ultimate bearing capacity N0 of components in various national standard specifications are calculated and compared with the results of the RF prediction model.
The ultimate bearing capacity N0 of components in GB 50936-2014 [47] is
$$N_0 = A_{sc} f_{sc}, \qquad f_{sc} = \left(1.212 + B\xi + C\xi^2\right) f_c$$
$$B = 0.176 f_s / 213 + 0.974$$
$$C = -0.104 f_c / 14.4 + 0.031$$
where fsc and Asc represent the design value of composite section strength and section area, respectively; B and C are coefficients; ξ is the confinement coefficient of the member; and fc is the design value of concrete compressive strength.
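The GB 50936-2014 expression can be evaluated with a small function, sketched below under stated assumptions. The leading coefficient of C is taken as negative, as in the code provisions; units are assumed to be MPa for strengths, mm² for area, and N for the result; the confinement coefficient ξ is passed in rather than computed, since its formula is not given in this section; and the function name and sample inputs are hypothetical.

```python
def gb50936_n0(a_sc, xi, f_s, f_c):
    """Axial capacity N0 = A_sc * f_sc per the GB 50936-2014 form quoted above.

    a_sc: composite section area (mm^2, assumed)
    xi:   confinement coefficient (dimensionless, supplied by the caller)
    f_s:  steel design strength (MPa, assumed)
    f_c:  concrete design compressive strength (MPa, assumed)
    """
    b = 0.176 * f_s / 213 + 0.974
    c = -0.104 * f_c / 14.4 + 0.031   # negative leading term assumed
    f_sc = (1.212 + b * xi + c * xi ** 2) * f_c
    return a_sc * f_sc

# Hypothetical inputs chosen so that B = 1.15 and C = -0.073
n0 = gb50936_n0(a_sc=1000.0, xi=1.0, f_s=213.0, f_c=14.4)
print(n0)
```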
The DBJ/T13-51-2010 [48] has a similar expression format to the aforementioned standard, but adopts a different composite compressive strength for CFST sections:
$$f_{sc} = \left(1.14 + 1.02\xi\right) f_c$$
The AISC [49] specification calculates the axial compressive strength capacity based on the width-to-thickness ratio in three scenarios:
$$N_0 = \begin{cases} f_y A_s + 0.95 f_c A_c, & \lambda \le \lambda_p \\ P_p - \left[P_p - \left(f_y A_s + 0.7 f_c A_c\right)\right] \dfrac{(\lambda - \lambda_p)^2}{(\lambda_r - \lambda_p)^2}, & \lambda_p < \lambda \le \lambda_r \\ \dfrac{0.72 f_y A_s}{\left[(D/t)\, f_y / E_s\right]^{0.2}} + 0.7 f_c A_c, & \lambda > \lambda_r \end{cases}$$
where λp and λr represent the limit values of the generalized width-thickness ratio, and Pp is the capacity given by the first expression.
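The three AISC cases map naturally onto a piecewise function, sketched below. Several points are assumptions rather than statements from the text: λ is taken as D/t, Pp is taken as the first-case capacity, the limits λp and λr are passed in by the caller because their values are not given here, and the function name and sample inputs are hypothetical (MPa and mm² assumed, result in N).

```python
def aisc_n0(lam, lam_p, lam_r, f_y, f_c, a_s, a_c, e_s):
    """Piecewise axial capacity following the three AISC cases quoted above.

    lam is taken as D/t; lam_p and lam_r are the limiting ratios,
    supplied by the caller.
    """
    p_p = f_y * a_s + 0.95 * f_c * a_c     # first case
    p_y = f_y * a_s + 0.7 * f_c * a_c      # value reached at lam = lam_r
    if lam <= lam_p:
        return p_p
    if lam <= lam_r:
        return p_p - (p_p - p_y) * (lam - lam_p) ** 2 / (lam_r - lam_p) ** 2
    f_cr = 0.72 * f_y / (lam * f_y / e_s) ** 0.2
    return f_cr * a_s + 0.7 * f_c * a_c

# Hypothetical section and limits, for illustration only
common = dict(lam_p=100.0, lam_r=127.0, f_y=300.0, f_c=30.0,
              a_s=1000.0, a_c=10000.0, e_s=200000.0)

n_first = aisc_n0(lam=50.0, **common)
n_limit = aisc_n0(lam=127.0, **common)
n_third = aisc_n0(lam=200.0, **common)
print(n_first, n_limit, n_third)
```

The quadratic middle branch interpolates between the first-case capacity at λp and the reduced capacity at λr, so the function is continuous at λp.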
The European Committee for Standardization EC4 Design Code for Steel-Concrete Composite Structures [50]:
$$N_0 = \eta_s f_y A_s + \left(1 + \eta_c \frac{t}{D} \frac{f_y}{f_c}\right) f_c A_c$$
where ηs and ηc are calculation coefficients, and fc denotes the compressive strength of concrete cylinders.
The AIJ Code of the Architectural Institute of Japan [51]:
$$N_0 = 1.27 A_s f_s + 0.85 f_c A_c$$
$$f_s = \min\left(f_y, 0.7 f_u\right)$$
where As and Ac represent the areas of steel tube and concrete, respectively; fc denotes the compressive strength of the concrete cylinder; fs stands for the standard value of steel strength; fy signifies the yield strength of steel; and fu indicates the ultimate tensile strength.
British Standards Committee BS5400 Bridge Design Code [52]:
$$N_0 = 0.95 A_s f_y + 0.45 A_c f_{cc}$$
where fy and fcc represent the strength indices of the steel tube and concrete, respectively.
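The EC4, AIJ, and BS5400 expressions above are simple closed forms and can be written as three small functions for side-by-side comparison. This is a sketch only: the function names and sample inputs are hypothetical, units are assumed to be MPa and mm² (result in N), and the EC4 coefficients ηs and ηc are supplied by the caller because their formulas are not given in this section.

```python
def ec4_n0(eta_s, eta_c, t, d, f_y, f_c, a_s, a_c):
    """EC4 form: eta_s*fy*As + (1 + eta_c*(t/D)*(fy/fc))*fc*Ac."""
    return eta_s * f_y * a_s + (1 + eta_c * (t / d) * (f_y / f_c)) * f_c * a_c

def aij_n0(a_s, a_c, f_y, f_u, f_c):
    """AIJ form: 1.27*As*fs + 0.85*fc*Ac with fs = min(fy, 0.7*fu)."""
    f_s = min(f_y, 0.7 * f_u)
    return 1.27 * a_s * f_s + 0.85 * f_c * a_c

def bs5400_n0(a_s, a_c, f_y, f_cc):
    """BS5400 form: 0.95*As*fy + 0.45*Ac*fcc."""
    return 0.95 * a_s * f_y + 0.45 * a_c * f_cc

# Hypothetical section, for illustration only
n_ec4 = ec4_n0(eta_s=1.0, eta_c=0.0, t=5.0, d=200.0,
               f_y=300.0, f_c=30.0, a_s=1000.0, a_c=10000.0)
n_aij = aij_n0(a_s=1000.0, a_c=10000.0, f_y=300.0, f_u=400.0, f_c=30.0)
n_bs = bs5400_n0(a_s=1000.0, a_c=10000.0, f_y=300.0, f_cc=30.0)
print(n_ec4, n_aij, n_bs)
```

Dividing each result by the measured capacity Nt for every specimen in the database gives the code-to-test ratios that the comparison below is based on.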
A comprehensive analysis of component ultimate bearing capacity considering data volume, calculation accuracy, and applicability of different standards is presented in Table 4 and Figure 16. The ratios of the calculated results N0 from the various standards to the experimental results Nt, along with their mean values and coefficients of variation, are evaluated. The results indicate that AISC, AIJ, and BS5400 generally underestimate the ultimate bearing capacity by over 10%, reflecting overly conservative predictions. In contrast, EC4 results tend to be on the unsafe side, and the local standard DBJ/T13-51-2010 exhibits the poorest stability, so both methods can produce unreasonable results. In terms of mean accuracy, the national standard GB 50936-2014 and the local standard perform relatively well, with the national standard showing an overall error of approximately 4%, although its accuracy is still lower than that of the RF model developed in this study. The proposed RF model exhibits an overall error of about 1%, offering the highest accuracy and best stability. Therefore, the RF model provides a more accurate and reliable prediction of component ultimate bearing capacity.

5.2. Comparison with Other Algorithm Models

To further evaluate algorithm performance, six machine learning models and one deep learning model were applied for data modeling. In addition to the Random Forest model, benchmark algorithms include Decision Tree, Ridge Regression, k-Nearest Neighbors (KNN), AdaBoost, Support Vector Machine (SVM), and Backpropagation (BP) neural network. Data preprocessing, training set partitioning, 5-fold cross-validation, parameter tuning, and hyperparameter optimization were applied consistently across all models.
Comparative results are presented in Table 5 and Figure 17. The BP neural network model has the highest MSE and therefore relatively poor performance. KNN and Ridge Regression have identical MSE values, reflecting similar predictive ability. The Decision Tree model performs better than these, yet remains inferior to the Random Forest model, which achieves the lowest MSE of 0.0081, indicating superior performance. Thus, the Random Forest algorithm provides the most accurate prediction of ultimate bearing capacity for CFST components.
Figure 18 and Figure 19 show that the Decision Tree model fits the data well, with points evenly distributed around the isoline and a maximum deviation within 20%, although its stability is lower than that of the Random Forest model. Ridge Regression and KNN have maximum deviations within 31%, with KNN exhibiting better stability. AdaBoost performs slightly worse than the Decision Tree, with a maximum deviation of 26% and moderate stability. The SVM and BP neural network models perform the poorest, with maximum deviations of 37% and 42%, respectively, and scattered data points. In contrast, the Random Forest model, shown in Figure 14b, produces evenly distributed points that largely overlap the isoline, with a maximum deviation within 15%; its predictions match the experimental results most closely among the models compared. These results should nevertheless be interpreted with caution. The observed performance gain partly stems from the model tuning process, and the model’s predictions are currently limited to columns of the heights tested. Further studies are required to generalize these findings to different loading conditions and structural configurations.
This study addresses the regression prediction task on structured tabular data (24-dimensional features, 154 samples). In such small-scale structured data scenarios, models such as Random Forest and BP neural network are recognized as mainstream benchmarks in the academic community, whereas CNN and GNN are primarily designed for image or topological data. Forced adaptation not only fails to leverage their structural advantages but may also introduce overfitting risks due to over-parameterization. The input features systematically integrate mechanical mechanism parameters, including section constraint effect, shape coefficient, and composite strength, and are comprehensively compared with the design codes of six countries. This effectively internalizes domain knowledge from traditional physical models into a data-driven framework, establishing an indirect benchmark against physics-informed methods. Furthermore, as demonstrated in Figure 19, extensive experiments involving seven types of models (covering classical regression, ensemble learning, and basic deep networks) confirm the significant advantages of Random Forest in prediction accuracy and robustness, which remain sufficiently stable within the current comparative framework.

6. Experimental Verification

6.1. Experimental Design and Discussion

To validate the accuracy of the Random Forest prediction model for ultimate bearing capacity under uniaxial compression, nine concrete-filled steel tubular column specimens were tested using a YAW-10000J microcomputer-controlled electro-hydraulic servo compression-shear machine (Beijing Times peak Technology Co., Ltd., Beijing, China). The measured ultimate capacities were compared with the predictions from the RF model.

6.1.1. Mechanical Property Indicators of Steel Tubes

The steel material property test used samples from the same batch of steel as the steel tubes, with three specimens (A1–A3). The steel tubes were processed into standard test specimens, and their basic mechanical properties were measured using the metal tensile testing method. The test was displacement-controlled, with a loading speed of 1 mm/min [53]. The shape and dimensions of the test specimens followed the standard specifications.
The steel tube material properties were tested using a microcomputer-controlled electronic universal testing machine (Changchun new testing machine Co., Ltd., Changchun, China). The force control system employs a fully digital AC servo controller, while deformation is measured via the relative displacement of the upper and lower clamps, processed through frequency doubling and shaping by digital circuits.
The average test results for the yield limit fy, ultimate strength fu, and elastic modulus Es of each specimen material are presented in Table 6.

6.1.2. Mechanical Performance Indicators of Concrete

Two concrete grades were used, C50 and C30, made with Conch PO42.5 and PO32.5 cement, respectively. The sand was medium-coarse sand, and the stone particle size ranged from 5 to 35 mm. The mix proportion (cement/sand/stone/water) was 1:1.04:2.21:0.34 for C50 concrete and 1:1.24:2.85:0.45 for C30 concrete, with the high-strength water reducer FDN added at 1% of the cement content. For each grade, two sets of 150 mm × 150 mm × 150 mm concrete cubes were cast and cured under the same conditions as the test specimens to measure the cube compressive strength (fcu). Each set consisted of three cubes, whose results were averaged, giving 4 sets and 12 cubes in total. Once the cubes reached the specified age, uniaxial compression tests were conducted on the YZ200A pressure testing machine (Hongshan testing machine factory, Wuhan, China), following the test method specified in GB/T50081-2002 [54]. It is worth noting that while standard uniaxial tests suffice for CFST components, the mechanical responses of larger-scale concrete structures involve more complex nonlinear coupling and dynamic interactions, such as the seismic fluid–structure interactions in concrete gravity dams [55] and the dynamic deformation monitoring in high-rise structures [56]. The cubes were numbered following the pattern C30-07-01, where C30 indicates the concrete grade, 07 indicates compression testing at an age of 7 days, and 01 is the number of the cube within its set. The results of all cubes were considered in the calculation and analysis. The material property test results of the concrete are shown in Table 7.
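The mix proportions are ratios relative to the cement content, so batch quantities follow by simple scaling. The sketch below assumes the ratios are by mass, assumes the 1% FDN dosage applies to both grades, and uses a hypothetical batch of 100 kg of cement; none of these three points is stated explicitly in the text.

```python
MIX_PROPORTIONS = {
    # cement : sand : stone : water, as reported for each grade
    "C50": (1.0, 1.04, 2.21, 0.34),
    "C30": (1.0, 1.24, 2.85, 0.45),
}
FDN_DOSAGE = 0.01  # water reducer at 1% of the cement content


def batch_masses(grade, cement_kg):
    """Scale a mix proportion to masses (kg) for a given cement mass."""
    cement, sand, stone, water = MIX_PROPORTIONS[grade]
    return {
        "cement": cement_kg * cement,
        "sand": cement_kg * sand,
        "stone": cement_kg * stone,
        "water": cement_kg * water,
        "FDN": cement_kg * FDN_DOSAGE,
    }


for grade in MIX_PROPORTIONS:
    batch = batch_masses(grade, cement_kg=100.0)  # 100 kg is a hypothetical batch size
    summary = ", ".join(f"{name} {mass:.1f} kg" for name, mass in batch.items())
    print(f"{grade}: {summary}")
```

The fourth term of each proportion is the water-to-cement ratio, 0.34 for C50 and 0.45 for C30, which is consistent with the higher target strength of the C50 mix.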

6.1.3. Specimen Size Parameters

The experiment used steel tubes with a uniform cross-section of Φ273 × 6 mm (outer diameter 273 mm, wall thickness 6 mm) and material grade Q235B. The steel tubes were spiral tubes. Parameters for each test specimen are listed in Table 8. The specimens were fabricated according to the dimensions shown in Figure 20.
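The sectional features that enter the RF model (As, Ac, Is, and Ic) follow directly from the tube size through the standard formulas for a circular tube and its core. The sketch below evaluates them for Φ273 × 6 mm; the function and key names are illustrative and do not come from the study.

```python
import math


def circular_tube_properties(outer_diameter_mm, wall_thickness_mm):
    """Areas (mm^2) and second moments of area (mm^4) of the tube and concrete core."""
    d_outer = outer_diameter_mm
    d_inner = outer_diameter_mm - 2.0 * wall_thickness_mm
    return {
        "As": math.pi * (d_outer ** 2 - d_inner ** 2) / 4.0,   # steel tube area
        "Ac": math.pi * d_inner ** 2 / 4.0,                    # concrete core area
        "Is": math.pi * (d_outer ** 4 - d_inner ** 4) / 64.0,  # steel tube inertia
        "Ic": math.pi * d_inner ** 4 / 64.0,                   # concrete core inertia
        "D_over_t": outer_diameter_mm / wall_thickness_mm,     # diameter-to-thickness ratio
    }


section = circular_tube_properties(273.0, 6.0)
for name, value in section.items():
    print(f"{name} = {value:.4g}")
```

Because every specimen shares this cross-section, these features are constant across the nine tests, and the differences in predicted capacity come from the remaining inputs such as concrete strength, column length, and eccentricity.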

6.1.4. Specimen Fabrication and Loading

During specimen casting, concrete was densely vibrated and compacted every 30 cm using a Φ50 immersion vibrator. At mid-height, a self-fabricated epoxy-coated steel mesh frame was embedded into the column. Pouring continued for the remaining concrete, compacted by manual vibration and uniform tapping of the steel tube, followed by surface finishing. Specimens were cured at room temperature with regular watering for 28 days.

6.1.5. Specimen Testing Process

As shown in Figure 21, load was applied to the specimen by the YAW-10000J microcomputer-controlled electro-hydraulic servo testing machine. Before each eccentric compression test, two Φ8 steel bars were welded on the upper and lower pads at the designated eccentric distances to form a groove with the base plate, preventing roller shaft slippage during installation and loading. A triangular block was placed under the side opposite the eccentric force to stabilize the specimen and was removed slowly after loading. Loading was terminated and the data were recorded when the specimen’s bearing capacity fell below 80% of the peak load or when severe deformation occurred.
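The load-based stopping rule can be expressed as a short check on the recorded load history. The sketch below covers only the 80%-of-peak criterion; the severe-deformation criterion depends on observation and is not modelled. The load readings are hypothetical values for illustration.

```python
def termination_index(loads_kn, threshold=0.80):
    """Index of the first post-peak reading below threshold * peak load, else None."""
    peak_index = max(range(len(loads_kn)), key=loads_kn.__getitem__)
    limit = threshold * loads_kn[peak_index]
    for index in range(peak_index + 1, len(loads_kn)):
        if loads_kn[index] < limit:
            return index
    return None


# Hypothetical load readings in kN, for illustration only.
load_history_kn = [0.0, 800.0, 1600.0, 2400.0, 2500.0, 2300.0, 2100.0, 1950.0, 1800.0]
stop = termination_index(load_history_kn)
print(f"peak = {max(load_history_kn):.0f} kN, stop at reading {stop}")
```

Searching only after the peak matters, because readings on the ascending branch are also below 80% of the peak and must not trigger termination.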

6.1.6. Ultimate Bearing Capacity of Test Specimen

The failure modes of all test specimens are shown in Figure 22. Axial compression specimens (A1, B1, and C1) exhibit similar behavior, demonstrating good ductility and post-peak load-bearing capacity. During the elastic phase, no notable changes occur. At approximately 80% of peak load, cross-diagonal cracks form on the outer steel tube wall. These cracks propagate, with rust layers peeling off, indicating yielding. As the load increases, buckling occurs at vulnerable locations such as edges, weld ends, or openings with stress concentration, accompanied by a sharp increase in circumferential strain and intensified confinement. Upon failure, significant bulging appears at the ends and midsection of the specimen, with diagonal shear slip.
All eccentric compression specimens exhibited slow increases in strain and deflection during initial loading, with no significant deformation. As vertical load approached the ultimate capacity, strain and deflection increased rapidly, with midpoint displacement exceeding that at the quarter points, indicating pronounced mid-span flexural deformation. During the descending or plateau phase, steel tubes in the top and bottom compression zones buckled, showing evident bulging. Maximum lateral displacement occurred at the specimen midsection. The transverse deformation profile resembled a half-wave sine curve, with a concave center and outward bulging at the ends. Upon failure, steel tube rupture and concrete cracking were audible, and most strain gauge readings overflowed, terminating the test. The test results of the ultimate bearing capacity of each component are shown in Table 9.

6.2. Model Validation

This section analyzes and verifies the predictive effectiveness of the RF model for ultimate bearing capacity using CFST component loading test data. As shown in Table 10 and Figure 23, the curve of ultimate bearing capacity predicted by the Random Forest model closely aligns with experimental values, indicating minimal deviation and high accuracy. The RF model effectively captures the mechanical behavior of CFST components under various conditions, providing accurate and reliable predictions.
Using the Random Forest model, predicted values of the ultimate bearing capacity were obtained for each component. The mean (AVG) and coefficient of variation (δ) of the ratio between predicted (N) and experimental (Nt) values were calculated, as shown in Figure 24 and Table 10. The data points are distributed symmetrically around the isoline, with most coinciding with it, indicating minimal error between predicted and experimental values. The maximum deviation is within 5%, representing a 10% improvement in prediction accuracy over previous models. It should be noted that the model was primarily trained for axial compression loads, and its accuracy in eccentric loading scenarios may be limited and requires further validation. Within that scope, the close match between predicted and experimental capacities confirms the effectiveness of the model and of its training. Thus, the Random Forest model is both feasible and efficient for predicting the ultimate bearing capacity of the CFST components considered in this paper.
For the nine validation specimens, the mean predicted-to-experimental ratio of the RF model is 0.992, which is 0.3 percentage points closer to 1 than the test-set mean of 0.989. The validation set has a standard deviation of 0.017 and a coefficient of variation of 0.017, compared with 0.093 and 0.094 for the test set, representing reductions of 81.7% and 81.9%, respectively. These results confirm the model’s prediction stability, with the reduced scatter indicating excellent performance.
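The quoted reductions can be checked from the reported statistics alone. The short sketch below reproduces the arithmetic; the dictionary and function names are illustrative, and the numbers are those stated in the text.

```python
def relative_reduction(reference, value):
    """Fractional reduction of value relative to reference."""
    return (reference - value) / reference


# Values reported in the text for the test set and the nine validation specimens.
test_set = {"mean": 0.989, "sigma": 0.093, "cov": 0.094}
validation = {"mean": 0.992, "sigma": 0.017, "cov": 0.017}

sigma_reduction = relative_reduction(test_set["sigma"], validation["sigma"])
cov_reduction = relative_reduction(test_set["cov"], validation["cov"])
# How much closer to 1 the mean ratio is for the validation specimens.
bias_change = abs(1.0 - test_set["mean"]) - abs(1.0 - validation["mean"])

print(f"sigma reduction = {sigma_reduction:.1%}")
print(f"COV reduction = {cov_reduction:.1%}")
print(f"mean ratio moves {bias_change * 100:.1f} percentage points closer to 1")
```

Note that the validation statistics come from only nine specimens, so the smaller scatter reflects a narrower parameter range as well as model quality.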

6.3. Uncertainty Quantification and Sensitivity Analysis

In this study, prediction uncertainty and sensitivity are quantified through the statistical distribution of the ratio of predicted to experimental values. As shown in Table 3 and Figure 15, the AVG is 1.002 for the training set and 0.989 for the test set, both close to 1, indicating that the overall prediction bias is very small. The σ and COV of the training set are 0.058 and 0.058, respectively, while those of the test set are 0.093 and 0.094, reflecting the low dispersion of the prediction results. In the scatter plots of Figure 13b and Figure 14b, the data points cluster closely around the isoline, and the maximum deviation for both training and testing is within 15%, which further confirms the robustness of the model. The comparison with national design specifications and other machine learning algorithms (Table 4 and Table 5) shows that the COV of the Random Forest model (0.094) is significantly lower than that of the current specifications (0.178–0.223) and of the other methods, giving the lowest predictive variability and the best stability. In addition, in the independent verification on nine CFST columns, the mean predicted-to-experimental ratio is 0.992, σ and COV are only 0.017, and the maximum error is within 5%. This indicates that the model is highly reliable for estimating actual bearing capacity and that its prediction uncertainty is low and within the range acceptable for engineering practice.

7. Conclusions

Drawing on existing experimental data for concrete-filled steel tubes, a prediction model for their ultimate bearing capacity was established using the RF algorithm. Experimental verification was conducted to assess the accuracy of the model, and the prediction results were evaluated using the MSE index. The following conclusions were drawn:
(1) The steel tube section moment of inertia (Is) contributes the most to the bearing capacity, accounting for 35.56%; other significant factors include steel tube cross-sectional area (As, 17.25%), component diameter (D, 8.74%), concrete section moment of inertia (Ic, 8.33%), column length (L, 5.01%), concrete area (Ac, 4.78%), and steel tube wall thickness (t, 3.31%), highlighting their critical importance in design considerations.
(2) Compared with other algorithm models, the average accuracy of the improved RF model is increased by 42%. The average prediction and experimental ratios of the training set and the test set are 1.002 and 0.989, respectively. The maximum deviation is within 15%, and the mean square error is 0.0081. The improved RF model is superior to other algorithms in prediction accuracy, robustness and data fitting. Compared with the design specification, the overall error is about 1%, which proves its superior reliability in accurately and consistently predicting the ultimate bearing capacity of concrete-filled steel tubular members.
(3) Specialized experiments further validated the accuracy and robustness of the RF model. For nine CFST column specimens varying in concrete strength, column length, slenderness ratio, and eccentricity, the maximum deviation between measured ultimate bearing capacity and RF predictions was below 5%. Moreover, predicted and experimental curves showed excellent agreement, confirming that the RF model can accurately predict the bearing capacity of CFST members under both axial and eccentric compression across a wide range of parameters.
Looking ahead, translating the proposed RF model into a practical tool for structural engineers is a promising avenue that we plan to pursue in subsequent work. Given the model’s high accuracy and computational efficiency, it could be deployed as a cloud-based API for integration into Building Information Modeling (BIM) software such as Revit 2025, so that real-time predictions are available directly within design workflows. Interactive web applications could also make this data-driven tool accessible without requiring machine learning expertise. Beyond CFST structures, the proposed methodology, which integrates comprehensive feature engineering, ensemble learning, and mechanics-interpretable feature importance, could be extended to other concrete structures characterized by nonlinear behavior and complex interactions. For instance, in concrete gravity dams and high-rise structures, where complex coupling effects critically affect structural integrity, this interpretable RF framework could be used not only to predict structural responses but also to identify dominant influencing factors, thereby advancing intelligent structural design.
The current Random Forest model gives deterministic predictions and does not quantify the uncertainty of the prediction itself. Structural reliability analysis and probability-based design require a confidence interval or distribution information for the predicted capacity, so uncertainty quantification can be added to the existing model in subsequent work. For example, the bootstrap mechanism of Random Forest or a quantile regression forest can be used to output a confidence interval or probability distribution for the predicted value directly. Such predictions could also be linked with structural reliability calculation methods to estimate the failure probability or reliability index for bearing capacity, so as to better serve probability-based assessment and design.
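One simple way to attach an interval to a Random Forest prediction is to look at the spread of the individual tree outputs. The sketch below computes an empirical percentile interval from a list of per-tree predictions. The numbers are invented placeholders, the function names are hypothetical, and the spread across trees is only a rough proxy for predictive uncertainty rather than a calibrated confidence interval or a full quantile regression forest.

```python
def percentile(sorted_values, fraction):
    """Linearly interpolated percentile of an ascending list (fraction in [0, 1])."""
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def tree_spread_interval(tree_predictions, coverage=0.90):
    """Return (mean, lower, upper) from the empirical spread of per-tree predictions."""
    ordered = sorted(tree_predictions)
    tail = (1.0 - coverage) / 2.0
    mean_prediction = sum(ordered) / len(ordered)
    return mean_prediction, percentile(ordered, tail), percentile(ordered, 1.0 - tail)


# Hypothetical per-tree capacity predictions (kN) for one specimen.
per_tree_kn = [2310.0, 2350.0, 2365.0, 2380.0, 2390.0, 2400.0,
               2415.0, 2430.0, 2450.0, 2490.0, 2520.0]
mean_kn, low_kn, high_kn = tree_spread_interval(per_tree_kn)
print(f"prediction = {mean_kn:.0f} kN, 90% tree-spread interval = [{low_kn:.0f}, {high_kn:.0f}] kN")
```

In a trained forest, the per-tree values would come from evaluating each fitted tree on the same input, and the resulting interval could then feed a reliability calculation once its coverage has been checked against test data.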

Author Contributions

Conceptualization, W.S. and Y.C.; methodology, L.W.; software, L.W. and Y.C.; validation, W.S. and G.Z.; investigation, L.Z. and G.Z.; resources, L.W.; writing—original draft preparation, W.S. and Y.C.; writing—review and editing, L.Z., F.L. and K.X.; visualization, F.L.; project administration, W.S. and K.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Major Project of Guangxi, grant number AA23062045.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Weidi Su, Yaofei Cheng and Guangda Zhong were employed by the company Guangxi Pinglu Canal Construction Co., Ltd. Authors Li Wei and Linxiao Zhou were employed by the company Guangxi Road Construction Engineering Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Al-Attraqchi, A.Y.; Hashemi, M.J.; Al-Mahaidi, R. Hybrid Simulation of Bridges Constructed with Concrete-Filled Steel Tube Columns Subjected to Horizontal and Vertical Ground Motions. Bull. Earthq. Eng. 2020, 18, 4453–4480.
  2. Han, L.-H.; Li, W.; Bjorhovde, R. Developments and Advanced Applications of Concrete-Filled Steel Tubular (CFST) Structures: Members. J. Constr. Steel Res. 2014, 100, 211–228.
  3. Han, L.-H.; Yang, Y.-F.; Yang, H.; Li, W.; Wang, X.; Zhang, J. Life-cycle Based Analytical Theory of Concrete-Filled Steel Tubular Structures and Its Applications. Chin. Sci. Bull. 2020, 65, 3173–3184. (In Chinese)
  4. Thai, H.-T. Beam-to-CFST Column Joints in Steel-Concrete Composite Buildings: A Comprehensive Review. Structures 2024, 68, 105737.
  5. Han, Y.; Zhao, Y.; Luo, X.; Zheng, J.-L.; Qin, D.-Y.; Feng, Z.; Luo, Y.-F. Key Construction Technologies for 600 M Rigid Arch. In Proceedings of ARCH 2023, Fuzhou, China, 25–28 October 2023; Springer: Cham, Switzerland, 2025; pp. 27–35.
  6. Han, L.H. Concrete Filled Steel Tubular Structures—Theory and Practice, 3rd ed.; Science Press: Beijing, China, 2015. (In Chinese)
  7. Ma, H.; Jia, C.; Xi, J.; Dong, J.; Zhang, X.; Zhao, Y. Cyclic Loading Test and Nonlinear Analysis on Composite Frame Consisting of Steel Reinforced Recycled Concrete Columns and Steel Beams. Eng. Struct. 2021, 241, 112480.
  8. Zhang, N.; Gu, Q.; Wu, Y.; Xue, X. Refined Peridynamic Modeling of Bond-Slip Behaviors Between Ribbed Steel Rebar and Concrete in Pull-Out Tests. J. Struct. Eng. 2022, 148, 04022197.
  9. Chen, S.; Hou, C.; Zhang, H.; Han, L.-H.; Mu, T.-M. Reliability-based Evaluation for Concrete-Filled Steel Tubular (CFST) Truss under Flexural Loading. J. Constr. Steel Res. 2020, 169, 106018.
  10. Ke, X.; Ye, S.; Qin, Y.; Hu, H.-S.; Wang, B. Shear Capacity Analysis for T-shaped Steel Plate Reinforced Concrete Composite Shear Wall Based on Deformation Coordination Relationship. J. Build. Eng. 2025, 118, 115111.
  11. Wang, J.; Yang, Z.; Zheng, X.; Ding, Y. Axial Compression Behavior of Square Section Concrete-Filled Steel Tubes Reinforced with Internal Latticed Steel Angles. J. Constr. Steel Res. 2023, 213, 108414.
  12. Lin, S.; Zhao, Y.; Lu, Z.; Yan, X.-F. Unified Theoretical Model for Axially Loaded Concrete-Filled Steel Tube Stub Columns with Different Cross-Sectional Shapes. J. Struct. Eng. 2021, 147, 04021159.
  13. Lima, A.S.; Faria, A.R. A Unified Formulation for Composite Quasi-2D Finite Elements Based on Global-Local Superposition. Compos. Struct. 2020, 254, 112846.
  14. Qin, P.; Li, X.; Yi, W. Experimental Research on Load Separation of Concrete-Filled Circular Steel Tube Short Columns. J. Constr. Steel Res. 2024, 212, 108297.
  15. Le, K.B.; Cao, V.V.; Cao, H.X. Circular Concrete Filled Thin-Walled Steel Tubes under Pure Torsion: Experiments. Thin-Walled Struct. 2021, 164, 107874.
  16. Akram-Ali-Hammouri, Z.; Fernandez-Delgado, M.; Cernadas, E.; Barro, S. Fast Support Vector Classification for Large-Scale Problems. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 6184–6195.
  17. Tharwat, A. Parameter Investigation of Support Vector Machine Classifier with Kernel Functions. Knowl. Inf. Syst. 2019, 61, 1269–1302.
  18. Liu, H.; Chen, M.; Er, S.; Liao, W.; Zhang, T.; Zhao, T. Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint. In Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022.
  19. Dev, V.A.; Eden, M.R. Formation Lithology Classification Using Scalable Gradient Boosted Decision Trees. Comput. Chem. Eng. 2019, 128, 392–404.
  20. Dong, M.; Yao, L.; Wang, X.; Benatallah, B.; Zhang, S.; Sheng, Q.Z. Gradient Boosted Neural Decision Forest. IEEE Trans. Serv. Comput. 2023, 16, 330–342.
  21. Chencho, Y.; Li, J.; Hao, H.; Wang, R.; Li, L. Development and Application of Random Forest Technique for Element Level Structural Damage Quantification. Struct. Control Health Monit. 2020, 28, e2678.
  22. Chen, C.; Tanaka, K.; Kotera, M.; Funatsu, K. Comparison and Improvement of the Predictability and Interpretability with Ensemble Learning Models in QSPR Applications. J. Cheminformatics 2020, 12, 1–16.
  23. Rahmati, O.; Falah, F.; Naghibi, S.A.; Biggs, T.; Soltani, M.; Deo, R.C.; Cerdà, A.; Mohammadi, F.; Bui, D.T. Land Subsidence Modelling Using Tree-Based Machine Learning Algorithms. Sci. Total Environ. 2019, 672, 239–252.
  24. Wang, G.; Zhao, B.; Wu, B.; Zhang, C.; Liu, W. Intelligent Prediction of Slope Stability Based on Visual Exploratory Data Analysis of 77 in Situ Cases. Int. J. Min. Sci. Technol. 2023, 33, 47–59.
  25. Alipour, M.; Harris, D.K.; Barnes, L.E.; Ozbulut, O.E.; Carroll, J. Load-Capacity Rating of Bridge Populations Through Machine Learning: Application of Decision Trees and Random Forests. J. Bridge Eng. 2017, 22, 1–17.
  26. Gupta, P.; Gupta, N.; Saxena, K.K.; Goyal, S. Random Forest Modeling for Fly Ash-Calcined Clay Geopolymer Composite Strength Detection. J. Compos. Sci. 2021, 5, 271.
  27. Liew, J.Y.R.; Dai, Z.; Chua, Y.S. Steel Concrete Composite Systems for Modular Construction of High-Rise Buildings. Structures 2019, 21, 135–149.
  28. Kondratieva, T.N.; Chepurnenko, A.S.; Yazyev, B.M. Predicting the Strength of Eccentrically Compressed Short Circular Concrete Filled Steel Tube Columns. Struct. Mech. Eng. Constr. Build. 2025, 21, 231–241.
  29. Khan, S.; Khan, M.A.; Zafar, A.; Javed, M.F.; Aslam, F.; Musarat, M.A.; Vatin, N.I. Predicting the Ultimate Axial Capacity of Uniaxially Loaded CFST Columns Using Multiphysics Artificial Intelligence. Materials 2022, 15, 1–22.
  30. Hou, C.; Zhou, X.G.; Shen, L. Intelligent Prediction Methods for N–M Interaction of CFST under Eccentric Compression. Arch. Civ. Mech. Eng. 2023, 23, 1–30.
  31. Chun, P.J.; Ujike, I.; Mishima, K.; Okazaki, S. Random Forest-Based Evaluation Technique for Internal Damage in Reinforced Concrete Featuring Multiple Nondestructive Testing Results. Constr. Build. Mater. 2020, 253, 119238.
  32. Zhang, J.; Ma, G.; Huang, Y.; Sun, J.; Aslani, F.; Nener, B. Modelling Uniaxial Compressive Strength of Lightweight Self-Compacting Concrete Using Random Forest Regression. Constr. Build. Mater. 2019, 210, 713–719.
  33. Zhong, S.; He, R. Axial Compressive Capacity of Concrete-Filled Steel Tubular Slender Columns. J. Build. Struct. 1993, 14, 12–19.145. (In Chinese)
  34. Gu, W.; Cai, S.; Feng, W. Behavior and Load-Carrying Capacity of High-Strength Concrete-Filled Steel Tubular Columns under Eccentric Compression. Eng. Mech. 1998, 15, 45–52.147. (In Chinese)
  35. Tan, K.; Pu, X. Mechanical Behavior of High-Strength Concrete-Filled Double Skin Steel Tubular Short Columns under Axial Compression. Ind. Constr. 2002, 32, 8–13.149. (In Chinese)
  36. Zhao, J.; Gu, Q.; Ma, S. Study on Axial Compressive Capacity of Concrete-Filled Steel Tubes Based on Unified Twin-Shear Strength Theory. Eng. Mech. 2004, 21, 78–84.151. (In Chinese)
  37. Yu, Z.; Ding, F.; Lin, S. Mechanical Behavior of Concrete-Filled Steel Tube Short Columns with High-Performance Concrete. J. Harbin Inst. Technol. 2006, 38, 1453–1458.153. (In Chinese)
  38. Xiao, C.; Cai, S.; Xu, C. Experimental Study on Shear Behavior of Concrete-Filled Steel Tubes. J. Build. Struct. 1995, 16, 34–40.155. (In Chinese)
  39. Wang, L.; Qian, J. Experimental Study on Axial Compressive Capacity of High-Strength Concrete-Filled Steel Tubular Columns. J. Tsinghua Univ. (Sci. Technol.) 2003, 43, 1356–1360.156. (In Chinese)
  40. Cao, B.; Zhang, Y.; Xu, H.; Yu, H. Ultimate Bearing Capacity of Thin-Walled Concrete-Filled Steel Tubular Slender Columns. J. Build. Struct. 2005, 26, 67–73.158. (In Chinese)
  41. Yao, G. Study on the Mechanical Mechanism of Concrete-Filled Steel Tubular Members Under Complex Stress States. Ph.D. Thesis, Harbin Institute of Technology, Harbin, China, 2001; p. 160. (In Chinese)
  42. Ding, F. Structural Behavior and Design Method of Circular Concrete-Filled Steel Tubes. Ph.D. Thesis, Harbin Institute of Technology, Harbin, China, 1998. (In Chinese)
  43. Han, L.; Mou, T.; Wang, F.; Fan, B.; Li, W.; Liang, J.; Hou, C.; Ma, D.; Chen, J.; Li, C. Design Principles of Concrete-Filled Steel Tubular Hybrid Structures and Their Applications in Bridge Engineering. China Civ. Eng. J. 2010, 43, 1–10. (In Chinese)
  44. Zhao, X.; Wu, Y.; Lee, D.L.; Cui, W. Iforest: Interpreting Random Forests Via Visual Analytics. IEEE Trans. Vis. Comput. Graph. 2018, 25, 407–416. (In Chinese)
  45. Aler, R.; Valls, J.M.; Bostrom, H. Study of Hellinger Distance as a Splitting Metric for Random Forests in Balanced and Imbalanced Classification Datasets. Expert Syst. Appl. 2020, 149, 113264.
  46. Abou-Moustafa, K.T.; Szepesvári, C. An Exponential Tail Bound for Lq Stable Learning Rules. Application to K-Folds Cross-Validation. In Proceedings of the International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA, 7–10 January 2018.
  47. GB 50936-2014; Technical Code for Concrete Filled Steel Tubular Structures. China Architecture & Building Press: Beijing, China, 2014. (In Chinese)
  48. DBJ/T13-51-2010; Technical Specification for Concrete Filled Steel Tubular Structures. Fujian Provincial Engineering Construction Standard: Fuzhou, China, 2010. (In Chinese)
  49. ANSI/AISC 360-10; Specification for Structural Steel Buildings. American Institute of Steel Construction: Chicago, IL, USA, 2010.
  50. EN 1994-1-1:2004; Eurocode 4—Design of Composite Steel and Concrete Structures—Part 1-1: General Rules and Rules for Buildings. European Committee for Standardization: Brussels, Belgium, 2004.
  51. AIJ-SRC (1997); Standard for Structural Calculation of Steel Reinforced Concrete Structures. Architectural Institute of Japan: Tokyo, Japan, 1997.
  52. BS 5400; Steel, Concrete and Composite Bridges. British Standards Institution: London, UK, 1978–2006.
  53. GB/T 228.1-2010; Metallic Materials—Tensile Testing—Part 1: Method of Test at Room Temperature. China Standards Press: Beijing, China, 2010. (In Chinese)
  54. GB/T50081-2002; Standard for Test Method of Mechanical Properties of Ordinary Concrete. China Architecture & Building Press: Beijing, China, 2002. (In Chinese)
  55. Rasa, A.Y. A Robust Infinite Lagrangian Fluid Element for Seismic Analysis of Dams. Int. J. Struct. Stab. Dyn. 2026.
  56. Yi, T.-H.; Li, H.-N.; Gu, M. Recent Research and Applications of GPS-Based Monitoring Technology for High-Rise Structures. Struct. Control Health Monit. 2013, 20, 649–670.
Figure 1. Application of CFST in practical engineering.
Figure 2. Application of CFDST structures in power transmission engineering.
Figure 3. Boxplot of the standardized data.
Figure 4. Violin plot of the standardized data.
Figure 5. The complete 19th decision tree.
Figure 6. Relationship between the number of decision trees and error.
Figure 7. Relationship between max depth and error.
Figure 8. Relationship between min samples leaf and error.
Figure 9. Relationship between min samples split and error.
Figure 10. Relationship between max features and error.
Figure 11. Relationship between training set split ratio and error.
Figure 12. Feature importance coefficients of the model.
Figure 13. Training set prediction of the Random Forest model.
Figure 14. Test set prediction of the Random Forest model.
Figure 15. Distribution of the ratio of predicted to tested ultimate bearing capacity of the members.
Figure 16. Comparison of calculation results of different standards for CFST members.
Figure 17. Comparison of the MSE of different algorithm models for CFST components.
Figure 18. Prediction performance of each model on randomly selected samples.
Figure 19. Comparison of prediction and experimental results of other algorithms.
Figure 20. Dimension drawings of the specimens (unit: mm).
Figure 21. Test operation condition.
Figure 22. Failure mode of all specimens.
Figure 23. Experimental verification of the RF model predictions for CFST members.
Figure 24. Comparison of the ultimate bearing capacities predicted by five models with the measured values.
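Table 1. Data descriptive statistics table.
| Feature Names | Count | Mean | Std | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|---|
| D | 154 | 140.97 | 42.93 | 82.60 | 112.56 | 129.00 | 165.00 | 320.00 |
| t | 154 | 3.68 | 2.11 | 1.00 | 2.89 | 3.00 | 4.65 | 12.00 |
| L | 154 | 411.96 | 104.39 | 200.00 | 340.00 | 406.50 | 500.00 | 660.00 |
| fy | 154 | 339.35 | 57.22 | 22.70 | 303.50 | 338.90 | 360.00 | 492.80 |
| fcu | 154 | 66.81 | 26.23 | 9.60 | 47.82 | 58.50 | 84.40 | 139.30 |
| fc | 154 | 57.60 | 25.12 | 7.70 | 38.23 | 48.50 | 74.40 | 129.30 |
| fck | 154 | 48.11 | 21.61 | 4.93 | 32.06 | 40.57 | 62.22 | 111.68 |
| λ | 154 | 11.93 | 1.80 | 3.30 | 12 | 12.10 | 12.60 | 14.00 |
| Ac (×10³) | 154 | 15.40 | 11.80 | 5.00 | 8.96 | 11.94 | 19.11 | 73.54 |
| As | 154 | 1640.84 | 1225.38 | 279.60 | 995.72 | 1334.06 | 1894.41 | 6883.23 |
| Ic (×10⁶) | 154 | 29.90 | 68.79 | 19.91 | 63.82 | 11.35 | 29.07 | 43.04 |
| Is (×10³) | 154 | 5961.86 | 13,303.99 | 276.88 | 1498.04 | 2643.88 | 5273.98 | 84,335.04 |
| Ec (×10³) | 154 | 37.64 | 5.10 | 23.77 | 34.10 | 36.45 | 41.33 | 49.01 |
| Es (×10³) | 154 | 205.76 | 1.16 | 200.00 | 206.00 | 206.00 | 206.00 | 206.00 |
| ξ | 154 | 1.54 | 4.15 | 0.09 | 0.47 | 0.83 | 1.27 | 33.21 |
| B | 154 | 1.23 | 0.04 | 1.14 | 1.20 | 1.23 | 1.24 | 1.34 |
| C | 154 | −0.22 | 0.11 | −0.55 | −0.29 | −0.18 | −0.14 | 0.01 |
| fscg | 154 | 101.35 | 38.22 | 29.11 | 71.21 | 90.12 | 125.86 | 230.30 |
| fscd | 154 | 97.81 | 39.49 | 27.15 | 66.91 | 87.03 | 122.57 | 211.70 |
| C1 | 154 | 7.54 | 0.29 | 7.21 | 7.44 | 7.51 | 7.53 | 8.96 |
| C2 | 154 | 0.79 | 0 | 0.77 | 0.79 | 0.79 | 0.79 | 0.79 |
| fcc | 154 | 137.66 | 51.88 | 42.46 | 101.27 | 119.45 | 171.09 | 280.51 |
| fyr | 154 | 267.71 | 45.62 | 175.79 | 239.40 | 267.32 | 284.04 | 388.73 |
| fym | 154 | 909.28 | 483.55 | 305.94 | 634.65 | 806.53 | 971.06 | 2899.32 |
| Nt | 154 | 1711.79 | 1160.66 | 341.90 | 785.75 | 1495.50 | 2327.50 | 7914.00 |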
Note: D represents the diameter of the concrete-filled steel tube; t denotes the wall thickness of the component steel tube; L signifies the length of the component column; fy stands for the yield strength of steel; fcu indicates the compressive strength of concrete cubes; fc signifies the compressive strength of concrete cylinders; fck represents the standard value of axial compressive strength of concrete; λ is the slenderness ratio of the component; Ac is the area of concrete; As is the area of steel tube; Ic is the moment of inertia of the concrete section; Is is the moment of inertia of the steel tube section; Ec is the elastic modulus of concrete; Es is the elastic modulus of steel; ξ is the confinement coefficient of the component; B and C are the influence coefficients of section shape; fscg is the combined compressive strength considering the influence coefficient of section shape; fscd is the combined compressive strength; C1 and C2 are calculation parameters related to the aspect ratio; fcc is the enhanced characteristic strength of concrete under axial load; fyr is the yield strength of steel considering the influence of C2; fym is the yield strength of steel considering the influence coefficient of concrete inside the tube; Nt is the ultimate bearing capacity of the component; the reference values for C1 and C2 related to the aspect ratio are as follows: when L/D equals 0, C1 is taken as 9.47 and C2 is taken as 0.76; when L/D equals 5, C1 is taken as 6.40 and C2 is taken as 0.80; when L/D equals 10, C1 is taken as 3.81 and C2 is taken as 0.85; when L/D equals 15, C1 is taken as 1.80 and C2 is taken as 0.9; when L/D equals 20, C1 is taken as 0.48 and C2 is taken as 0.95; and when L/D equals 25, C1 is taken as 0 and C2 is taken as 1.0.
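The note above tabulates C1 and C2 only at L/D = 0, 5, 10, 15, 20 and 25. As a minimal sketch of how values at intermediate aspect ratios might be obtained, the following assumes linear interpolation between the tabulated points and clamping outside the range; the note does not state the interpolation rule, and the helper names `interpolate` and `c1_c2` are hypothetical.

```python
from bisect import bisect_right

# Reference values taken from the note to Table 1.
L_OVER_D = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
C1_REF = [9.47, 6.40, 3.81, 1.80, 0.48, 0.0]
C2_REF = [0.76, 0.80, 0.85, 0.90, 0.95, 1.0]


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the end values (assumption)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    weight = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + weight * (ys[i] - ys[i - 1])


def c1_c2(length_mm, diameter_mm):
    """Return (C1, C2) for a column with aspect ratio L/D."""
    ratio = length_mm / diameter_mm
    return interpolate(ratio, L_OVER_D, C1_REF), interpolate(ratio, L_OVER_D, C2_REF)


c1, c2 = c1_c2(length_mm=250.0, diameter_mm=100.0)  # L/D = 2.5
print(f"C1 = {c1:.3f}, C2 = {c2:.3f}")
```

At a tabulated aspect ratio the function returns the reference value unchanged, so the sketch never contradicts the note; only the behaviour between and beyond the tabulated points is assumed.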
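Table 2. Test data of axial compression.
| Specimen Number | Predicted Value (kN) | Experimental Value (kN) | Specimen Number | Predicted Value (kN) | Experimental Value (kN) |
|---|---|---|---|---|---|
| 1 | 1040 | 947 | 17 | 1140 | 1108 |
| 2 | 3370 | 2898 | 18 | 381.4 | 399 |
| 3 | 1930.6 | 2176 | 19 | 635 | 728 |
| 4 | 745 | 714 | 20 | 2354 | 2777 |
| 5 | 1773.8 | 1534 | 21 | 2746 | 2832 |
| 6 | 686 | 695 | 22 | 747 | 778 |
| 7 | 1058 | 1015 | 23 | 5904 | 5927 |
| 8 | 1500 | 1637 | 24 | 735 | 738 |
| 9 | 1024 | 944 | 25 | 730 | 700 |
| 10 | 1440 | 1540 | 26 | 3150 | 3444 |
| 11 | 2273 | 2171 | 27 | 1090 | 1108 |
| 12 | 3364 | 2898 | 28 | 770 | 775 |
| 13 | 680 | 746 | 29 | 500 | 624 |
| 14 | 346.5 | 348 | 30 | 3099.3 | 2857 |
| 15 | 2077.6 | 2176 | 31 | 757 | 760 |
| 16 | 1117 | 1015 | | | |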
Table 3. Comparison of different codes.
| Statistics | Training Set | Test Set |
|---|---|---|
| AVG | 1.002 | 0.989 |
| σ | 0.058 | 0.093 |
| COV | 0.058 | 0.094 |
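The statistics in Table 3 summarize the predicted-to-experimental capacity ratio, and the tabulated values are consistent with COV = σ/AVG (0.058/1.002 ≈ 0.058 and 0.093/0.989 ≈ 0.094). A minimal sketch of that calculation follows, using the first five specimens of Table 2 purely as illustrative input. The use of the sample standard deviation (n − 1) is an assumption, since the table does not say which estimator was used, and `ratio_statistics` is a hypothetical helper name.

```python
from statistics import mean, stdev

# (predicted, experimental) capacities in kN: specimens 1 to 5 of Table 2.
pairs = [
    (1040.0, 947.0),
    (3370.0, 2898.0),
    (1930.6, 2176.0),
    (745.0, 714.0),
    (1773.8, 1534.0),
]


def ratio_statistics(pairs):
    """Return (AVG, sigma, COV) of the predicted-to-experimental ratio."""
    ratios = [predicted / experimental for predicted, experimental in pairs]
    avg = mean(ratios)
    sigma = stdev(ratios)  # sample standard deviation; an assumption
    cov = sigma / avg      # coefficient of variation
    return avg, sigma, cov


avg, sigma, cov = ratio_statistics(pairs)
print(f"AVG = {avg:.3f}, sigma = {sigma:.3f}, COV = {cov:.3f}")
```

Because only five specimens are used here, the printed values are not expected to match Table 3, which covers the full training and test sets.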
Table 4. Results of different codes.
| Statistics | GB 50936-2014 | DBJ/T13-51-2010 | AISC | EC4 | AIJ | BS5400 | This Paper's RF Model |
|---|---|---|---|---|---|---|---|
| AVG | 0.962 | 0.955 | 0.844 | 1.017 | 0.887 | 0.805 | 0.989 |
| COV | 0.186 | 0.223 | 0.179 | 0.182 | 0.178 | 0.199 | 0.094 |
Table 5. Statistical table of regression model test set indicators.
| Model Name | Decision Tree | Random Forest | Ridge Regression | KNN | AdaBoost | SVM | BP Neural Network |
|---|---|---|---|---|---|---|---|
| MSE | 0.01110 | 0.00810 | 0.02571 | 0.02571 | 0.01447 | 0.02880 | 0.07264 |
Table 6. Mechanical properties of the steel tube.
| Specimen Number | Specimen Thickness (mm) | Average Yield Strength fy (N/mm²) | Average Yield Strain (εAVG) | Ultimate Strength fu (N/mm²) | Elastic Modulus Es (N/mm²) |
|---|---|---|---|---|---|
| A1–A3 | 3.8 | 325 | 1578 | 450 | 2.19 × 10⁵ |
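Table 7. Mechanical properties of the concrete.
| Concrete Type | Specimen Size (mm) | Specimen Number | Compressive Strength (MPa) | Average 7-Day Strength (MPa) | Average 28-Day Strength (MPa) |
|---|---|---|---|---|---|
| C30 | 150 × 150 × 150 | C30-07-01 | 30.67 | 29.2 | - |
| C30 | 150 × 150 × 150 | C30-07-02 | 28.00 | 29.2 | - |
| C30 | 150 × 150 × 150 | C30-07-03 | 28.89 | 29.2 | - |
| C30 | 150 × 150 × 150 | C30-28-01 | 36.80 | - | 36.8 |
| C30 | 150 × 150 × 150 | C30-28-02 | 35.51 | - | 36.8 |
| C30 | 150 × 150 × 150 | C30-28-03 | 38.09 | - | 36.8 |
| C50 | 150 × 150 × 150 | C50-07-01 | 46.67 | 44.7 | - |
| C50 | 150 × 150 × 150 | C50-07-02 | 42.67 | 44.7 | - |
| C50 | 150 × 150 × 150 | C50-07-03 | 44.89 | 44.7 | - |
| C50 | 150 × 150 × 150 | C50-28-01 | 54.65 | - | 58.8 |
| C50 | 150 × 150 × 150 | C50-28-02 | 59.32 | - | 58.8 |
| C50 | 150 × 150 × 150 | C50-28-03 | 62.43 | - | 58.8 |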
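Table 8. Number and main parameters of specimens.
| Specimen Number | Steel Tube Dimensions D × t × L (mm) | Concrete Strength | Confinement Coefficient ξ | Slenderness Ratio λ | Eccentricity e (mm) | Eccentricity Ratio e/r0 |
|---|---|---|---|---|---|---|
| A1 | 273 × 6 × 1200 | C50 | 0.957 | 17.582 | 0 | 0 |
| A2 | 273 × 6 × 1200 | C50 | 0.957 | 17.582 | 40 | 0.3065 |
| A3 | 273 × 6 × 1200 | C50 | 0.957 | 17.582 | 80 | 0.6130 |
| B1 | 273 × 6 × 850 | C50 | 0.957 | 12.454 | 0 | 0 |
| B2 | 273 × 6 × 850 | C50 | 0.957 | 12.454 | 40 | 0.3065 |
| B3 | 273 × 6 × 850 | C50 | 0.957 | 12.454 | 80 | 0.6130 |
| C1 | 273 × 6 × 850 | C30 | 1.546 | 12.454 | 0 | 0 |
| C2 | 273 × 6 × 850 | C30 | 1.546 | 12.454 | 40 | 0.3065 |
| C3 | 273 × 6 × 850 | C30 | 1.546 | 12.454 | 80 | 0.6130 |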
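The slenderness ratios and eccentricity ratios in Table 8 are numerically consistent with λ = 4L/D and e/r0 with r0 = D/2 − t. Neither expression is stated in the table itself, so both are assumptions here, and the function names are hypothetical. A minimal sketch that recomputes the tabulated values under those assumptions:

```python
def slenderness(length_mm, diameter_mm):
    """Slenderness ratio, assuming lambda = 4 L / D for a circular section."""
    return 4.0 * length_mm / diameter_mm


def eccentricity_ratio(e_mm, diameter_mm, thickness_mm):
    """Eccentricity ratio e / r0, assuming r0 is the inner radius D/2 - t."""
    r0 = diameter_mm / 2.0 - thickness_mm
    return e_mm / r0


# Geometry of the test specimens: D = 273 mm, t = 6 mm, L = 1200 or 850 mm.
lambda_a = slenderness(1200.0, 273.0)
lambda_b = slenderness(850.0, 273.0)
ratio_40 = eccentricity_ratio(40.0, 273.0, 6.0)
ratio_80 = eccentricity_ratio(80.0, 273.0, 6.0)
print(round(lambda_a, 3), round(lambda_b, 3), round(ratio_40, 4), round(ratio_80, 4))
```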
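Table 9. Test results of the specimens.
| Specimen Number | Concrete Strength | Column Length (mm)/Slenderness Ratio | Eccentricity (mm) | Ultimate Bearing Capacity (kN) | Displacement at Ultimate Load (mm) | Moment at Ultimate Load (kN·m) |
|---|---|---|---|---|---|---|
| A1 | C50 | 1200/17.6 | 0 | 5020 | 10.3 | |
| A2 | C50 | 1200/17.6 | 40 | 2614 | 2.7 | 111.6 |
| A3 | C50 | 1200/17.6 | 80 | 2296 | 4.1 | 193.1 |
| B1 | C50 | 850/12.4 | 0 | 5110 | 6.7 | |
| B2 | C50 | 850/12.4 | 40 | 2980 | 4.3 | 132.0 |
| B3 | C50 | 850/12.4 | 80 | 2492 | 5.5 | 213.1 |
| C1 | C30 | 850/12.4 | 0 | 4589 | 7.8 | |
| C2 | C30 | 850/12.4 | 40 | 2504 | 4.0 | 110.2 |
| C3 | C30 | 850/12.4 | 80 | 2161 | 8.0 | 190.2 |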
Table 10. Test data of axial compression for CFST.
| Specimen Number | A1 | A2 | A3 | B1 | B2 | B3 | C1 | C2 | C3 |
|---|---|---|---|---|---|---|---|---|---|
| Predicted value (kN) | 4969.8 | 2561.72 | 2318.96 | 5212.2 | 2950.2 | 2442.16 | 4451.33 | 2529.04 | 2117.78 |
| Test value (kN) | 5020 | 2614 | 2296 | 5110 | 2980 | 2492 | 4589 | 2504 | 2161 |
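The abstract states that the nine validation tests are predicted with errors within 5%. A minimal sketch that checks this claim against the values in Table 10, taking the relative error as (predicted − test)/test; this definition is an assumption, as the article may define the error differently.

```python
# Predicted and measured capacities in kN, copied from Table 10.
predicted = {
    "A1": 4969.8, "A2": 2561.72, "A3": 2318.96,
    "B1": 5212.2, "B2": 2950.2, "B3": 2442.16,
    "C1": 4451.33, "C2": 2529.04, "C3": 2117.78,
}
measured = {
    "A1": 5020.0, "A2": 2614.0, "A3": 2296.0,
    "B1": 5110.0, "B2": 2980.0, "B3": 2492.0,
    "C1": 4589.0, "C2": 2504.0, "C3": 2161.0,
}

# Signed relative error in percent for each specimen.
errors = {
    name: 100.0 * (predicted[name] - measured[name]) / measured[name]
    for name in predicted
}

# Specimen with the largest absolute error.
worst = max(errors, key=lambda name: abs(errors[name]))

for name, err in errors.items():
    print(f"{name}: {err:+.2f}%")
print(f"largest absolute error: {worst} ({abs(errors[worst]):.2f}%)")
```

Under this definition every specimen falls inside the 5% band, with the largest deviation at specimen C1.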
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Su, W.; Cheng, Y.; Wei, L.; Zhong, G.; Zhou, L.; Liu, F.; Xie, K. Machine Learning Prediction of the Compressive Bearing Capacity of Concrete-Filled Steel Tubes Using Random Forest. Materials 2026, 19, 2511. https://doi.org/10.3390/ma19122511

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
