Article

Research on Technical Condition of Concrete Bridges Based on FastText+CNN

1 Guangxi Communications Investment Group Corporation Ltd., Nanning 530022, China
2 The Natural Resources Bureau of Tongle Town, Leye County, Baise City 533200, China
3 School of Highway, Chang’an University, Xi’an 710064, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(23), 12386; https://doi.org/10.3390/app152312386
Submission received: 24 May 2025 / Revised: 25 August 2025 / Accepted: 19 November 2025 / Published: 21 November 2025

Abstract

Addressing the challenges of scarce measured data for Class 3–4 bridges and strong subjectivity in manual assessments in bridge technical-condition evaluation, this study innovatively proposes a FastText+CNN evaluation model that integrates semantic features with spatial pattern recognition. By constructing a hierarchical data structure of bridge scale matrices using the analytic hierarchy process (AHP) and generating a balanced training set encompassing Class 1–5 bridges through computational code, the model overcomes the bottleneck of training under small-sample conditions. It employs N-Gram embeddings to achieve semantic representation of defect feature combinations, combines one-dimensional convolutional neural networks to capture cross-component spatial correlation patterns, and utilizes hierarchical Softmax to optimize multi-classification efficiency. Experiments show that the model achieves 92.4% accuracy on the test set, outperforming random forest and multi-layer CNN models by 15.9% and 3.7%, respectively, with recognition rates for Class 3–5 bridges rising to 85% and cross-entropy loss reduced to 0.36. Validated with data from 30 actual bridges, the model maintains 92.3% accuracy and demonstrates the ability to discover implicit patterns in cross-component defect chains, providing an intelligent solution for bridge technical condition evaluation that combines semantic understanding with spatial feature extraction.

1. Introduction

China’s highway bridges now exceed one million in total, with nearly 84% being low-grade bridges below Class 3 [1]. As service years increase, load standards rise, and material performance degrades [2,3,4], the accuracy and timeliness of bridge technical-condition evaluation have become critical for maintenance decision-making.
The current ‘Technical Condition Assessment Standard for Highway Bridges’ [5] employs the analytic hierarchy process (AHP) combined with manual inspection assessments, which presents three major issues: (1) differences in expert experience make the evaluation results subjective; (2) scarce measured data for Class 3 and 4 bridges hinder data-driven model development; and (3) traditional methods have limited capability to predict latent defects and degradation trends [6,7]. Therefore, intelligent evaluation methods are urgently needed to support scientific maintenance management.
Current solutions for mitigating human-factor impacts on bridge technical-condition evaluation fall mainly into two categories. The first refines defect indicators [8], improves the AHP [9,10], or introduces gray theory [11] to enhance evaluation sensitivity through standardized computational methods, yet it still fundamentally relies on expert experience for weighting, which makes subjective influences difficult to eliminate. The second replaces human detection with emerging detection methods. V. Gattulli et al. [12] evaluated bridge defects by combining robots and computer programs, but their approach replaced only the manual inspection of defects and did not address the final technical-condition assessment of the bridge. T. Garbowski et al. [13] proposed a non-destructive inverse analysis method using dynamic and static tests to identify stiffness and density changes in deteriorating concrete bridges. M. di Prisco et al. [14] examined the limits and reliability of on-site assessment methods and further combined visual inspection, monitoring arrays, and load tests with numerical analysis to evaluate the structural condition and service capacity of existing bridges. These studies mainly focused on single-bridge detection or parameter identification and relied on physical tests or visual monitoring, which offers strong reliability but limited scalability. This paper instead introduces NLP-inspired deep learning to mitigate data scarcity and subjectivity, achieving automated, scalable condition grading through semantic and spatial feature learning.
Data-driven approaches collect or construct large datasets and fit the relationships within them, which allows a scientifically objective evaluation of bridge technical condition. Methods such as Markov chains [15] and Bayesian networks [16,17,18,19] mine data patterns through machine-learning algorithms. However, because real-world samples of Class 3–4 bridges are scarce, existing machine-learning models generalize poorly. Furthermore, current research has yet to achieve breakthroughs in areas such as feature engineering and small-sample learning mechanisms.
For identifying latent bridge defects and predicting functional degradation trends, recent work includes Crognale et al. [20], who identify and predict damage to infrastructure such as bridges using four different computer vision programs, and Zhu et al. [21], who built a deterioration prediction model for highway bridges based on a semi-Markov stochastic process.
Existing research supports the scientific basis of fusing FastText with CNN. Haq et al. [22] achieved 99.44% accuracy in agricultural image classification with an 18-layer CNN architecture, showing that convolutional layers can effectively capture local spatial features; Bhatt et al. [23] pointed out that CNN’s multipath design significantly improves model robustness by addressing the vanishing-gradient problem. Raza et al. [24] confirmed that the sub-word embedding mechanism of FastText can resolve the morphological features of protein sequences (cf. the n-gram decomposition of a word such as ‘Obama’) and handle the challenge of unknown words. The FastText-CNN-LSTM hybrid model designed by Hashmi et al. [25] achieves 99% accuracy in fake news detection, and its interpretability analysis shows that the synergy between sub-word embedding and the convolution kernel can accurately locate key semantic features. This study builds on these methods: it treats the scale information of bridge technical condition as ‘pseudo text’, integrates the shallow FastText network with a convolutional neural network (CNN), and captures deep information in the scale vectors through n-gram embedding and convolution. Finally, a FastText+CNN model is proposed to provide new ideas and methods for evaluating the technical condition of concrete bridges (Figure 1 shows the development framework of this paper).
The remainder of this paper is organized as follows: Section 2 details the FastText+CNN architecture; Section 3 describes data generation and experimental design; Section 4 presents results and analysis; and Section 5 discusses implications and conclusions.

2. FastText+CNN Model Principles and Implementation

2.1. Basic Principles of FastText

FastText is an efficient text processing framework developed by Facebook AI Research, primarily used for text classification and word vector generation [26]. Its core idea is to significantly improve the model’s handling of low-frequency and out-of-vocabulary words by capturing local features and hierarchical classification optimization, while greatly reducing computational complexity in large-scale classification tasks [27]. The two main mechanisms within FastText are the n-gram language model and the hierarchical Softmax classifier.

2.1.1. N-Gram Language Model

N-gram is a method used by FastText to capture local features. As the pre-input processor of FastText, the combined features generated by n-gram constitute the basic raw material for FastText word embedding. In essence, n-gram is a text modeling technique that captures language rules by counting the co-occurrence frequency of adjacent N words or characters. For example, in the evaluation of bridge technical conditions, we may encounter specific professional terms such as ‘U11-3 crack width exceeds 0.3 mm and is accompanied by water seepage’. This phrase contains a specific description of the disease of the upper bearing members.
In the n-gram model, the phrase can be decomposed into multiple subsequences (i.e., n-gram). Suppose n = 3. Then, the possible 3-grams include [‘U11’, ‘11-’, ‘-3’, ‘grade 3’, ‘crack’, ‘crack’, ‘crack width’, ‘width’, ‘degree of excess’, ‘excess 0’, ‘0.3’, ‘0.3 m’, ‘3 mm’, ‘mm and’, ‘and’, ‘accompanied by’, ‘seepage’, ‘seepage’, ‘water present’, ‘phenomenon’] and so on. Each n-gram is regarded as an independent feature and processed according to its frequency of occurrence in the text. This method enables even specific phrases that have not appeared in the training data to be effectively represented by their components.
Combining the law of large numbers with the Markov assumption, the n-gram model treats the probability of a word as depending only on the n − 1 words preceding it (the probability calculation is shown in Equation (1)). When analyzing the defect description above, n-grams such as ‘U11’ and ‘11-3’ therefore help the model interpret the whole phrase even if the phrase itself never appears in the training set. In this way, n-grams not only strengthen the model’s ability to learn from limited datasets, especially where measured data are scarce (as for Class 3–4 bridges), but also improve its ability to identify hidden defect patterns.
$$p\left(w_n \mid w_1^{n-1}\right) = \frac{p\left(w_1^{n}\right)}{p\left(w_1^{n-1}\right)} \approx \frac{\mathrm{count}\left(w_1^{n}\right)}{\mathrm{count}\left(w_1^{n-1}\right)} \tag{1}$$
In Equation (1), $w_1^{n}$ represents a continuous word sequence $w_1, w_2, w_3, \ldots, w_n$ (e.g., a 3-gram: $w_1, w_2, w_3$); $\mathrm{count}(w_1^{n})$ denotes the total occurrence count of this word sequence in the corpus; and $\mathrm{count}(w_1^{n-1})$ denotes the occurrence count of the subsequence composed of the first n − 1 words.
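The count-based estimate in Equation (1) can be illustrated with a minimal Python sketch. The helper names (`char_ngrams`, `ngram_counts`, `conditional_prob`) and the toy sequence `scale_tokens` are hypothetical and are not taken from the paper's code; the sketch omits smoothing and the word-boundary markers that FastText adds to sub-word n-grams.

```python
from collections import Counter

def char_ngrams(text, n):
    """Split a string into overlapping character n-grams."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def ngram_counts(tokens, n):
    """Count every run of n adjacent tokens in the sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def conditional_prob(tokens, context, word):
    """Estimate p(word | context) as count(context + word) / count(context)."""
    n = len(context) + 1
    numerator = ngram_counts(tokens, n)[tuple(context) + (word,)]
    denominator = ngram_counts(tokens, n - 1)[tuple(context)]
    return numerator / denominator if denominator else 0.0

# Hypothetical pseudo-text: a sequence of component scale grades.
scale_tokens = ["3", "3", "4", "3", "3", "2", "3", "3", "4"]

print(char_ngrams("U11-3", 3))
print(conditional_prob(scale_tokens, ("3", "3"), "4"))
```

In the toy sequence the context ('3', '3') occurs three times and is followed by '4' twice, so the estimate is 2/3.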

2.1.2. Hierarchical Softmax

Hierarchical Softmax is FastText’s efficient classification technique for the output layer, addressing the performance bottleneck of traditional Softmax in large-scale classification tasks [28]. Its core idea is to organize the categories into a tree structure, which transforms the global probability calculation into a layer-wise product of path probabilities. The tree is typically a Huffman tree, which assigns encoding paths based on category frequency. High-frequency categories sit closer to the root node (shorter paths), while low-frequency categories reside in deeper layers (longer paths), reducing the per-prediction computation from the order of $n$ to the order of $\log n$.
The following example illustrates how hierarchical Softmax works:
The technical-condition evaluation scale for a bridge’s secondary components has four grades, 1–4 [5] (the bridge technical condition corresponding to each grade is described in Table 1). Suppose that grades 1–4 occur 10, 5, 4, and 3 times, respectively.
Figure 2 shows the full Huffman tree construction process. The final encoding sequences are: Scale 1 → 0, Scale 2 → 10, Scale 3 → 111, Scale 4 → 110. Red numbers in the figure represent the encoding results, while black numbers indicate the original weight of each scale.
In summary, the FastText model introduces a sub-word embedding mechanism that decomposes vocabulary into finer-grained character sequence combinations (i.e., n-grams), thereby addressing the limitations of traditional word vector models in representing out-of-vocabulary words and morphologically complex languages. For classification tasks with a very large number of categories, FastText also employs hierarchical Softmax as a replacement for the standard Softmax output layer.
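The Huffman example above (frequencies 10, 5, 4, and 3 for scales 1–4) can be reproduced with a short Python sketch. The convention that the lighter subtree receives bit '0' is an assumption chosen to match the codes quoted for Figure 2, and the internal-node scores in `node_scores` are hypothetical values used only to show that the path probabilities of a hierarchical Softmax sum to one.

```python
import heapq
import math
from itertools import count

def huffman_codes(freqs):
    """Build Huffman codes; the lighter subtree of each merge gets bit '0'."""
    tie = count()  # tie-breaker so heapq never has to compare dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f_lo, _, lo = heapq.heappop(heap)  # lightest subtree
        f_hi, _, hi = heapq.heappop(heap)  # second lightest subtree
        merged = {s: "0" + c for s, c in lo.items()}
        merged.update({s: "1" + c for s, c in hi.items()})
        heapq.heappush(heap, (f_lo + f_hi, next(tie), merged))
    return heap[0][2]

def leaf_probability(code, node_scores):
    """Multiply the binary (sigmoid) branch probabilities along a code path."""
    prob, prefix = 1.0, ""
    for bit in code:
        p_one = 1.0 / (1.0 + math.exp(-node_scores[prefix]))
        prob *= p_one if bit == "1" else 1.0 - p_one
        prefix += bit
    return prob

scale_freqs = {"Scale 1": 10, "Scale 2": 5, "Scale 3": 4, "Scale 4": 3}
codes = huffman_codes(scale_freqs)

# Hypothetical scores for the three internal nodes, keyed by path prefix.
node_scores = {"": 0.3, "1": -1.2, "11": 0.7}
probs = {s: leaf_probability(c, node_scores) for s, c in codes.items()}

print(codes)
print(sum(probs.values()))
```

With these frequencies, the most frequent grade needs one binary decision and the two rarest grades need three, so the frequency-weighted average path length is 41/22, or about 1.86 decisions per sample.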

2.2. CNN Role in Feature Extraction

In the bridge technical-condition evaluation of this paper, the CNN operates on the scale matrix and captures local spatio-temporal patterns by scanning it with convolution kernels: it identifies cross-component defect associations in the spatial dimension (such as the co-occurrence of main-beam cracks and bridge-deck damage), detects nonlinear deterioration trends in the time dimension (such as the risk signal of an annual crack-width growth rate exceeding 10%), and fuses these with the global semantic features from FastText to form a multi-level representation. For example, after n-gram decomposition, the semantic information of the term ‘U11-3 crack width exceeds 0.3 mm and is accompanied by water seepage’ is embedded into the model, and the CNN then extracts deep patterns of local defect combinations on this basis. The CNN can, for instance, find the cross-component propagation chain ‘main beam crack → pier inclination → bearing displacement’ and dynamically adjust the classification decision in combination with the standard weights, which reduces the misjudgment rate for Class 3–4 bridges and enables a comprehensive judgment that moves from local anomaly to global risk.

2.3. FastText+CNN Model Architecture

The FastText+CNN model architecture shown in Figure 3 follows the workflow of ‘scaling matrix textualization → n-gram embedding → lightweight convolution → fully connected classification’, achieving intelligent assessment of bridge technical conditions. Its core innovation lies in combining natural language processing techniques (FastText) with lightweight deep learning (CNN), preserving semantic relevance while enhancing local feature extraction through convolutional operations.
Explanation of the overall architecture diagram of the model:
(1) Input layer (dimension 1000, 1):
Receive the scale matrix data of the technical condition of the bridge. The inspection data for each bridge is encoded as a 1000-dimensional vector representing the scale score and disease combination characteristics of each component.
(2) Embedding layer: contains four independent embedding branches (each branch dimension is 10,002 and 4000). The technical details of this layer are as follows:
N-gram feature extraction: each embedding branch performs an n-gram segmentation of the input scale matrix at a different granularity to capture the co-deterioration patterns of adjacent components.
Huffman coding optimization: a Huffman tree assigns short codes to high-frequency scales, reducing the computation spent on sparse data and improving embedding efficiency.
Output splicing: the outputs of the four embedding branches are concatenated into a (4000 × 4, 1) matrix with a total of 16,000 dimensions, retaining multi-scale semantic features.
(3) One-dimensional convolutional layer: The high-dimensional features output of the embedding layer is mapped to the low-dimensional semantic space, i.e., the input (16,000, 1) → the output (8000, 1). Multiple convolutional kernels slide across the scale sequence to extract the deep patterns of local disease combinations.
(4) Fully connected layer:
Layer 1 (fc1): The high-dimensional (8000) features output by the convolutional layer are mapped to the low-dimensional (512) semantic space, and the global degradation patterns (such as uniform degradation of the whole bridge or local concentrated degradation) are extracted.
Layer 2 (fc2): The feature (512) is mapped to a five-dimensional vector (corresponding to the technical condition of the bridge of classes 1–5), and the probability distribution is output through SoftMax.
(5) Output layer (dimensions 5, 1):
Output the final classification results (categories 1–5) of the technical condition of the bridge and select the category with the highest probability as the prediction result.
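The layer dimensions listed above can be checked with a small shape-bookkeeping sketch in Python. The text gives only the tensor sizes (16,000 → 8000 → 512 → 5), so the convolution hyperparameters used here (kernel size 3, stride 2, padding 1, a single output channel) are assumptions that happen to reproduce the stated 8000-dimensional output; they are not confirmed by the paper, and the helper names are hypothetical.

```python
def conv1d_out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1D convolution (standard floor formula)."""
    return (length + 2 * padding - kernel) // stride + 1

def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

branches, branch_dim = 4, 4000
concat_len = branches * branch_dim          # spliced embedding output

# ASSUMED hyperparameters: kernel 3, stride 2, padding 1, one output channel.
conv_len = conv1d_out_len(concat_len, kernel=3, stride=2, padding=1)

fc1_params = linear_params(conv_len, 512)   # fc1: 8000 -> 512
fc2_params = linear_params(512, 5)          # fc2: 512 -> 5 classes

print(concat_len, conv_len, fc1_params, fc2_params)
```

Under these assumptions fc1 alone holds about 4.1 million parameters, so the fully connected layers contribute substantially to the model size.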
Note: The output layer of the model maps the five-dimensional features of the fully connected layer (fc2) to the categorical probability distribution of the technical condition of the bridge by the SoftMax function. In order to quantify the difference between the prediction result and the real label, this paper uses the cross-entropy loss function as the optimization objective, and its mathematical form is as follows:
$$\mathrm{loss} = -\frac{1}{N}\sum_{n=1}^{N} t_n \log\left(f\left(B A x_n\right)\right)$$
(6) Cross-entropy loss function: $N$ is the total number of bridge samples participating in training; $f$ is the SoftMax activation function, which converts the model output into a probability distribution; $x_n$ is the normalized feature vector of the input indicators of the $n$th bridge sample; $t_n$ is the one-hot encoding of the true technical-condition class of the $n$th bridge sample (e.g., a Class 5 bridge corresponds to [0,0,0,0,1]); and $A$ and $B$ are weight matrices learned iteratively during training.
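As a minimal sketch of the cross-entropy loss described in item (6), the following Python snippet evaluates the averaged loss for a batch of five-class outputs. The input lists stand in for the pre-SoftMax model outputs; the function names and the sample values are hypothetical, and the log-probabilities are computed in a numerically stable way rather than by calling SoftMax and then taking the logarithm.

```python
import math

def log_softmax(logits):
    """Numerically stable log of the SoftMax probabilities."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]

def cross_entropy(logits_batch, onehot_batch):
    """Average of -sum(t * log p) over the batch."""
    total = 0.0
    for logits, target in zip(logits_batch, onehot_batch):
        log_p = log_softmax(logits)
        total -= sum(t * lp for t, lp in zip(target, log_p))
    return total / len(logits_batch)

# Hypothetical outputs for two bridges; the true classes are 5 and 1.
logits_batch = [[0.0, 0.0, 0.0, 0.0, 0.0], [4.0, 1.0, 0.0, 0.0, 0.0]]
onehot_batch = [[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]]

print(cross_entropy(logits_batch, onehot_batch))
```

A uniform prediction over five classes gives a per-sample loss of log 5 (about 1.61), a useful reference point when reading reported loss values.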

3. Bridge Technical-Condition Data Construction and Experimental Design

3.1. Dataset Preparation

In fact, samples of Class 3 and 4 bridges are extremely scarce, while Class 5 bridges are even rarer. Direct use of field measurement data would lead to severe data imbalance (e.g., Class 1–2 bridges constituting the vast majority), making it difficult for models to learn features of Class 3–5 bridges and resulting in poor prediction performance. To address this data imbalance, this study generates datasets by coding bridge technical-condition calculations based on the analytic hierarchy process from the ‘Technical Condition Assessment Standard for Highway Bridges’ [5] and the computational workflow in the literature [29]. Bridges are divided into 16 computational components and four hierarchical levels for calculation, with the detailed structure shown in Figure 4. First, the condition of bridge components is calculated based on their individual status and quantity, followed by computing the structural-level condition from component-level results, ultimately deriving the final technical condition of the bridge structure.
The dataset was generated through balanced sampling, producing simulated data for 1000 bridges (200 per class 1–5) to ensure equitable distribution. An 8:2 ratio split created training (800 bridges) and test sets (200 bridges) for model training and parameter tuning. Validation was performed using inspection report data from 30 beam bridges in Shanxi Province.
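The balanced sampling and 8:2 split can be sketched in Python as follows. This is a strongly simplified illustration, not the generation code used in the study: it collapses the 16 components and four levels into three top-level part scores, and the weights (0.2, 0.4, 0.4) and class thresholds (95, 80, 60, 40) are assumptions modeled on the general form of the assessment standard. All function names, the Gaussian spread of 5 points, and the seed are hypothetical.

```python
import random

# ASSUMED top-level weights and class score ranges (not quoted from the paper).
WEIGHTS = {"deck": 0.2, "superstructure": 0.4, "substructure": 0.4}
CLASS_RANGES = {1: (95.0, 100.0), 2: (80.0, 95.0), 3: (60.0, 80.0),
                4: (40.0, 60.0), 5: (0.0, 40.0)}

def overall_score(parts):
    """Weighted sum of the part scores."""
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)

def condition_class(score):
    """Map an overall score to a class 1-5 using the lower bounds."""
    for cls in (1, 2, 3, 4):
        if score >= CLASS_RANGES[cls][0]:
            return cls
    return 5

def sample_bridge(target_cls, rng):
    """Draw part scores until the computed class equals the target class."""
    lo, hi = CLASS_RANGES[target_cls]
    while True:
        centre = rng.uniform(lo, hi)
        parts = {name: min(100.0, max(0.0, rng.gauss(centre, 5.0)))
                 for name in WEIGHTS}
        if condition_class(overall_score(parts)) == target_cls:
            return parts, target_cls

def build_balanced_dataset(per_class=200, train_ratio=0.8, seed=0):
    """Generate per_class bridges for each class and split each class 8:2."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in CLASS_RANGES:
        samples = [sample_bridge(cls, rng) for _ in range(per_class)]
        rng.shuffle(samples)
        cut = round(per_class * train_ratio)
        train += samples[:cut]
        test += samples[cut:]
    return train, test

train_set, test_set = build_balanced_dataset()
print(len(train_set), len(test_set))
```

Splitting within each class keeps both subsets balanced, which a purely random 8:2 split over all 1000 samples would not guarantee.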

3.2. Comparative Models and Experimental Protocol

To evaluate the efficiency of the FastText+CNN model proposed in this paper, two common classification algorithms, the Random Forest model and the CNN model, are established for comparison. The Random Forest code is implemented using the open-source framework sklearn, while the convolutional neural network and the proposed FastText+CNN model are developed using Facebook’s open-source neural network framework PyTorch 2.1.0.
Random Forest is an ensemble learning method used for solving classification and regression problems. It improves model performance and generalization capability by constructing multiple decision trees and combining them [30]. Adopting the concept of ‘collective intelligence’, Random Forest aggregates multiple weak learners to produce a strong learner [31]. In this paper, the Random Forest classifier in the sklearn library is used to classify the sample data and labels. The model parameters are first narrowed down by random search in the sklearn library and then refined by grid search; the main optimized parameters are shown in Table 2.
CNN models are widely used in a variety of classification tasks [32,33]. In this paper, the CNN takes the bridge scale data as input and predicts the technical-condition label. Considering that a bridge defect-propagation chain usually spans three adjacent components (such as girder crack → bridge deck damage → bearing displacement), a convolution kernel of size 3 is selected to capture this pattern. The number of convolutional layers is varied from 1 to 6, with the number of fully connected layers set equal to the number of convolutional layers. The standalone CNN model architecture is shown in Figure 5.
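To make the role of the size-3 kernel concrete, the following Python sketch slides a three-element kernel over a short sequence of component scale values. The sequence and the all-ones kernel are hypothetical; a trained CNN would learn its kernel weights, whereas this fixed kernel simply sums each window of three adjacent components.

```python
def conv1d(sequence, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation of a sequence with a kernel."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(sequence) - k + 1, stride)]

# Hypothetical scale grades of eight adjacent components (higher = worse).
scales = [1, 1, 3, 3, 4, 1, 1, 2]
response = conv1d(scales, [1, 1, 1])

# The window with the largest response marks the worst adjacent triple.
worst_start = max(range(len(response)), key=response.__getitem__)

print(response, worst_start)
```

Here the strongest response comes from the window starting at the third component (index 2), where the grades 3, 3, and 4 sit next to each other; this is the kind of local three-component pattern a size-3 kernel can respond to.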
The accuracy ranges on the training set and the validation set are shown in Figure 6.
Figure 6a shows that the training-set accuracy of the models with 1–6 convolutional layers follows a similar trend, but with six convolutional layers the training accuracy drops as low as 60% or even 20% and fluctuates sharply as the number of training epochs increases. On the validation set, the accuracy of the six-layer convolutional network stays at 62.5% with almost no fluctuation. This suggests that the six-layer network is not trained adequately and underfits. A larger number of convolutional layers therefore does not yield higher model accuracy.
The optimization process for standalone CNN reveals that neither too few nor too many convolutional layers are ideal—insufficient layers under-extract sample features, while excessive layers cause underfitting. Thus, the proposed FastText+CNN model tentatively adopts 1–3 convolutional layers for optimization. Initial tuning focuses on n-gram quantity (excluding convolutional layers), followed by adding convolutional layers to the best-performing model for final optimization. Figure 6b confirms that one convolutional layer yields optimal results; hence, the FastText+CNN model ultimately uses one layer. Key optimization steps are illustrated in Figure 7.

4. Results and Discussion

4.1. Analysis of Three Model Training Results

Based on the three model experimental schemes in Section 3.2, the training results of the three models on the simulated dataset and test set were obtained. Performance metrics for each model were calculated based on these results, as detailed in Table 3. Analysis of Table 3 reveals the following:
(1) Model-accuracy ranking (FastText+CNN > CNN > Random Forest): The FastText+CNN model achieved 92.4% test set accuracy with a cross-entropy loss of 0.36, significantly outperforming other models. This validates the joint modeling advantages of the embedding layer (n-gram feature combination) + convolutional layer.
(2) Overfitting control: The Random Forest model showed the highest overfitting degree (+0.44), indicating its tendency to memorize ‘noise’ in training data. FastText+CNN, through generalized feature representation in the embedding layer, exhibited the lowest overfitting (+0.18), demonstrating optimal alignment between simulated data generation rules and model architecture.
(3) Feature interaction capability: FastText+CNN’s embedding layer explicitly learns feature combinations (e.g., the joint effect of crack width and humidity), improving the average recognition rate for Class 3–5 bridges by 27% compared with CNN. Random Forest relies solely on feature importance ranking (e.g., X11 being most important) and cannot model nonlinear interaction effects.

4.2. Validation Results with Real-Bridge Data

Building upon the optimal model above, validation was performed using 30 sets of real-bridge data from Shanxi Province. Key calculation results are shown in Table 4 below.
The real-bridge validation in Table 4 reveals differences in resource consumption among the three models when trained on an AMD 5950X: Random Forest required 14 min of training time with 32 GB peak memory usage because multiple deep decision trees are generated in parallel; the 3-layer CNN took 12 min with 18 GB GPU memory usage, achieving the highest efficiency through the parallel computing of a Tesla P40; FastText+CNN required 13 min of training with 45 GB peak memory usage, the highest of the three, because of its high-dimensional embedding matrices and fully connected layer parameters.
Through the training results of the three models in Section 4.1 and the verification of the real bridge, it can be found that the model proposed in this paper has high accuracy, high computational efficiency, and significant advantages compared with the common machine learning models. Based on the analysis of Section 4.1, we will further explore:
(1) Accuracy and Generalization Capability: Random Forest performs moderately on the training set but declines significantly on the test set and on real-world data, particularly in recognizing Class 5 bridges, indicating its limited capability to model high-dimensional discrete features and nonlinear relationships. CNN improves test set accuracy by 12.2% because its convolutional layers extract local spatial features, yet it still loses 8.1% in precision on real-world data, reflecting the impact of distribution differences between simulated and measured data on generalization. FastText+CNN effectively captures the semantics of feature combinations (e.g., the joint influence of ‘crack width + humidity’) via n-gram embedding layers, with only a 0.1% gap between test and real-world accuracy, validating the effectiveness of the simulated data generation method.
(2) Overfitting and Robustness: For Random Forest, high-dimensional sparse features cause individual trees to overfit noise, and despite limiting tree depth, poor generalization remains challenging to avoid. For CNN, regularization mitigates overfitting, but performance degradation persists due to local feature distribution disparities between simulated and real-world data. For FastText+CNN, embedding layer pretraining enhances model robustness to feature combinations, yielding the smallest error on real-world data [34].
(3) Feature Processing Capability: Random Forest relies on manual feature engineering and cannot automatically extract high-order feature combinations, with high memory consumption when handling high-dimensional sparse features. CNN directly processes numerical features but requires additional encoding for discrete binned features, potentially losing semantic information. FastText+CNN maps discrete features to continuous vectors via embedding layers, preserving semantic relevance [27], while 1D convolutional layers extract cross-feature-dimensional local patterns, outperforming the other two models in feature processing.
Overall, CNN strikes a balance between computational efficiency and resource usage. FastText+CNN, though memory-intensive, achieves optimal accuracy, whereas Random Forest’s practicality is limited by inefficient feature processing.

4.3. Empirical Analysis of the Stability of the Evaluation Method

In order to quantify the degree of interference of subjective factors in the evaluation process, we introduce the standard deviation index (σ) to objectively quantify the subjective fluctuations caused by human factors. The specific practice is as follows: 12 bridges out of 30 bridges in Shanxi Province are selected, and the AHP method, the AHP-TFN method [35] (triangular fuzzy number dynamic weight) and the FastText+CNN model (at this time, the model output node is changed to 1, and the linear activation function is used to output continuous scores) are used for five rounds of independent evaluation. In each round of evaluation, the same disease dataset is executed by different experts/models, and the standard deviation of each bridge sample in three ways is calculated according to the results of five evaluation scores.
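The stability index described above amounts to computing, for each bridge and each method, the standard deviation of its five scores. The Python sketch below illustrates this with purely hypothetical scores that are not taken from the study; it uses the population standard deviation (`statistics.pstdev`), which is an assumption, since the text does not state whether the population or the sample form was used.

```python
import statistics

def score_spread(scores_by_method):
    """Population standard deviation of the repeated scores of each method."""
    return {method: statistics.pstdev(scores)
            for method, scores in scores_by_method.items()}

# HYPOTHETICAL five-round scores for one bridge (illustration only).
rounds = {
    "expert_based": [86.0, 88.0, 90.0, 88.0, 88.0],
    "model_based": [81.2, 81.3, 81.2, 81.3, 81.25],
}
spread = score_spread(rounds)

print(spread)
```

Switching to `statistics.stdev` (the sample form) would scale every value by the same factor for five rounds, so the ranking of the methods would not change.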
Firstly, taking Yanshigou Bridge (bridge No.8), one of the 12 bridges, as an example, the basic disease situation is explained, as shown in Table 5 below:
Based on the actual bridge defect conditions in Table 5 above, the average AHP score is 87.6, the average AHP-TFN score is 87.3, and the average FastText+CNN score is 81.25. Under all three scoring modes, Yanshigou Bridge is finally rated as a Class 2 bridge.
Similarly, based on the actual defect conditions of the 12 bridges, the AHP, AHP-TFN, and FastText+CNN methods were each used for five independent scorings, and the standard deviation was calculated from the five scores. The results are shown in Figure 8.
As shown in Figure 8, the standard deviations obtained from the five scores of each bridge rank as AHP method > AHP-TFN method > FastText+CNN model. Each standard deviation reflects five independent evaluations of the technical condition of the same bridge using that bridge’s measured data. The standard deviation of the FastText+CNN model is about 0.03. It is the smallest of the three because no human intervention occurs anywhere in the model’s evaluation process; the residual 0.03 comes from differences in the measured data used in the five independent evaluations. The standard deviations of the AHP and AHP-TFN methods are very large, which shows that, across the five independent evaluations, each expert converted and weighted the measured data differently. These choices depend heavily on personal experience and introduce greater subjectivity, producing large differences among the five results. Across all 12 bridges, the standard deviation of the FastText+CNN model stays at about 0.03 with very little fluctuation, indicating that the stability of the FastText+CNN evaluation is unaffected by the particular bridge or by human factors. By contrast, the fluctuation of the standard deviation of the AHP and AHP-TFN methods across the 12 bridges cannot be ignored. If the experts had applied a consistent evaluation scale to all 12 bridges, the standard deviations would be similar; the observed fluctuation shows that the five experts could not maintain the same evaluation scale across the measured data of different bridges, which is another major manifestation of human subjectivity.
In general, human subjectivity can enter the bridge technical-condition assessment process at two points: the manual observation used to collect the measured data, and the evaluation calculation itself. If the data are collected manually, none of the three methods can eliminate the subjectivity of that step. In the evaluation calculation, however, the FastText+CNN model avoids manual intervention entirely, whereas the other two methods cannot. If the data are measured by machine, the FastText+CNN model avoids human subjectivity completely, while the other two methods still cannot.

4.4. Discussion

In summary, Random Forest is an ensemble method based on decision trees, CNN is a convolutional neural network adept at processing spatial data, and FastText+CNN combines word embeddings with convolutional layers, making it suitable for text or sequential data. Each model has its strengths and weaknesses, as analyzed in Table 6.
Based on the above analysis of the three models and the real-bridge validation results in Section 4.2, the FastText+CNN model demonstrates superior predictive performance for complex bridge technical condition assessments compared with the other two common classification models. Its accuracy is primarily ensured by the synergistic effects of the following core components:
(1) Input feature engineering: The 36-dimensional bridge indicators (X11~X36) cover critical defect features such as cracks and settlement. Their normalization processing eliminates unit differences, preventing model bias toward high-value features (e.g., bearing displacement magnitudes being significantly larger than crack widths).
(2) Semantic modeling of n-gram embedding layer: By mapping discretized binning features to 4000-dimensional dense vectors, it explicitly learns semantic relationships of feature combinations [36], enabling the model to capture implicit patterns of ‘cross-indicator pathogenic chains’.
(3) Local pattern extraction by 1D convolutional layer: The convolutional kernel slides along feature dimensions to extract local spatial patterns of adjacent indicators. Its kernel size needs to balance fine-grained feature capture with long-range dependency learning.
(4) Gradient orientation of cross-entropy loss function: Through backpropagation, the loss function drives joint optimization of the embedding matrix (A) and convolutional weights (B). Its gradient formula ensures effective gradients even in Softmax output saturation zones (e.g., when the dangerous-bridge probability $p_5 \to 1$), avoiding the gradient vanishing issues inherent in traditional mean squared error.
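The four components above can be illustrated in miniature with plain Python. This is a sketch under stated assumptions rather than the model used in this study: the indicator names, value ranges, bin count, and kernel are hypothetical, the convolution is applied to scalar features instead of embedding vectors, and the mean squared error is taken on the Softmax outputs. The last part compares the two loss gradients with respect to the logits when the output is saturated on Class 5 while the true label is Class 3.

```python
import math


def min_max_normalize(values, lows, highs):
    """(1) Scale each indicator to [0, 1] using per-indicator bounds."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(values, lows, highs)]


def to_tokens(normalized, n_bins=5):
    """(2) Discretize each indicator into a bin and emit one token per indicator."""
    tokens = []
    for i, v in enumerate(normalized):
        bin_id = min(int(v * n_bins), n_bins - 1)
        tokens.append(f"x{i}_b{bin_id}")
    return tokens


def with_bigrams(tokens):
    """(2) Append bigrams of adjacent tokens, as n-gram features do."""
    return tokens + [f"{a}|{b}" for a, b in zip(tokens, tokens[1:])]


def conv1d(sequence, kernel):
    """(3) Slide a kernel along the feature axis (valid cross-correlation)."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits, target):
    """(4) Gradient of cross-entropy w.r.t. logits: p - onehot(target)."""
    p = softmax(logits)
    return [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]


def mse_grad(logits, target):
    """Gradient of 0.5 * sum((p - y)^2) w.r.t. logits via the Softmax Jacobian."""
    p = softmax(logits)
    d = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
    dot = sum(pk * dk for pk, dk in zip(p, d))
    return [pk * (dk - dot) for pk, dk in zip(p, d)]


# Hypothetical readings: crack width, bearing displacement, settlement (mm).
raw = [0.18, 25.0, 3.0]
normalized = min_max_normalize(raw, lows=[0.0, 0.0, 0.0], highs=[0.4, 30.0, 20.0])
tokens = with_bigrams(to_tokens(normalized))
local_pattern = conv1d(normalized, kernel=[1.0, -1.0])

# Output saturated on Class 5 (index 4) while the true label is Class 3 (index 2).
logits = [0.0, 0.0, 0.0, 0.0, 12.0]
ce = cross_entropy_grad(logits, target=2)
mse = mse_grad(logits, target=2)
print(tokens)
print(max(abs(g) for g in ce), max(abs(g) for g in mse))
```

In this saturated case the cross-entropy gradient keeps a magnitude close to 1 on the affected classes, whereas the mean-squared-error gradient is scaled down by the Softmax Jacobian and becomes very small.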

5. Conclusions

To address key problems in bridge technical-condition evaluation, such as the scarcity of Class 3–4 samples and the strong subjectivity of manual evaluation, this study proposes a solution that integrates natural language processing and deep learning (a summary of the scheme is shown in Figure 8). The FastText model is optimized by adding a convolutional layer after the embedding layer, which helps the model extract deeper information and yields an architecture suited to predicting bridge technical condition (FastText+CNN). The following conclusions are drawn:
(1) Bridge condition calculation code built on the analytic hierarchy process generates balanced training datasets containing Class 1–5 bridges. This effectively resolves the insufficient generalization capability of machine-learning models caused by scarce Class 3–4 bridge data in practical engineering, providing reliable data support for infrastructure assessment under small-sample conditions.
(2) The proposed FastText+CNN hybrid model transforms bridge-scaling matrices into text-like sequences through n-gram feature reconstruction, achieving deep feature extraction via single-layer convolutional networks [37]. This design significantly enhances latent defect pattern recognition while preserving FastText’s semantic advantages, achieving 92.4% test accuracy and outperforming traditional Random Forest and CNN models.
(3) Compared with traditional manual evaluation methods, the FastText+CNN model eliminates subjective biases caused by human experience variations and identifies latent defect combination patterns undetectable by manual inspection.
(4) Engineering practice effectiveness: In validation tests using data from 30 actual bridges in Shanxi Province, the model achieved an accuracy rate of 92.3%, representing improvements of 7.2% and 3.1% over Random Forest and CNN models, respectively. This research establishes a novel ‘feature semanticization + lightweight deep learning’ evaluation framework, providing a technically innovative and practically viable solution for whole-life-cycle bridge maintenance.
The application of CNN tools in bridge assessment and monitoring is already well established. In contrast, the use of FastText in this domain is more limited, since it was originally designed for general text classification rather than engineering or technical data analysis. Nevertheless, FastText offers unique strengths in the semantic modeling of bridge defect descriptions, making it a valuable complement to CNNs in addressing the linguistic and descriptive aspects of bridge condition evaluation. We acknowledge this limitation, and in future research we plan to investigate FastText in greater depth, with the aim of refining its applicability and enhancing its integration into bridge-condition evaluation.
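Conclusion (1) refers to generating a class-balanced training set by computation. A minimal sketch of one way to do this is given below; it is not the generation code used in this study. The scoring rule (scale 1 maps to 100 points, scale 5 to 0), the severity-based sampler, and the equal weights are hypothetical placeholders, and the class boundaries are assumed to follow the 95/80/60/40 split of JTG/T H21-2011 [5].

```python
import random

N_INDICATORS = 36  # the 36 indicators mentioned in Section 4.4
MAX_SCALE = 5      # assumed top of the defect scale


def score_to_class(score):
    """Map a 0-100 score to Class 1-5 (assumed thresholds, see lead-in)."""
    for cls, lower in ((1, 95), (2, 80), (3, 60), (4, 40)):
        if score >= lower:
            return cls
    return 5


def overall_score(scales, weights):
    """Hypothetical rule: scale 1 -> 100 points, MAX_SCALE -> 0 points."""
    points = [100.0 * (MAX_SCALE - s) / (MAX_SCALE - 1) for s in scales]
    return sum(w * p for w, p in zip(weights, points))


def random_scales(rng):
    """Draw indicator scales clustered around a random severity level."""
    severity = rng.uniform(0.0, MAX_SCALE + 1.0)
    return [min(MAX_SCALE, max(1, round(rng.gauss(severity, 0.4))))
            for _ in range(N_INDICATORS)]


def balanced_dataset(weights, per_class, seed=0, max_attempts=100_000):
    """Rejection-sample until every class holds exactly per_class samples."""
    rng = random.Random(seed)
    buckets = {c: [] for c in range(1, 6)}
    for _ in range(max_attempts):
        scales = random_scales(rng)
        cls = score_to_class(overall_score(scales, weights))
        if len(buckets[cls]) < per_class:
            buckets[cls].append(scales)
        if all(len(b) == per_class for b in buckets.values()):
            return [(s, c) for c, b in buckets.items() for s in b]
    raise RuntimeError("could not fill every class; adjust the sampler")


# Placeholder weights: equal here. In practice they would come from the AHP
# judgment matrices, which this sketch does not reproduce.
weights = [1.0 / N_INDICATORS] * N_INDICATORS
dataset = balanced_dataset(weights, per_class=20)
print(len(dataset), "samples,", len(dataset) // 5, "per class")
```

Sampling around a shared severity level is a design choice of this sketch: drawing each indicator independently and uniformly would make Class 1 and Class 5 samples so rare that rejection sampling becomes impractical.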

Author Contributions

Conceptualization, X.W.; methodology, J.W. and Q.F.; software, J.W. and Z.D.; validation, J.W., Z.D. and S.L.; formal analysis, J.W., Z.D. and Q.F.; investigation, J.W. and X.W.; resources, S.L.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W., X.W. and Q.F.; visualization, J.W. and X.W.; supervision, X.W.; project administration, X.W. and S.L.; funding acquisition, X.W. and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Chinese Guangxi Key Research and Development Program, grant number 2024AB15010.

Data Availability Statement

All data, models, and code generated or used during the study appear in the submitted article.

Acknowledgments

During the preparation of this manuscript, the authors used FastText and PyTorch 2.1.0, developed by Facebook, to establish the learning framework, construct the CNN (convolutional neural network) and FastText+CNN hybrid models, and carry out model architecture design, training, and validation, and used Origin 2023 for plotting and data visualization. The authors have reviewed and edited the output and take full responsibility for the content of this publication. The authors also acknowledge the support from relevant research projects and express gratitude to the teams involved in field load testing and data acquisition.

Conflicts of Interest

Author Shiwen Li was employed by Guangxi Communications Investment Group Corporation Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Appendix A.1. Overview of the Basic Situation of Yanshigou Bridge

The bridge is located at K5+937 of the Lvliang Ring Expressway. The whole bridge adopts the standard of a two-way, two-lane expressway. The design load is Highway-I. The total length of the bridge is 408.0 m, the net width of the bridge deck is 24.5 m, and the net height under the bridge is 25.201 m. It is a conventional bridge with reinforced concrete railings. The bridge structure is 10.40 m prefabricated prestressed concrete T-beam (continuous), with five beams arranged transversely in each span, and the bearings are rubber bearings. The appearance of the bridge is shown in Figure A1.
Figure A1. Appearance of Yanshigou bridge: (a) nameplate of Yanshigou Bridge; (b) Yanshigou bridge facade; (c) overview of area under Yanshigou bridge; (d) Yanshigou bridge deck.

Appendix A.2. Bridge Diseases

Appendix A.2.1. Bridge Deck, Guardrail, and Sidewalk Diseases

There is no obvious bad disease on the left and right deck pavement of the bridge. There is no sidewalk on the left and right sides of the bridge, and there is no obvious disease in the guardrail.
Figure A2. Disease observation of deck pavement, guardrail, and sidewalk: (a) deck pavement has no disease; (b) guardrail and sidewalk have no disease.

Appendix A.2.2. Telescopic Device

There are problems such as anchorage zone defects, transverse cracks, and miscellaneous soil filling in the left and right widths of the bridge.
Figure A3. Disease observation of telescopic device: (a) defects in anchorage zone of expansion joint device, slight damage of concrete, cracks, and area of ≤ 10%, scale 2; (b) expansion joint miscellaneous soil filling and cracks caused unevenness, the difference is less than 1 cm, scale 2.

Appendix A.2.3. Bridge Road Connection Part

There is no obvious disease at the connection part of the left and right spans of the bridge.

Appendix A.2.4. Diseases of Superstructure

Both the left and right main girders of the bridge have problems such as longitudinal cracks, peeling-off angles, and empty holes. See Figure A4a–c.
At the bridge deck bottom, the wet joints of the left and right decks of the bridge have problems such as honeycomb surface, peeling-off angle, water seepage, and transverse cracks. See Figure A4d–g.
The left and right diaphragms of the bridge have problems such as peeling-off angles, hollow holes, and mesh cracks. See Figure A4h–j.
The left bearing of the bridge has the phenomenon of separation, and the right bearing has problems of aging, deterioration, and cracking. See Figure A4k.
Figure A4. Disease observation of main beam: (a) longitudinal cracks at the bottom of the main girder (a small number of minor cracks in the main girder of <5 mm, the crack width is not exceeded, the single area is ≤1.0 m2, the scale is 2); (b) the spalling angle of the flange plate of the main girder (local concrete spalling, and the single area of ≤0.5 m2 scale is 2); (c) main-beam cavity (local concrete holes, and single area ≤0.5 m2 scale is 2); (d) honeycomb pockmarked surface at wet joint (there is a large area of honeycomb pockmarked surface, and the cumulative area is less than or equal to 50% of the component area, the scale is 2); (e) spalling-off angle at wet joints (local concrete spalling occurs and the single area is ≤0.5 square meters, the scale is 2); (f) wet joint seepage (joint-filler leakage and leakage area of >20% of the whole seam scale is 4); (g) wet seam mesh cracks (single area of mesh cracks of ≤20% of component area is 2); (h) diaphragm peeling-off angle (single area ≤0.5 m2 scale is 2); (i) diaphragm void hole (local concrete void and single area of ≤0.5 m2 scale is 2); (j) diaphragm mesh cracks (mesh crack single area of ≤1.0 m2, scale is 2); (k) left bearing void (void phenomenon appears, scale is 4); (l) right bearing void (void phenomenon, scale of 4); (m,n): bearing aging deterioration, cracking (surface dirt cracks, and the crack length is greater than the corresponding side length of 10%).

Appendix A.2.5. Diseases of Substructure

The pier bodies and caps on the left and right sides of the bridge are basically intact.
Figure A5. Disease observation of piers and caps: (a) pier body is basically intact; (b) caps are basically intact.

Appendix A.2.6. Diseases of Subsidiary Structural

Drainage system. The left and right discharge pipes and water diversion troughs of the bridge have defects. A small number of discharge pipes, diversion troughs, and drainage holes are blocked (≤5% of the total; scale 2).
Figure A6. Disease observation of drainpipe and diversion tank.
Riverbed and regulating structures. The bridge has no riverbed or regulating structures.
The wing wall and ear wall of the bridge have no obvious diseases. See Figure A7.
The bridge cone slope and slope protection have no obvious diseases.
Figure A7. Wing wall and ear wall are basically intact.
Figure A8. Cone slope and slope protection are basically intact.
Lighting, signs, and other ancillary facilities. There is no lighting system on the left and right sides of the bridge (a large number of lighting facilities are missing, the number of >20% is scale 4), and there are no obvious adverse diseases in other ancillary facilities such as bridge signs.
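The inspection findings in Appendix A.2 can be condensed into one scale per component before they are fed to an evaluation model. The sketch below is illustrative only: it transcribes the scales that are stated explicitly in the appendix, and it assumes (hypothetically) that each component takes the worst scale observed on it and that components reported as intact take scale 1. The aggregation rule used in this study may differ.

```python
# Scales transcribed from Appendix A.2 (only items with an explicit scale).
observed = {
    "expansion joint": [2, 2],
    "main girder": [2, 2, 2],
    "wet joint": [2, 2, 4, 2],
    "diaphragm": [2, 2, 2],
    "bearing": [4, 4],
    "drainage system": [2],
    "lighting": [4],
}

# Components reported without obvious defects (assumed to be scale 1 here).
intact = [
    "deck pavement",
    "guardrail",
    "bridge-road connection",
    "pier body and cap",
    "wing wall and ear wall",
    "cone slope and slope protection",
]


def component_scales(observed, intact):
    """Worst observed scale per component; intact components get scale 1."""
    scales = {name: 1 for name in intact}
    scales.update({name: max(values) for name, values in observed.items()})
    return scales


scales = component_scales(observed, intact)
top = max(scales.values())
worst = sorted(name for name, s in scales.items() if s == top)
print(worst)
```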

References

  1. Ministry of Transport of the People’s Republic of China. 2022 Statistical Report on the Development of the Transport Industry. China Transportation News, 16 June 2023; p. 002.
  2. Bao, Y.Q.; Li, H. Artificial Intelligence for Civil Engineering. China Civ. Eng. J. 2020, 53, 25–31.
  3. Jing, Q.; Zheng, S.C.; Liang, P.; Wang, J.F. Technologies and Engineering Practices of Intelligent Operation and Maintenance of Hong Kong-Zhuhai-Macao Bridge. China J. Highw. Transp. 2023, 36, 143–156.
  4. Zhao, R.; Tian, Z.; Song, Y. Research on safety risk intelligent assessment model of in-service small and medium-span bridges based on BWM+BP neural network. World Bridge 2025, 5, 97–104. Available online: http://kns.cnki.net/kcms/detail/42.1681.U.20250609.0916.022.html (accessed on 30 September 2025).
  5. JTG/T H21-2011; Technical Condition Assessment Standard for Highway Bridges. Ministry of Transport of the People’s Republic of China: Beijing, China, 2011.
  6. Zhang, J.Q.; Liu, Y.; Shen, Q.; Bi, S.S.; Wen, J.J. Analysis on Bridge Technical Condition Evaluation Methods of USA, Japan and UK. J. Highw. Transp. Res. Dev. 2022, 39, 75–83.
  7. Zong, J.; Wu, Z.; Wang, J.; Liang, S.; Zhang, Y.; Wang, S. Research on culvert joint types and joint treatment measures. Sci. Technol. Innov. Appl. 2025, 15, 133–137+141.
  8. Liu, J.; Ding, J.; Song, R.R.; He, S.L. Performance Assessment for the Existing Bridge Based on Design Performance Status. J. Shihezi Univ. (Nat. Sci. Ed.) 2021, 39, 307–312.
  9. Liu, H.G.; Feng, L. Research on Bridge Technical Condition Evaluation Based on Time Series Data and Grey Theory. Technol. Highw. Transp. 2021, 37, 108–114.
  10. Lu, X.; Wei, K.; Tang, X.; Hu, Z. Functional recovery model and resilience assessment of bridge after earthquake considering component repair sequence. China J. Highw. Transp. 2025, 38, 49–60. Available online: http://kns.cnki.net/kcms/detail/61.1313.U.20250421.1532.002.html (accessed on 22 April 2025).
  11. Guo, X.; Xu, K.; Shi, X.; Sun, E. Safety evaluation of bridge state based on set-valued statistics-gray fuzzy theory. Ind. Saf. Environ. Prot. 2020, 46, 31–35.
  12. Potenza, F.; Rinaldi, C.; Ottaviano, E.; Gattulli, V. A robotics and computer-aided procedure for defect evaluation in bridge inspection. J. Civ. Struct. Health Monit. 2020, 10, 471–484.
  13. Garbowski, T.; Cornaggia, A.; Zaborowicz, M.; Sowa, S. Computer-Aided Structural Diagnosis of Bridges Using Combinations of Static and Dynamic Tests: A Preliminary Investigation. Materials 2023, 16, 7512.
  14. Di Prisco, M.; Scola, M.; Zani, G. On site assessment of Azzone Visconti bridge in Lecco: Limits and reliability of current techniques. Constr. Build. Mater. 2019, 209, 269–282.
  15. Zhang, Y.; Huang, Y.Y.; Ren, C.F.; He, T.; Ma, X.; Liang, P. Multi-stage degradation model of bridge technical condition based on inspection and evaluation big data. Highway 2018, 63, 87–91.
  16. Qiao, P.; Liang, Z.Q.; Xu, K.; Zhong, C.; Qin, F. Evaluation of Technical Condition of Medium and Small-Span Bridge Based on Machine Learning. J. Chang’an Univ. (Nat. Sci. Ed.) 2021, 41, 39–52.
  17. Wang, L.; Chen, R.; Dai, L.; Tu, Y. Multi-objective Maintenance Optimization Strategy for Bridge Networks Oriented to Low-carbon. China J. Highw. Transp. 2024, 37, 188–200.
  18. Gao, Z.; Gao, L. Bridge resistance degradation model and reliability prediction based on inverse Gaussian process and Bayesian update. Hunan Transp. Sci. Technol. 2025, 51, 183–187. Available online: http://kns.cnki.net/kcms/detail/43.1193.U.20250617.1120.002.html (accessed on 30 September 2025).
  19. Zhong, Z.; Tang, C. A Framework for Intelligent Management and Maintenance of Bridges Based on Big Data. J. Civ. Eng. 2025, 58, 69–79.
  20. Crognale, M.; De Iuliis, M.; Rinaldi, C.; Gattulli, V. Damage detection with image processing: A comparative study. Earthq. Eng. Eng. Vib. 2023, 22, 333–345.
  21. Zhu, Z.; Ye, K.; Yu, X.; Lin, Z.; Xu, G.; Guo, Z.; Lu, S.; Nie, B.; Chen, H. State-Based Technical Condition Assessment and Prediction of Concrete Box Girder Bridges. Buildings 2024, 14, 543.
  22. Haq, M.A. CNN Based Automated Weed Detection System Using UAV Imagery. Comput. Syst. Sci. Eng. 2022, 42, 837–849.
  23. Bhatt, D.; Patel, C.; Talsania, H.; Patel, J.; Vaghela, R.; Pandya, S.; Modi, K.; Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope. Electronics 2021, 10, 2470.
  24. Raza, A.; Uddin, J.; Almuhaimeed, A.; Akbar, S.; Zou, Q.; Ahmad, A. Aips-sntcn: Predicting anti-inflammatory peptides using fasttext and transformer encoder-based hybrid word embedding with self-normalized temporal convolutional networks. J. Chem. Inf. Model. 2023, 63, 6537–6554.
  25. Hashmi, E.; Yayilgan, S.Y.; Yamin, M.M.; Ali, S.; Abomhara, M. Advancing Fake News Detection: Hybrid Deep Learning with FastText and Explainable AI. IEEE Access 2024, 12, 44462–44480.
  26. Choi, J.K.; Lee, S.W. Improving FastText with Inverse Document Frequency of Subwords. Pattern Recognit. Lett. 2020, 133, 165–172.
  27. Kralicek, J.; Matas, J. Fast Text vs. Non-Text Classification of Images. In Computer Vision–ACCV 2020 Workshops; Springer: Cham, Switzerland, 2021; pp. 20–35.
  28. Noor, N.S.; Hammood, D.A.; Al-Naji, A.; Chahl, J.A. A Fast Text-to-Image Encryption-Decryption Algorithm for Secure Network Communication. Computers 2022, 11, 39.
  29. Cheng, Y.; Xia, Q.Y. An analysis of the general technical condition of bridges during their service life Rating. Highway 2010, 220–223.
  30. dos Santos, A.P.; da Silva Junior, A.X.; Nery, L.M.; Gomes, G.; Toniolo, B.P.; da-Cunha-e-Silva, D.C.; Lourenço, R.W. Random Forest Algorithm Applied to Model Soil Textural Classification in a River Basin. Environ. Monit. Assess. 2025, 197, 13786.
  31. Xu, Q.; Yang, F.; Hu, S.; He, X.; Hong, Y. Tree Height-Diameter Model of Natural Coniferous and Broad-Leaved Mixed Forests Based on Random Forest Method and Nonlinear Mixed-Effects Method in Jilin Province. Forests 2024, 15, 1922.
  32. Huang, R.; Ding, J.; Ren, Z.; Li, Y.; Lai, X.; Chen, J.; You, H. Research on the Segmentation of Individual Trees and the Extraction of Structural Parameters in Eucalyptus Plantations Based on a TEMA Mask R-CNN Model. Remote Sens. 2025, in press.
  33. Jing, H.; Tian, S.; Tian, G.; Yan, Y.; Wei, X. Early Prediction of Battery Lifetime for Lithium-Ion Batteries Based on a Hybrid Clustered CNN Model. Energy 2025, 319, 134992.
  34. Mestry, S.; Singh, H.; Chauhan, R.; Bisht, V.; Tiwari, K. Automation in Social Networking Comments with the Help of Robust fastText and CNN. In Proceedings of the 2020 1st International Conference on Innovations in Information and Communication Technology (ICIICT), Chennai, India, 25–26 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–4.
  35. Wang, J.; Wu, X.; Chen, S.; Qin, H.; Zhang, Y. Technical condition assessment of concrete girder bridges based on AHP-TFN. J. Chang. Polytech. 2023, 40, 9–13+68.
  36. Khasanah, I.N. Sentiment Classification Using fastText Embedding and Deep Learning Model. Procedia Comput. Sci. 2021, 189, 343–350.
  37. Haffar, N.; Zrigui, M. A Synergistic Bidirectional LSTM and N-gram Multi-channel CNN Approach Based on BERT and FastText for Arabic Event Identification. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2023, 22, 27.
Figure 1. Article structure.
Figure 2. Huffman coding process.
Figure 3. Schematic diagram of model architecture.
Figure 4. Schematic structure of the technical condition of the bridge.
Figure 5. CNN model architecture diagram.
Figure 6. Schematic representation of the accuracy of convolutional neural network.
Figure 7. Schematic diagram of model accuracy.
Figure 8. Evaluation method stability spatial distribution.
Table 1. Evaluation scale of technical condition of bridge secondary components.
| Technical Condition Assessment Scale | Description of Bridge Technical Conditions |
| --- | --- |
| 1 | New condition or good function; materials have mild defects, contamination, etc. |
| 2 | Medium defect or contamination. |
| 3 | The material has serious defects and the function is reduced. Further deterioration will harm the main components and affect normal traffic. |
| 4 | The material has serious defects and the component has lost its intended function, seriously affecting normal traffic; or the component was not originally installed and the survey shows that it needs to be added. |
Table 2. Main parameters of the Random Forest algorithm model.
| Parameter | Optimized Scope |
| --- | --- |
| Number of decision trees | (10, 3000) |
| Maximum depth of decision tree | (10, 100) |
| The minimum number of samples required for internal node splitting | (1, 100) |
| The minimum number of samples required for leaf nodes | (1, 100) |
| Candidate ways of feature selection | [‘sqrt’, ‘log2’] |
| Whether to enable bootstrap sampling | [True, False] |
| The criterion for splitting features | [‘gini’, ‘entropy’] |
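The search space in Table 2 can be written down as a small configuration and sampled at random. The sketch below reads the integer ranges as 10–3000 trees, depth 10–100, and 1–100 samples, which is an interpretation of the printed values, and the parameter names are an assumed mapping onto scikit-learn's `RandomForestClassifier` arguments; the tuning procedure actually used in this study is not specified here.

```python
import random

# Ranges interpreted from Table 2; scikit-learn style names are an assumption.
SEARCH_SPACE = {
    "n_estimators": (10, 3000),
    "max_depth": (10, 100),
    "min_samples_split": (1, 100),
    "min_samples_leaf": (1, 100),
    "max_features": ["sqrt", "log2"],
    "bootstrap": [True, False],
    "criterion": ["gini", "entropy"],
}


def sample_config(space, rng):
    """Draw one candidate configuration from the search space."""
    config = {}
    for name, spec in space.items():
        if isinstance(spec, tuple):
            low, high = spec
            config[name] = rng.randint(low, high)  # inclusive integer range
        else:
            config[name] = rng.choice(spec)
    # scikit-learn requires an integer min_samples_split of at least 2.
    config["min_samples_split"] = max(2, config["min_samples_split"])
    return config


rng = random.Random(0)
candidates = [sample_config(SEARCH_SPACE, rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Each sampled dictionary could be passed to a classifier as keyword arguments, for example `RandomForestClassifier(**candidate)`, and scored by cross-validation.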
Table 3. Training results and comparison of simulated data.
| Evaluation Index | Random Forest | CNN (Three Layers) | FastText+CNN |
| --- | --- | --- | --- |
| Training set accuracy | 89.2% | 95.1% | 97.8% |
| Test set accuracy | 76.5% | 88.7% | 92.4% |
| Cross-entropy loss (training set, $\Delta_1$) | 0.63 | 0.32 | 0.18 |
| Cross-entropy loss (test set, $\Delta_2$) | 1.07 | 0.55 | 0.36 |
| Over-fitting degree ($\Delta_{loss} = \Delta_2 - \Delta_1$) | +0.44 | +0.23 | +0.18 |
| Average recognition rate for Class 3–5 bridges | 0.53 | 0.74 | 0.85 |
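As a quick arithmetic check on Table 3, the tabulated over-fitting values equal the test-set loss minus the training-set loss, and the test-accuracy margins of FastText+CNN over the two baselines follow directly from the accuracy row. The sketch below only re-derives these figures from the values in the table; the dictionary keys are names chosen here for illustration.

```python
# Values transcribed from Table 3.
table3 = {
    "Random Forest": {"train_loss": 0.63, "test_loss": 1.07, "test_acc": 76.5},
    "CNN (three layers)": {"train_loss": 0.32, "test_loss": 0.55, "test_acc": 88.7},
    "FastText+CNN": {"train_loss": 0.18, "test_loss": 0.36, "test_acc": 92.4},
}


def loss_gap(row):
    """Over-fitting degree: test-set loss minus training-set loss."""
    return round(row["test_loss"] - row["train_loss"], 2)


gaps = {model: loss_gap(row) for model, row in table3.items()}

best = table3["FastText+CNN"]["test_acc"]
margins = {
    model: round(best - row["test_acc"], 1)
    for model, row in table3.items()
    if model != "FastText+CNN"
}

print(gaps)
print(margins)
```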
Table 4. Model comparison results.
ModelTest Set Accuracy/%Training Time/minPeak Memory
FastText+CNN92.31345 GB
Random Forest85.11432 GB
CNN (three layers)89.21218 GB
Table 5. Basic diseases of Yanshigou Bridge (more details can be found in Appendix A).
| Part | Component | Observed condition |
| --- | --- | --- |
| Floor system | A Bridge deck pavement | No obvious adverse disease. |
| Floor system | B Road and bridge connection part | No obvious adverse disease. |
| Floor system | C Guardrails and sidewalks | No obvious adverse disease. |
| Floor system | D Expansion device | Expansion joints have problems such as anchorage zone defects, transverse cracks, and miscellaneous soil filling. Photos: cracked and spalled concrete area in the anchorage zone is less than 10%; miscellaneous soil filling and cracks in the expansion joint cause unevenness, with a height difference of less than 1 cm. |
| Upper structure | A Main beam | Longitudinal cracks, peeling-off angles, and empty holes. Photos: longitudinal crack width < 5 mm; flaking-off angle, area < 0.5 m²; holes. |
| Upper structure | B Bridge deck | Honeycomb surface, peeling-off angle, water seepage, and transverse cracks in wet joints. |
| Upper structure | C Diaphragm | Peeling-off corners, empty holes, and mesh cracks. |
| Upper structure | D Bearing | There is a void phenomenon, and the right bearing has problems such as aging, deterioration, and cracking. |
| Substructure | A Pier or abutment | Pier and cap are basically intact. |
| Substructure | B Pier foundation | No obvious disease. |
| Subsidiary structure | A Drainage system (water pipe, water tank) | Flawed. |
| Subsidiary structure | B Wing wall, ear wall | No obvious disease. |
| Subsidiary structure | C Cone slope, slope protection | No obvious disease. |
| Subsidiary structure | D Other auxiliary facilities | No obvious adverse disease. |
Table 6. Comparison of model advantages and disadvantages.
| Model | Advantages | Disadvantages |
| --- | --- | --- |
| Random Forest | (1) No feature standardization required; (2) strong anti-overfitting ability; (3) high interpretability (feature importance ranking). | (1) It is difficult to capture the nonlinear relationship between features; (2) inefficient on high-dimensional sparse data; (3) cross-entropy loss cannot be directly optimized. |
| CNN (three layers) | (1) Automatic extraction of local features; (2) end-to-end training support; (3) gradient stability when optimizing cross-entropy loss. | (1) Input needs fixed dimensions; (2) hyper-parameter sensitivity (convolution kernel size, number of layers); (3) limited ability to model feature combinations. |
| FastText+CNN | (1) Explicit modeling of feature combinations (N-Gram embedding); (2) joint optimization of cross-entropy loss to improve minority class recognition; (3) adaptation to high-dimensional sparse input. | (1) High computational complexity (large number of embedding matrix parameters); (2) embedding layer needs to be pre-trained; (3) large computing resource consumption. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, S.; Deng, Z.; Wang, J.; Wu, X.; Feng, Q. Research on Technical Condition of Concrete Bridges Based on FastText+CNN. Appl. Sci. 2025, 15, 12386. https://doi.org/10.3390/app152312386
