Structural Component Identification and Damage Localization of Civil Infrastructure Using Semantic Segmentation
Abstract
1. Introduction
2. The Proposed Methodology for Semantic Segmentation of Civil Infrastructure Images
2.1. Overview of the Proposed Methodology for the Training Process
2.2. Data Set Description
2.3. Training Protocol and Learning Rate Schedule
- ModelCheckpoint: During training, the model was monitored using the validation loss. The best-performing model (with the lowest validation loss) was saved to disk. This ensured that the final model used for evaluation did not suffer from overfitting or suboptimal convergence due to later epochs.
- ReduceLROnPlateau: To dynamically adapt the learning rate during training, we used a reduction-on-plateau strategy. If the validation loss did not improve for 5 consecutive epochs, the learning rate was reduced by a factor of 0.5. This allowed the optimizer to take smaller steps during later training stages, which helped stabilize convergence and fine-tune the weights.
- EarlyStopping: Training was halted if the validation loss did not improve for 10 consecutive epochs. The weights from the best epoch (according to validation loss) were automatically restored, ensuring that the final model did not overfit. A minimal configuration of these three callbacks is sketched below.
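The three callbacks described above map directly onto the standard Keras callback classes. The following sketch shows one possible configuration, with the monitored quantity, patience values, and reduction factor taken from this section; the checkpoint path is a placeholder, and the list would simply be passed to model.fit(..., callbacks=callbacks) in the training script.

```python
from tensorflow.keras.callbacks import (
    ModelCheckpoint, ReduceLROnPlateau, EarlyStopping)

callbacks = [
    # Keep only the model with the lowest validation loss seen so far.
    ModelCheckpoint("best_model.h5", monitor="val_loss",
                    save_best_only=True, verbose=1),
    # Halve the learning rate after 5 epochs without validation-loss improvement.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5, verbose=1),
    # Stop after 10 stagnant epochs and roll back to the best weights.
    EarlyStopping(monitor="val_loss", patience=10,
                  restore_best_weights=True, verbose=1),
]
# Usage: model.fit(train_ds, validation_data=val_ds, callbacks=callbacks, ...)
```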
2.4. Image Augmentation
- (a) Original image. In formal mathematical language, the usage of the original image is equivalent to an identity mapping between the input and output images: $I_{\mathrm{aug}}(x, y) = I(x, y)$.
- (b) Brightness Adjustment. A brightness shift adds a scalar bias $\beta$ to each pixel: $I_{\mathrm{aug}}(x, y) = \mathrm{clip}\big(I(x, y) + \beta,\, 0,\, 255\big)$. Here, $\mathrm{clip}(\cdot)$ limits the value to the range $[0, 255]$.
- (c) Contrast Adjustment. Contrast is adjusted by scaling the deviation from the mean intensity $\bar{I}$: $I_{\mathrm{aug}}(x, y) = \mathrm{clip}\big(\alpha\,(I(x, y) - \bar{I}) + \bar{I},\, 0,\, 255\big)$.
- (d) Gamma Correction. Gamma correction applies a nonlinear transformation: $I_{\mathrm{aug}}(x, y) = 255\,\big(I(x, y)/255\big)^{\gamma}$. The parameter $\gamma$ controls the shape of the correction curve: for $\gamma < 1$ the image appears brighter, while for $\gamma > 1$ the image appears darker.
- (e) Noise injection. Noise injection introduces random perturbations, making the model more resilient to sensor noise, especially in shadows. Gaussian (normal) noise was used for each image in the input batch: $I_{\mathrm{aug}}(x, y) = \mathrm{clip}\big(I(x, y) + n(x, y),\, 0,\, 255\big)$, where the augmentation function generates pixel-wise noise $n(x, y)$ drawn from a Gaussian distribution with zero mean and the standard deviation given in Table 1.
- (f) Flipping. Flipping generates different perspectives of the viaduct, which helps the model learn orientation-independent features. In our application, horizontal flipping was used.
- (g) Rotation. The rotation transformation can be described as a mapping of pixel coordinates by a rotation matrix: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$, where $\theta$ is the rotation angle.
- (h) CutMix technique. CutMix is a technique that replaces a region of an image with a region taken from another sample (see Figure 5h). It helps in learning more discriminative features by forcing the model to focus on multiple context regions. In the CutMix augmentation strategy, a rectangular region from one training image $I_s$ (with label mask $L_s$) is copied and pasted into a target image $I_t$. The pasted region is defined by its top-left corner coordinates $(x_0, y_0)$ and size $(w, h)$. The target image $I_t$ and the corresponding label $L_t$ are modified as follows: $I_t(x, y) \leftarrow I_s(x, y)$ and $L_t(x, y) \leftarrow L_s(x, y)$ for $x_0 \le x < x_0 + w$, $y_0 \le y < y_0 + h$.
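As an illustration, the pixel-level transforms (b)–(e) can be written in a few lines of NumPy. The functions below are a minimal sketch assuming 8-bit RGB images stored as uint8 arrays; they implement the formulas above but are not the authors' augmentation code, and the parameter ranges used in this work are those referenced in Table 1.

```python
import numpy as np

rng = np.random.default_rng()

def brightness(img, beta):
    """Brightness shift: add a scalar bias and clip to [0, 255]."""
    return np.clip(img.astype(np.float32) + beta, 0, 255).astype(np.uint8)

def contrast(img, alpha):
    """Contrast: scale the deviation from the mean intensity by alpha."""
    mean = img.mean()
    return np.clip(alpha * (img.astype(np.float32) - mean) + mean,
                   0, 255).astype(np.uint8)

def gamma_correction(img, gamma):
    """Gamma correction: 255 * (I / 255) ** gamma."""
    return (255.0 * (img.astype(np.float32) / 255.0) ** gamma).astype(np.uint8)

def gaussian_noise(img, sigma):
    """Noise injection: add zero-mean Gaussian noise with standard deviation sigma."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```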
| Augmentation | Parameters | Probability [%] |
|---|---|---|
| Brightness | bias $\beta$ | 40 |
| Contrast | scale $\alpha$ | 40 |
| Gamma | exponent $\gamma$ | 40 |
| Noise | standard deviation $\sigma$ | 40 |
| Rotation | angle $\theta$ | 60 |
| Flip | N/A | 50 |
| CutMix | position $(x_0, y_0)$, size $(w, h)$ | 40 |
- Brightness, Contrast, Gamma, and Noise (40%): These augmentations simulate varying image acquisition conditions, such as lighting inconsistencies and sensor noise, which are common in UAV-based inspections. A probability of 40% offers a balanced trade-off—frequent enough to improve model generalization but not so dominant as to degrade data fidelity. This rate is consistent with augmentation strategies adopted in related deep learning studies on civil infrastructure monitoring [14,15].
- Flip (50%): Horizontal flipping is applied with a probability of 50%, a standard setting in many vision-based learning pipelines. This is particularly relevant for viaduct imagery, which often exhibits axial symmetry. A 50% rate introduces orientation variation while preserving the structural coherence of the scene.
- Rotation (60%): A slightly higher probability was chosen for rotation to reflect the diversity of camera angles typically encountered in UAV inspections. The selected rotation range corresponds to realistic off-axis views without introducing distortions. A 60% probability ensures adequate rotational diversity, which was empirically shown to enhance performance in both segmentation tasks.
- CutMix (40%): CutMix introduces strong contextual perturbations by blending regions from different images. While this encourages the model to learn more robust features, excessive use can reduce semantic coherence, particularly in structured scenes like viaducts. Hence, a conservative value of 40% was adopted.
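One possible way to combine the transforms with the probabilities in Table 1 is to draw an independent Bernoulli decision per augmentation and per training sample, as sketched below. The sketch reuses the hypothetical helper functions from the previous listing; the numeric parameter ranges are illustrative placeholders rather than the values used in this study, and the geometric transforms follow the generic formulations given earlier.

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng()

def cutmix(img_t, lbl_t, img_s, lbl_s, x0, y0, w, h):
    """Paste a w-by-h region of a source image/label pair into the target."""
    img_t, lbl_t = img_t.copy(), lbl_t.copy()
    img_t[y0:y0 + h, x0:x0 + w] = img_s[y0:y0 + h, x0:x0 + w]
    lbl_t[y0:y0 + h, x0:x0 + w] = lbl_s[y0:y0 + h, x0:x0 + w]
    return img_t, lbl_t

def augment(img, lbl, pool):
    """Apply each augmentation with the probability listed in Table 1.

    `pool` holds (image, label) pairs used as CutMix sources.
    Numeric parameter ranges below are illustrative placeholders.
    """
    if rng.random() < 0.40:
        img = brightness(img, rng.uniform(-30, 30))
    if rng.random() < 0.40:
        img = contrast(img, rng.uniform(0.7, 1.3))
    if rng.random() < 0.40:
        img = gamma_correction(img, rng.uniform(0.7, 1.5))
    if rng.random() < 0.40:
        img = gaussian_noise(img, rng.uniform(1.0, 10.0))
    if rng.random() < 0.50:                           # horizontal flip
        img, lbl = img[:, ::-1], lbl[:, ::-1]
    if rng.random() < 0.60:                           # rotation
        angle = rng.uniform(-15.0, 15.0)
        img = rotate(img, angle, reshape=False, order=1)
        lbl = rotate(lbl, angle, reshape=False, order=0)  # nearest for labels
    if rng.random() < 0.40:                           # CutMix
        src_img, src_lbl = pool[rng.integers(len(pool))]
        h, w = img.shape[0] // 2, img.shape[1] // 2
        y0 = rng.integers(0, img.shape[0] - h)
        x0 = rng.integers(0, img.shape[1] - w)
        img, lbl = cutmix(img, lbl, src_img, src_lbl, x0, y0, w, h)
    return img, lbl
```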
2.5. Architectures of Neural Networks for the Semantic Segmentation Problem
2.5.1. U-Net Architecture with Optional Attention and Customizations
- Input and output. The network accepts input images of arbitrary spatial dimensions, specified by the input_shape parameter, and produces a segmentation mask with either a single channel (sigmoid activation for binary segmentation) or multiple channels (softmax for multi-class segmentation), controlled by the num_classes and output_activation arguments.
- Encoder (Contracting Path). The encoder consists of a configurable number of levels (num_layers). At each level:
  - A double convolution block is applied.
  - Each convolution block may include optional batch normalization, spatial or standard dropout, and ReLU activation (or other activation functions).
  - After the convolution block, max pooling is applied to downsample the feature maps by a factor of 2.
  - The number of filters starts at a base value (filters, typically 16) and doubles at each subsequent level.
  - The dropout rate can increase with each layer to gradually regularize deeper layers.
- Bottleneck. After the encoder, a central convolution block (the bottleneck) captures high-level features. This block uses the highest number of filters and the final dropout level before upsampling begins.
- Decoder (Expanding Path). The decoder path reconstructs the segmentation mask through upsampling. Each upsampling step uses either transposed convolution (Conv2DTranspose) or nearest-neighbor upsampling followed by convolution, as selected by upsample_mode (deconv or simple). The feature maps from the corresponding encoder level are concatenated via skip connections to retain spatial details. Optionally, attention gates can be applied to modulate skip connections. These gates compute attention coefficients based on both the encoder and decoder features, enhancing relevant spatial regions and suppressing irrelevant ones. A convolution block follows each concatenation to refine the fused features.
- Output layer. A final 1 × 1 convolution reduces the number of output channels to num_classes, and the activation function (sigmoid or softmax) produces the pixel-wise class probabilities.
- Attention Mechanism. The optional attention gate mechanism follows the additive attention formulation. It uses 1 × 1 convolutions on both the decoder input and the encoder skip connection to compute intermediate features, which are added and passed through ReLU and sigmoid activations to produce an attention mask. This mask is then applied multiplicatively to the skip connection before concatenation, effectively guiding the model to focus on relevant spatial regions.
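The additive attention gate described above admits a compact Keras formulation. The sketch below is a generic version of this mechanism (1 × 1 projections, additive combination, ReLU, sigmoid mask, multiplicative gating); the intermediate filter count and exact layer arrangement are assumptions and may differ from the authors' implementation.

```python
from tensorflow.keras import layers

def attention_gate(skip, gating, inter_filters):
    """Additive attention gate applied to an encoder skip connection.

    skip   : encoder feature map (high resolution)
    gating : decoder feature map at the same spatial resolution
    """
    # 1x1 convolutions project both inputs to a common intermediate space.
    theta = layers.Conv2D(inter_filters, 1, padding="same")(skip)
    phi = layers.Conv2D(inter_filters, 1, padding="same")(gating)
    # Additive combination followed by ReLU and a sigmoid attention mask.
    act = layers.Activation("relu")(layers.Add()([theta, phi]))
    mask = layers.Conv2D(1, 1, padding="same", activation="sigmoid")(act)
    # The mask rescales the skip connection before concatenation in the decoder.
    return layers.Multiply()([skip, mask])
```

In a U-Net decoder, the gated feature map returned by this function would simply replace the raw encoder feature map in the concatenation step.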
2.5.2. DeepLabV3+ Network Architecture
- Backbone Feature Extractor. The encoder utilizes a ResNet101V2 backbone pretrained on ImageNet and excludes the top classification layers. Intermediate feature maps are extracted from:
  - conv4_block6_2_relu (deep feature map) for the context module;
  - conv2_block3_2_relu (early feature map) for spatial detail recovery in the decoder.
- Atrous Spatial Pyramid Pooling (ASPP). We adopt a modified ASPP block that applies convolutions with different dilation rates to capture multi-scale contextual information. Specifically, this module includes global average pooling followed by a convolution and upsampling; parallel atrous convolutions with dilation rates of 4, 6, 12, and 18; concatenation of all branches; and a final convolution to aggregate features. An extended variant, DilatedSpatialPyramidPoolingD4, is also defined and tested; it supports finer granularity via additional dilation rates (e.g., 4, 8, …, 24) but is not used in the main model function because the largest dilation rate (24) was observed to have only a minor impact.
- Decoder and Upsampling. To refine segmentation boundaries, the decoder combines the ASPP output with high-resolution features from the early encoder stage. The decoder consists of a transposed convolution applied to the ASPP output (upsampling by a factor of 2), a convolution applied to the early feature map for dimension alignment, the concatenation of both feature streams, two convolutional refinement blocks, and further upsampling using transposed convolutions (with strides of 4 and 2) interleaved with ReLU-activated convolutions. The final prediction layer is a convolution with num_classes output channels and the specified activation function (softmax or other).
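For illustration, the ASPP block described above can be sketched as a set of parallel atrous branches plus an image-level pooling branch, concatenated and fused by a final convolution. The listing uses the dilation rates named in the text (4, 6, 12, 18); the filter count, kernel sizes, activation placement, and the assumption of a fixed input spatial size are placeholders rather than the exact module used in this work.

```python
from tensorflow.keras import layers

def aspp(x, filters=256, rates=(4, 6, 12, 18)):
    """Simplified Atrous Spatial Pyramid Pooling block (fixed input size assumed)."""
    h, w = x.shape[1], x.shape[2]

    # Image-level branch: global average pooling, convolution, bilinear upsampling.
    pooled = layers.GlobalAveragePooling2D(keepdims=True)(x)
    pooled = layers.Conv2D(filters, 1, use_bias=False, activation="relu")(pooled)
    pooled = layers.UpSampling2D(size=(h, w), interpolation="bilinear")(pooled)

    # Parallel atrous branches with the dilation rates given in the text.
    branches = [pooled]
    for r in rates:
        b = layers.Conv2D(filters, 3, padding="same", dilation_rate=r,
                          use_bias=False, activation="relu")(x)
        branches.append(b)

    # Concatenate all branches and aggregate with a final convolution.
    y = layers.Concatenate()(branches)
    return layers.Conv2D(filters, 1, use_bias=False, activation="relu")(y)
```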
3. Loss Function and Evaluation Metrics
- The input to the models is shown as 320 × 160 × 3 data. The last dimension in the input represents the three color channels: red (R), green (G), and blue (B).
- The output of the models for this task is represented as 320 × 160 × 4 with four classes.
- Figure 9 also indicates that the segmentation process distinguishes between different structural components and the background, assigning pixels to specific classes. For this task, the following four classes are detected: Background, Slab, Beam, and Column.
- The input to the models is also shown as 320 × 160 × 3 data. This input again corresponds to images with three color channels: red (R), green (G), and blue (B).
- For this specific task, the output of the models is represented as 320 × 160 × 3 with three classes.
- The figure indicates that the segmentation aims to identify and differentiate types of damage and the background. For the damage detection task, the following three classes are identified: Background, Concrete spalling, and Reinforcement exposure.
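The per-class intersection over union (IoU) and its mean over classes (mIoU), reported in the following sections, can be computed from integer-encoded prediction and ground-truth masks as intersection over union = TP / (TP + FP + FN) per class. The NumPy sketch below is a generic metric implementation, not the authors' evaluation code.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """IoU per class from integer-encoded prediction and ground-truth masks."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (target == c))
        fp = np.sum((pred == c) & (target != c))
        fn = np.sum((pred != c) & (target == c))
        union = tp + fp + fn
        ious.append(tp / union if union > 0 else np.nan)  # NaN if class absent
    return np.array(ious)

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in prediction or ground truth."""
    return np.nanmean(per_class_iou(pred, target, num_classes))
```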
4. Comparison of the Results Obtained Using Both Architectures
4.1. Results for Task 1: Structural Component Identification
- Not Pretrained U-Net Model: For the U-Net model without pretrained weights, the mIoU values for structural segmentation also fell within the range of 50% to 95%, using augmentations including Original, Brightness, Contrast, Gamma, Noise, Flip, Rotation, and All (Figure 12).
- Not Pretrained DeepLabV3+ Model: The version of DeepLabV3+ trained without initial weights from a large external dataset showed mIoU values in the range of 50% to 95% across the same set of augmentations (None, Brightness, Contrast, Gamma, Noise, Flip, Rotation, All) (Figure 13).
- Pretrained DeepLabV3+ Model: When subjected to various augmentations (None, Brightness, Contrast, Gamma, Noise, Flip, Rotation, All), this model achieved mIoU values ranging from 50% to 100% (Figure 14).
Differences Between Pretrained and Not Pretrained Models
- Structural parts identification: The pretrained DeepLabV3+ model demonstrated the potential to reach higher maximum mIoU values (up to 100%) (Figure 14) compared to the not pretrained DeepLabV3+ model (up to 95%) (Figure 13). This suggests that leveraging knowledge from a prior, potentially larger, dataset through pretraining can enhance the model’s capability to accurately segment structural elements, especially in achieving peak performance.
- Concrete damage detection: Both the pretrained DeepLabV3+ model (range 0–35%) (Figure 15) and the not pretrained DeepLabV3+ model (range 0–35%) (Figure 16) exhibited a similar range of mIoU values. The relatively low mIoU values observed for damage detection across all models (maximum 35% for DeepLabV3+ (Figure 15 and Figure 16) and 7% for U-Net (Figure 17)) indicate that identifying damage is a considerably more challenging task than segmenting the main structural components. In this specific task, pretraining the DeepLabV3+ model did not appear to offer a distinct advantage over training from scratch, based on the range of results presented.
4.2. Results for Task 2: Structural Damage Detection
- Pretrained DeepLabV3+ Model: When trained with standard categorical cross-entropy loss, the pretrained DeepLabV3+ model produced low mIoU values for damage classes, ranging from 0% to 35%, depending on the augmentation used (see Figure 18). Performance on the reinforcement class was especially poor, with many predictions missing entirely.
- Trained-from-scratch DeepLabV3+ Model: Similarly, the DeepLabV3+ model without pretrained weights exhibited comparable mIoU values (0–35%), with minimal improvements under specific augmentations (Figure 16).
- Trained-from-scratch U-Net Model (Baseline): Initially, the U-Net model trained from scratch using categorical cross-entropy achieved the lowest mIoU values among all tested architectures—typically in the range of 0% to 7% across augmentations (Figure 17).
- Improved U-Net Model with Weighted Focal Tversky Loss function: After implementing a weighted focal Tversky loss function with recall-favoring hyperparameters ($\alpha$, $\beta$, $\gamma$) and class weights to handle severe class imbalance, the U-Net model achieved a substantial performance boost (Table 2 and Table 3). The best configuration reached an IoU of 48% for cracks and 44% for reinforcement, with corresponding F1 scores of 65% and 61%, respectively. This demonstrates the importance of loss adaptation for fine damage segmentation. Background segmentation accuracy remained high (IoU = 98%), confirming that foreground detection was improved without sacrificing overall stability.
- Improved DeepLabV3+ Model with Weighted Tversky Loss function: To evaluate the impact of loss function choice on damage segmentation performance, we compared the commonly used categorical cross-entropy (CCE) loss with a weighted focal Tversky loss formulation. The results of both configurations were computed on the same test set using the same U-Net architecture and are presented in Table 4. The results indicate that both loss functions yield similarly high performance for the background class (IoU ≈ 0.97–0.98). However, substantial differences were observed for the damage classes:
  - For cracks, the weighted Tversky loss improved the IoU from 0.20 to 0.42 and increased the F1 score from 0.39 to 0.59. This reflects a better balance between precision and recall, which is especially important for detecting small and thin regions.
  - For reinforcement, the Tversky-based configuration significantly outperformed CCE in all metrics, improving IoU from 0.12 to 0.38 and F1 from 0.29 to 0.55.
These improvements are attributed to the Tversky loss’s ability to penalize false negatives more heavily, which aligns with the safety-critical nature of structural damage detection, where missing a damaged area is more critical than false alarms. Furthermore, class imbalance was explicitly addressed through weighting, allowing the network to better learn underrepresented classes. The weighted Tversky loss demonstrates a clear advantage over categorical cross-entropy for fine-grained segmentation tasks involving small and imbalanced damage regions and was therefore adopted in our final model configuration. A visual comparison between U-Net results and DeepLabV3+ with the application of Tversky loss function can be found in Figure 19.
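For reference, a generic weighted focal Tversky loss of the kind discussed above can be written as follows. The per-class Tversky index is $TI_c = TP_c / (TP_c + \alpha\,FP_c + \beta\,FN_c)$ and the focal variant minimizes the class-weighted sum of $(1 - TI_c)^{\gamma}$; choosing $\beta > \alpha$ penalizes false negatives more heavily. The hyperparameter values and class weights in the sketch are placeholders, not the values used in this study.

```python
import tensorflow as tf

def weighted_focal_tversky_loss(class_weights, alpha=0.3, beta=0.7, gamma=0.75):
    """Weighted focal Tversky loss for one-hot encoded segmentation targets.

    alpha/beta weight false positives/false negatives, gamma is the focal
    exponent, class_weights balances rare classes. Values are placeholders.
    """
    w = tf.constant(class_weights, dtype=tf.float32)

    def loss(y_true, y_pred):
        eps = 1e-7
        # Sum over batch and spatial dimensions, keep the class dimension.
        axes = (0, 1, 2)
        tp = tf.reduce_sum(y_true * y_pred, axis=axes)
        fp = tf.reduce_sum((1.0 - y_true) * y_pred, axis=axes)
        fn = tf.reduce_sum(y_true * (1.0 - y_pred), axis=axes)
        tversky = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
        focal = tf.pow(1.0 - tversky, gamma)   # per-class focal term
        return tf.reduce_sum(w * focal) / tf.reduce_sum(w)

    return loss
```

The resulting closure can be passed directly to model.compile(loss=...), with larger weights assigned to the underrepresented damage classes.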
4.3. Train and Validation Evaluation for the U-Net Model
- Background class performance is highly stable for both loss functions (IoU consistently ≈ 98%), showing the model’s ability to accurately segment dominant classes without overfitting.
- Cracks: For categorical cross-entropy, there is a notable gap between train (39.4%) and test (32.1%) IoU, suggesting mild overfitting. In contrast, the Tversky loss achieves higher and more balanced performance across all splits, with test IoU even exceeding train IoU (47.9% vs. 35.9%), indicating robust generalization to unseen samples.
- Reinforcement: Under categorical cross-entropy, segmentation nearly collapses across all splits (test IoU: 11.1%, train: 8.6%). This class suffers from extreme imbalance and sparsity. The Tversky loss, in contrast, leads to a significant increase in reinforcement segmentation (IoU: 43.9% test, 48.2% val), highlighting the effectiveness of weighting and false-negative sensitivity in loss formulation.
4.4. Real-World Evaluation on Annotated Viaduct Images
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Hubel, D.H.; Wiesel, T.N. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 1968, 195, 215–243.
- Werbos, P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Applied Mathematics, Harvard University, Cambridge, MA, USA, 1974.
- Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202.
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
- Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. arXiv 2016, arXiv:1412.7062.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv 2017, arXiv:1606.00915.
- Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
- Ros, G.; Sellart, L.; Materzynska, J.; Vazquez, D.; Lopez, A.M. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3234–3243.
- Spencer, B.F.; Hoskere, V.; Narazaki, Y. Advances in Computer Vision-Based Civil Infrastructure Inspection and Monitoring. Engineering 2019, 5, 199–222.
- Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-Driven Structural Health Monitoring and Damage Detection through Deep Learning: State-of-the-Art Review. Sensors 2020, 20, 2778.
- Bao, Y.; Chen, Z.; Wei, S.; Xu, Y.; Tang, Z.; Li, H. The State of the Art of Data Science and Engineering in Structural Health Monitoring. Engineering 2019, 5, 234–242.
- Bianchi, E.; Hebdon, M. Visual structural inspection datasets. Autom. Constr. 2022, 139, 104299.
- Bhowmick, S.; Nagarajaiah, S.; Veeraraghavan, A. Vision and Deep Learning-Based Algorithms to Detect and Quantify Cracks on Concrete Surfaces from UAV Videos. Sensors 2020, 20, 6299.
- Cha, Y.J.; Ali, R.; Lewis, J.; Büyüköztürk, O. Deep learning-based structural health monitoring. Autom. Constr. 2024, 161, 105328.
- Gao, Y.; Yang, J.; Qian, H.; Mosalam, K.M. Multiattribute multitask transformer framework for vision-based structural health monitoring. Comput.-Aided Civ. Infrastruct. Eng. 2023, 38, 2358–2377.
- Azimi, M.; Yang, T.Y. Transformer-based framework for accurate segmentation of high-resolution images in structural health monitoring. Comput.-Aided Civ. Infrastruct. Eng. 2024, 39, 3670–3684.
- Shahin, M.; Chen, F.F.; Maghanaki, M.; Hosseinzadeh, A.; Zand, N.; Koodiani, H.K. Improving the Concrete Crack Detection Process via a Hybrid Visual Transformer Algorithm. Sensors 2024, 24, 3247.
- Shen, Y.; Yu, Z.; Li, C.; Zhao, C.; Sun, Z. Automated Detection for Concrete Surface Cracks Based on Deeplabv3+ BDF. Buildings 2023, 13, 118.
- Yuan, H.; Jin, T.; Ye, X. Modification and Evaluation of Attention-Based Deep Neural Network for Structural Crack Detection. Sensors 2023, 23, 6295.
- Hang, J.; Wu, Y.; Li, Y.; Lai, T.; Zhang, J.; Li, Y. A deep learning semantic segmentation network with attention mechanism for concrete crack detection. Struct. Health Monit. 2023, 22, 3006–3026.
- Zhou, Y.; Li, C.; Wang, S.; Peng, G.; Ma, S.; Yang, Z.; Feng, Y. Crack Detection of the Urban Underground Utility Tunnel Based on Residual Feature Pyramid Attention Network. KSCE J. Civ. Eng. 2024, 28, 2778–2787.
- Ali, L.; AlJassmi, H.; Swavaf, M.; Khan, W.; Alnajjar, F. Rs-net: Residual Sharp UNet architecture for pavement crack segmentation and severity assessment. J. Big Data 2024, 11, 116.
- Cheng, H.; Chai, W.; Hu, J.; Ruan, W.; Shi, M.; Kim, H.; Cao, Y.; Narazaki, Y. Random bridge generator as a platform for developing computer vision-based structural inspection algorithms. J. Infrastruct. Intell. Resil. 2024, 3, 100098.
- Parslov, J.; Gintare, K.; Sommer, L. CrashCar101: Procedural Generation for Damage Assessment. arXiv 2024, arXiv:2311.06536.
- Dondi, A.; Di Gangi, L.; Galati, N. Improving Post-Earthquake Crack Detection using Semi-Synthetic Images. arXiv 2024, arXiv:2412.05042.
- Nowacka, A.; Kamiński, M.; Koziarski, M. Segmentation of Cracks in 3D Images of Fiber Reinforced Concrete Using Deep Learning. arXiv 2025, arXiv:2501.18405.
- Jaziri, F.; Fathima, A.; von Viebahn, C.; Leonhardt, S. Designing a Hybrid Neural System to Learn Real-world Crack Segmentation from Fractal-based Simulation. arXiv 2023, arXiv:2309.09637.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. arXiv 2023, arXiv:2304.02643.
- Narazaki, Y.; Hoskere, V.; Yoshida, K.; Spencer, B.F.; Fujino, Y. Synthetic environments for vision-based structural condition assessment of Japanese high-speed railway viaducts. Mech. Syst. Signal Process. 2021, 160, 107850.
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer: Cham, Switzerland, 2018; pp. 833–851.
| Class | Class Acc. [%] | Precision [%] | Recall [%] | F1 Score [%] | IoU [%] |
|---|---|---|---|---|---|
| Background | 98 | 98 | 99 | 99 | 98 |
| Cracks | 98 | 53 | 45 | 61 | 32 |
| Reinforcement | 100 | 19 | 21 | 0 | 11 |
| Class | Class Acc. [%] | Precision [%] | Recall [%] | F1 Score [%] | IoU [%] |
|---|---|---|---|---|---|
| Background | 98 | 99 | 99 | 99 | 98 |
| Cracks | 99 | 61 | 69 | 65 | 48 |
| Reinforcement | 100 | 79 | 49 | 61 | 44 |
| Loss | Class | Class Acc. | Precision | Recall | F1 Score | IoU |
|---|---|---|---|---|---|---|
| CCE [%] | Background | 97 | 98 | 99 | 99 | 97 |
| | Cracks | 98 | 37 | 31 | 39 | 20 |
| | Reinforcement | 99 | 19 | 25 | 29 | 12 |
| Tversky [%] | Background | 98 | 99 | 99 | 99 | 98 |
| | Cracks | 98 | 57 | 61 | 59 | 42 |
| | Reinforcement | 100 | 67 | 47 | 55 | 38 |
| Loss | Split | Class | IoU | F1 Score | Precision | Recall |
|---|---|---|---|---|---|---|
| CCE | Train | Background | 98.11 | 99.28 | 98.68 | 99.41 |
| | | Cracks | 39.42 | 71.07 | 59.38 | 53.97 |
| | | Reinforcement | 8.55 | 0.00 | 12.85 | 20.32 |
| | Validation | Background | 98.04 | 99.28 | 98.66 | 99.36 |
| | | Cracks | 38.65 | 69.54 | 58.92 | 52.90 |
| | | Reinforcement | 9.86 | 0.00 | 15.41 | 21.50 |
| Tversky | Train | Background | 98.15 | 99.07 | 98.99 | 99.15 |
| | | Cracks | 35.87 | 53.02 | 45.69 | 62.55 |
| | | Reinforcement | 24.31 | 39.65 | 41.33 | 37.12 |
| | Validation | Background | 98.25 | 99.12 | 99.04 | 99.20 |
| | | Cracks | 49.16 | 65.93 | 59.16 | 74.41 |
| | | Reinforcement | 48.23 | 65.08 | 75.27 | 57.32 |
| Class | Class Acc. [%] | Precision [%] | Recall [%] | F1 Score [%] | IoU [%] |
|---|---|---|---|---|---|
| Non-structural | 50 | 40 | 52 | 45 | 29 |
| Slab | 51 | 39 | 26 | 31 | 19 |
| Beam | 83 | 8 | 14 | 10 | 5 |
| Column | 88 | 37 | 27 | 32 | 19 |