1. Introduction
Structural cracks are typically caused by the stress within a structure reaching its limit, resulting in insufficient load-bearing capacity. These cracks may be small initially, but over time, they can expand, compromising structural integrity. Ultimately, these cracks may lead to localized or overall instability, potentially even causing the collapse of the structure. By regularly monitoring crack development [1], potential structural issues can be promptly identified, preventing instability or collapse. Therefore, it is necessary to conduct further analysis of these cracks to ensure the safety and stability of the structural framework.
Traditional crack detection methods primarily rely on manual operations. Inspectors use crack width gauges, ultrasonic detection devices, physical rulers, and measuring tapes to directly measure the condition of cracks on the structure’s surface at the engineering site. However, manual measurement is costly and inefficient, and because it is subjective, the objectivity and accuracy of the results are difficult to guarantee. In recent years, with the rapid advancement of computer science and technology, machine vision has demonstrated tremendous potential and advantages for crack detection and analysis. This technology enables high-precision segmentation of structural images, accurately extracts crack morphological features, and provides a solid data foundation for quantitative analysis of cracks.
Existing image processing methods for structural crack detection generally fall into two categories: Digital Image Processing (DIP) [2] and deep-learning-based large models [3,4,5,6]. While DIP techniques (such as global/adaptive thresholding, edge detection (e.g., Sobel, Canny), morphological operations, region growing, and histogram equalization) are widely used, they face significant challenges in real-world engineering environments. Crack images often suffer from complex background textures, uneven illumination, dynamic grayscale variations, and low foreground–background contrast. Specifically, global thresholding is overly sensitive to illumination, failing when grayscale distributions overlap; edge detection is prone to noise-induced boundary discontinuities; and morphological operations rely on rigid, manually defined structuring elements that cannot adapt to irregular crack widths. Additionally, contrast enhancement techniques, such as histogram equalization, often degrade segmentation accuracy by amplifying background noise alongside crack features.
Deep-learning-based methods rely on large-scale annotated datasets to automatically learn hierarchical feature representations and demonstrate strong robustness to complex backgrounds and illumination variations, achieving significant success in crack monitoring. However, the performance of these methods heavily depends on the availability of large and accurately labeled datasets [7,8]. The collection and annotation of crack data are time-consuming and labor-intensive, and some types of crack samples are relatively scarce. In addition, significant variations in the morphology and characteristic parameters of the structural cracks further complicate model generalization.
Even with sufficient training data, deep-learning models often require complex network architectures, substantial computational resources, and long training times. In contrast, the method proposed in this paper does not rely on model training. Therefore, it offers advantages in terms of computational efficiency, interpretability, and ease of engineering deployment, making it more suitable for practical engineering scenarios, especially in resource-constrained environments.
After reviewing the current status of structural crack databases, it becomes evident that there is a significant shortage of database resources specifically targeting certain types of structures, such as tunnel cracks or geological borehole cracks. This situation hampers the effective development and application of deep-learning models in this field. Moreover, existing image enhancement and segmentation techniques generally lack effective adaptive parameter-tuning mechanisms when faced with diverse crack images, making it difficult to accurately adapt to their complex, varied characteristics. Therefore, this study proposes a novel crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement, and further extends it to dynamic monitoring of crack development. In summary, the contributions of this work can be outlined as follows:
(1) This paper proposes a CSA-BB module for crack segmentation based on bimodal image fusion and brightness piecewise linear enhancement. The algorithm enhances the detailed information of the crack region using a pseudo-color image. It adaptively selects appropriate enhancement strategies based on the image’s brightness level, thereby achieving effective crack segmentation.
(2) An adaptive brightness piecewise linear function is designed to address the significant grayscale variations caused by diverse illumination intensities and complex backgrounds. This module dynamically extracts the global brightness value of the input image. It adaptively selects the optimal adjustment parameters for the piecewise linear function, effectively highlighting the target crack regions against severe environmental interference.
(3) A complete automated monitoring framework is established by extracting three key physical parameters of the segmented cracks. This provides critical quantitative support for safety assessment and decision-making regarding reinforcement of engineering structures. Furthermore, extensive comparative and ablation experiments on diverse crack datasets comprehensively validate the superior segmentation performance of the proposed CSA-BB algorithm and the effectiveness of its core modules.
The remaining sections of this paper are organized as follows. Section 2 provides a brief overview of related work in crack analysis. Section 3 introduces the crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement, along with crack development monitoring. Section 4 analyzes the experimental results. Finally, Section 5 summarizes the work presented in this paper.
2. Related Work
Many scholars began researching structural cracks using image processing technology as early as the late 1980s. Early methods for crack analysis included image denoising, image enhancement, and image segmentation. With the continuous development of deep learning, Convolutional Neural Networks (CNNs) have gradually been applied to crack monitoring and classification tasks. This section reviews relevant research on crack analysis based on machine vision.
Image enhancement is an important research direction in image processing. By enhancing the crack region in an image, subsequent crack detection and analysis can be performed more accurately. As early as 2014, Dou et al. [9] proposed a crack enhancement method based on feature structure coherence histogram equalization. This method enhances the visibility of cracks by adjusting the image’s grayscale distribution, thereby improving overall contrast. Subsequently, Ma et al. [10] improved the traditional histogram equalization method and proposed a different color-space fusion algorithm based on Contrast Limited Adaptive Histogram Equalization (CLAHE). This algorithm enhances local contrast while preserving global contrast, thereby reducing noise introduced by excessive enhancement. As research continued to advance, Yang et al. [11] proposed an enhancement method for asphalt pavement crack images using guided filtering and Retinex. They employed a two-dimensional discrete wavelet transform for image denoising and compression, and processed the wavelet’s low-frequency coefficients using a combination of guided filtering and Multi-Scale Retinex with Color Restoration (MSRCR). These methods mitigate noise to some extent, but due to their lack of adaptability, they struggle to automatically adjust processing parameters across different types of structural crack images.
In image segmentation, early methods included threshold-based segmentation [12,13,14,15,16], edge-detection-based segmentation [12,17,18,19,20], and region-growing-based segmentation [21]. Although these methods are simple and effective, their performance is limited when applied to crack images with significant noise. To address the noise problem, Chen et al. [22] proposed a method for detecting potential crack regions based on multi-threshold image processing. This algorithm combines global and local thresholding to segment the image, removing noise and small regions to extract potential crack areas. Xue et al. [23] proposed a dynamic segmentation Gaussian crack detection algorithm for tunnel lining crack segmentation based on the distribution of projection curves. This algorithm introduces a novel Dynamic Segmentation Gaussian (DSG) model, which incorporates threshold factors corresponding to different crack widths and multi-scale Gaussian factors into the Gaussian model to achieve crack segmentation. It effectively mitigates the impact of uneven illumination on crack detection. These algorithms perform well in their specific application scenarios but may struggle with diverse, complex images.
Parameter extraction is an essential step in crack monitoring systems. Li et al. [24] proposed a method for extracting crack image features based on mathematical morphology and connected-domain thresholding. Through operations such as morphological filtering, skeletonization, and region labeling, the geometric features of cracks are extracted. Tang et al. [25] proposed a novel crack-skeleton refinement algorithm and a width-measurement scheme based on backbone dual-scale features. This algorithm simplifies the redundant data in crack images and improves the efficiency of crack shape estimation. Peng et al. [26] proposed a method for identifying bridge cracks and quantifying their widths using drones and a hybrid feature learning approach. Using corresponding distance data, the actual width of the bridge cracks was measured and quantified using a distance-measurement method. This approach effectively calculates the width of bridge cracks.
Traditional crack-detection methods exhibit both advantages and limitations. Grounded in explicit mathematical models and classical image processing theories, they offer clear algorithmic structures, easy implementation, and low computational costs, providing strong interpretability and engineering practicality. Using techniques like contrast enhancement, thresholding, edge detection, and morphological analysis, these methods achieve effective crack detection and feature extraction in environments with simple backgrounds and uniform lighting. Their strengths primarily lie in low cost, rapid deployment, and applicability to small-sample scenarios. Nevertheless, their reliance on handcrafted features and empirical parameters makes them highly susceptible to illumination changes, noise, and complex backgrounds, leading to frequent false positives and false negatives. Additionally, the step-by-step processing pipeline can lead to error accumulation, limiting its generalization across diverse, large-scale datasets. Ultimately, while traditional methods hold significant value in early engineering deployments, they fall short in complex environmental adaptability and intelligent automation.
With the continuous advancement of deep learning, Convolutional Neural Networks (CNNs) and their derivatives have driven significant progress in crack image segmentation. Recently, various models have shown robust segmentation performance on public datasets, namely Crack500 [27], DeepCrack [28], and CFD [29], as compared in Table 1. For example, GAF-Net [30] enhances multi-scale feature extraction to achieve a high recall rate for complex crack morphologies, reaching an F1-score of 0.931 on the DeepCrack [28] dataset, which indicates strong crack-capturing capabilities. However, its complex structure and high computational overhead limit its effectiveness in low-contrast environments. Meanwhile, CrackDiff [31] utilizes diffusion modeling for fine-grained crack reconstruction, achieving high IoU scores across multiple datasets, though at the cost of slow inference and high training complexity. PAFNet [32] employs a lightweight architecture that ensures stability in cross-dataset evaluations, yet it lacks robustness against illumination changes and struggles with boundary refinement. Conversely, CrackW-Net [33] offers a simpler structure with fewer parameters, but yields lower overall segmentation accuracy and struggles with complex backgrounds.
Despite these accuracy improvements, the performance of deep-learning methods remains highly dependent on massive, high-quality annotated datasets. Data for specific crack types, such as tunnel or unique structural cracks, is often scarce. Additionally, real-world engineering images are frequently plagued by uneven lighting, complex backgrounds, and significant distribution shifts. Ultimately, these challenges degrade model generalization and training efficacy, hindering the widespread deployment of deep-learning models in practical engineering applications.
To address the issue of crack analysis in different structures under complex backgrounds, this paper proposes a novel structural crack-segmentation algorithm. This algorithm can adjust enhancement parameters based on image brightness and reduce the impact of lighting variations on processing outcomes through image fusion, thereby enabling effective crack analysis and monitoring. Through this algorithm, post-earthquake structural damage can be accurately identified and assessed, aiding in the rapid formulation of repair and reinforcement plans. This effectively prevents secondary disasters, thereby ensuring the safety and stability of engineering structures after an earthquake.
3. Methodology
The crack development monitoring system presented in this paper is primarily implemented through two steps: crack segmentation and development monitoring. In the crack image segmentation phase, this paper proposes a crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement (CSA-BB) to achieve precise segmentation of the crack region. In the crack development monitoring phase, this paper measures various crack parameters. It determines the safety level of cracks using predefined evaluation criteria, enabling effective monitoring and assessment of crack development.
3.1. Crack Segmentation
3.1.1. Overview of the CSA-BB Algorithm
The flowchart of the crack-segmentation algorithm proposed in this paper is shown in Figure 1 and consists of three steps: bimodal image fusion, brightness piecewise linear enhancement, and crack region segmentation. Firstly, this paper employs the Jet colormap method to generate a pseudo-color image from the original image, and uses a fusion function constructed in this study to merge the two images, resulting in a fused image. Then, this paper calculates the global brightness of the original image based on the brightness values of its red, green, and blue channels, and a brightness piecewise linear function is constructed using this global brightness to perform image enhancement. Finally, this paper enhances the crack region using the bottom-hat [34] transform and achieves accurate segmentation of crack areas through OTSU binarization and morphological closing.
3.1.2. Bimodal Image Fusion
Multimodal image processing refers to the process of analyzing, fusing, or processing image data obtained from different sensors or modalities. In this process, image data from different modalities are regarded as complementary information sources, which are comprehensively utilized to enhance the performance of image analysis and understanding. This paper draws on the idea of multimodal image fusion, utilizing the fusion of pseudo-color images with the original images. Pseudo-color images typically have high color contrast and vivid colors. This diversity makes the images visually richer, allowing more detailed features to be presented. Therefore, this fusion method preserves the original information carried by the original image while incorporating the color information from the pseudo-color image. This highlights the crack region in the image, resulting in better crack segmentation performance.
The process of the bimodal image fusion strategy proposed in this paper is shown in Figure 2 and consists of three main steps as follows:
Generating a pseudo-color image: This paper converts the original image to grayscale and produces a pseudo-color image using the Jet colormap.
Calculating the Bhattacharyya Coefficient (BC): This paper obtains the grayscale histograms of the pseudo-color image and the original image, and then computes the BC based on these histograms.
Constructing the bimodal fusion image: This paper constructs a fusion function based on the BC and utilizes both the original image and the pseudo-color image to generate the fused image.
Figure 2. Flowchart of the bimodal image fusion method.
(1) Generating the pseudo-color image
This paper obtains the pseudo-color image of the crack grayscale image using the Jet colormap technique. This mapping method is based on variations in hue, mapping grayscale levels to different colors. Consequently, different grayscale values show obvious color differences in the pseudo-color image. The calculation formulas, derived from the mathematical implementation of MATLAB’s built-in jet colormap function [35], are shown in Equations (1)–(3):
where $x$ denotes the input grayscale value, and $r(x)$, $g(x)$, and $b(x)$ represent the color values of the red, green, and blue channels, respectively. These three functions define how each grayscale level is mapped to a value in the corresponding channel. Once each channel is assigned specific values, the Jet colormap is achieved. The effects are shown in Figure 3b.
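As an illustration of how a grayscale level is mapped to three channel values, the following minimal sketch uses the widely used "tent" approximation of the Jet colormap. This approximation is an assumption made for illustration; its breakpoints are not necessarily those of Equations (1)–(3), and the helper names `jet_rgb` and `pseudo_color` are hypothetical.

```python
def _clamp01(v):
    """Clip a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, v))


def jet_rgb(x):
    """Map a normalized gray level x in [0, 1] to an approximate Jet (r, g, b).

    Assumption: tent-shaped approximation of Jet, in which each channel is a
    clipped piecewise linear function of x.
    """
    r = _clamp01(1.5 - abs(4.0 * x - 3.0))
    g = _clamp01(1.5 - abs(4.0 * x - 2.0))
    b = _clamp01(1.5 - abs(4.0 * x - 1.0))
    return r, g, b


def pseudo_color(gray):
    """Apply jet_rgb to an 8-bit grayscale image stored as a list of rows."""
    return [[jet_rgb(p / 255.0) for p in row] for row in gray]
```

Under this approximation, dark gray levels map toward blue, mid levels toward green, and bright levels toward red, which is the behavior that makes neighboring gray levels easier to tell apart in the pseudo-color image.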
(2) Calculating the BC
Crack images typically contain a certain amount of noise interference, which can affect the accuracy of image similarity [11] measurements. The BC, by statistically analyzing the overall distribution of image color and grayscale values, can effectively reduce the impact of noise on image similarity. Therefore, this paper uses the BC of the image histograms as a measure of the similarity between two images. The BC ranges from 0 to 1, where 0 indicates that the two images are completely dissimilar (their histograms do not overlap), and 1 indicates that their histograms are identical. The calculation formula for the BC is shown in Equation (4):
$$BC = \sum_{i} \sqrt{p_i \, q_i} \tag{4}$$
where $i$ indexes the grayscale levels of the image, and $p_i$ and $q_i$ denote the probability of the $i$-th grayscale level occurring in the histograms of the two images, respectively. BC refers to the Bhattacharyya coefficient.
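The BC computation can be sketched as follows, assuming the standard definition in which the BC is the sum over gray levels of the square root of the product of the two histogram probabilities. The function names are illustrative, and plain Python lists are used only to keep the sketch dependency-free.

```python
import math


def gray_histogram(pixels, levels=256):
    """Return the normalized histogram (probabilities) of integer gray levels."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    return [c / total for c in counts]


def bhattacharyya_coefficient(p, q):
    """BC = sum_i sqrt(p_i * q_i) for two normalized histograms of equal length."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
```

Because the coefficient depends only on the histograms, two images with the same gray-level distribution but different spatial layouts also yield a BC of 1.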
(3) Constructing the bimodal fusion image
To achieve the fusion of the pseudo-color image with the original image, this paper constructs a fusion function. The goal of this function is to highlight the detailed features of the crack while preserving the original information of the image using appropriate fusion weights. Determining the fusion weights through the BC helps balance the information from the pseudo-color image and the original image, ensuring that the fusion result retains both the enhanced features from the pseudo-color image and the true details from the original image. The bimodal fusion function constructed in this paper is shown in Equation (5):
where $A(x, y)$ and $B(x, y)$ represent the pixel values of the pseudo-color image and the original image at the coordinates $(x, y)$, respectively, and $\alpha$ and $\beta$ are their respective fusion weights. $BC_1$ is the Bhattacharyya coefficient between the pseudo-color image and the original image, and $BC_2$ is the Bhattacharyya coefficient between the two original images. $F(x, y)$ denotes the pixel value of the fused image at the coordinates $(x, y)$. If $BC_1 < BC_2$, it indicates that the original image contains more original information. By using $BC_1$ as the numerator of the fusion weight for the pseudo-color image to assign it a lower weight, and $BC_2$ as the numerator of the fusion weight for the original image to assign it a higher weight, this information can be preserved in the fused image. The results are shown in Figure 4. It can be observed that the bimodal image fusion strategy enhances the visual effect of the image and highlights more details and features of the crack.
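The weighting idea can be illustrated with a short sketch. Two assumptions are made here that the text does not state explicitly: the fused pixel is a weighted sum of the two inputs, and both weights share the denominator BC1 + BC2 so that they sum to 1. The function names are hypothetical, and the images are single-channel lists of rows for simplicity.

```python
def fusion_weights(bc1, bc2):
    """Return (alpha, beta) with BC1 and BC2 as the numerators.

    Assumption: both weights are divided by BC1 + BC2, so alpha + beta == 1.
    """
    total = bc1 + bc2
    if total <= 0:
        raise ValueError("BC1 + BC2 must be positive")
    return bc1 / total, bc2 / total


def fuse(pseudo, original, bc1, bc2):
    """Assumed fusion rule: F(x, y) = alpha * A(x, y) + beta * B(x, y)."""
    alpha, beta = fusion_weights(bc1, bc2)
    return [
        [alpha * a + beta * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pseudo, original)
    ]
```

With this normalization, a smaller BC1 relative to BC2 lowers the contribution of the pseudo-color image and raises that of the original image, which matches the behavior described above.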
3.1.3. Brightness Piecewise Linear Enhancement
Crack images are acquired under differing conditions, such as illumination intensity, background color, and image resolution, so the grayscale values of different fused images vary significantly in magnitude. As a result, fixed-parameter image enhancement techniques highlight the crack region poorly. To address this issue, this paper proposes a brightness-based piecewise linear function that adaptively selects adjustment parameters based on the image’s brightness, thereby enhancing the image to highlight the crack region. The function is shown in Equation (6):
where $f(x, y)$ represents the original grayscale value of the pixel at coordinates $(x, y)$ in the image, $g(x, y)$ represents the transformed grayscale value of $f(x, y)$, and $T$ represents the global brightness of the image, which is the average of the brightness values of all pixels and reflects the overall brightness of the image. The global brightness $T$ can be obtained according to Equation (7):
where $W$ and $H$ are the width and height of the original image, $n$ is the total number of pixels in the image, and $R(x, y)$, $G(x, y)$, and $B(x, y)$ represent the brightness values of the red, green, and blue channels at the point $(x, y)$ in the image, respectively. The function adaptively selects adjustment parameters based on the brightness $T$. In this paper, a threshold $Br$ is set. If $T < Br$, the parameters are $[\alpha; \beta; \gamma; \delta] = [1.2; -112; -0.6; 331]$; if $T > Br$, the parameters are $[\alpha; \beta; \gamma; \delta] = [0.2; 90; -1.2; 394]$. The polyline graph of the brightness piecewise linear function is shown in Figure 5.
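The parameter selection step can be sketched as below. The two parameter sets are those given above; everything else is an assumption for illustration: the per-pixel brightness is taken as the mean of the three channels, the tie T = Br (which the text leaves open) is assigned to the bright set, and the threshold value passed as `br` in any call is hypothetical. The sketch stops at parameter selection and does not attempt the piecewise mapping of Equation (6) itself.

```python
DARK_PARAMS = (1.2, -112.0, -0.6, 331.0)   # [alpha; beta; gamma; delta] for T < Br
BRIGHT_PARAMS = (0.2, 90.0, -1.2, 394.0)   # [alpha; beta; gamma; delta] for T > Br


def global_brightness(rgb):
    """Mean brightness T over all pixels of an image given as rows of (R, G, B).

    Assumption: the brightness of one pixel is the mean of its three channels.
    """
    n = sum(len(row) for row in rgb)
    total = sum((r + g + b) / 3.0 for row in rgb for (r, g, b) in row)
    return total / n


def select_parameters(rgb, br):
    """Pick the parameter set from the global brightness T and the threshold Br.

    Assumption: the tie T == Br falls into the bright branch.
    """
    t = global_brightness(rgb)
    return DARK_PARAMS if t < br else BRIGHT_PARAMS
```

Keeping the brightness estimate and the parameter choice in separate functions makes it easy to inspect T for a given image before committing to one of the two enhancement curves.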
The fused images are converted to grayscale, resulting in the grayscale images shown in Figure 6b. Then, the grayscale images are enhanced using the brightness-enhancing piecewise linear function proposed in this study, as shown in Figure 6c. According to the comparison, despite varying illumination intensities and noise interference in the original crack images, the brightness piecewise linear function enhances the crack details and highlights their contrast.
3.1.4. Crack Region Segmentation
To separate the crack region from the background in the images and further obtain statistical information and features about the crack, this paper performs a crack segmentation operation. This section mainly includes three steps: bottom-hat transformation, OTSU algorithm, and morphological closing operation. The bottom-hat transformation effectively extracts small features in the image that are darker than the background in the target area. The OTSU algorithm can automatically calculate an optimal threshold to effectively separate the crack from the background in an image, thus avoiding the subjectivity and uncertainty associated with manual threshold selection. Finally, this paper employs a morphological closing operation to connect intermittent cracks and smooth their boundaries.
The principle of the bottom-hat transformation involves subtracting the original image from the result of the closing operation, thereby obtaining the valley portions filled by the closing operation. In the crack image, these valley portions typically correspond to the darker crack region, known as the “black bottom-hat.” The definition of the bottom-hat transformation is shown in Equation (8):
$$I'' = (I * B) - I \tag{8}$$
where $I$ represents the original image, and $B$ is the structuring element, which is a small, predefined binary matrix. Specifically, in this study, a disk-shaped structuring element with a radius of 5 pixels is employed as $B$ to effectively enhance the dark crack features. $I * B$ denotes the closing operation, and $I''$ is the image after the bottom-hat transformation. The results of the bottom-hat transformation are shown in Figure 7b. As depicted in the figures, the bottom-hat transformation effectively suppresses the background regions. It enhances the details of darker parts of images, making features such as crack morphology, orientation, and distribution patterns clearer and more visible.
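A minimal, dependency-free sketch of the bottom-hat transform is given below, assuming a flat disk-shaped structuring element and a border rule in which neighbors outside the image are simply ignored. The helper names are hypothetical, and a production implementation would normally rely on an image processing library rather than nested lists.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of a flat disk-shaped structuring element."""
    return [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dy * dy + dx * dx <= radius * radius
    ]


def _morph(img, offsets, pick):
    """Apply pick (max for dilation, min for erosion) over each neighborhood.

    Neighbors that fall outside the image are ignored.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                img[y + dy][x + dx]
                for dy, dx in offsets
                if 0 <= y + dy < h and 0 <= x + dx < w
            ]
            row.append(pick(vals))
        out.append(row)
    return out


def closing(img, offsets):
    """Grayscale closing: dilation followed by erosion."""
    return _morph(_morph(img, offsets, max), offsets, min)


def bottom_hat(img, radius=5):
    """Bottom-hat transform: closing(img) - img, large where img has dark valleys."""
    offsets = disk_offsets(radius)
    closed = closing(img, offsets)
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(closed, img)]
```

For a thin dark line on a bright background, the closing fills the line, so the difference is large on the line and zero on the flat background, which is why the transform isolates narrow dark cracks.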
The process of binarizing an image using the OTSU algorithm primarily involves finding a threshold $T$ that maximizes the inter-class variance. By maximizing the inter-class variance, this study achieves optimal segmentation of the crack. The formula for inter-class variance is shown as Equation (9):
$$\sigma^2(T) = \frac{N_a}{N} \cdot \frac{N_b}{N} \, (\mu_a - \mu_b)^2 \tag{9}$$
where $N_a$ is the number of foreground pixels, $N_b$ is the number of background pixels, $\mu_a$ is the average grayscale value of the foreground, $\mu_b$ is the average grayscale value of the background, $N$ represents the total number of pixels in the image, and $\sigma^2(T)$ represents the inter-class variance. By iterating through all possible thresholds $T$ according to the calculation formula for inter-class variance, the threshold that maximizes the inter-class variance is identified as the optimal threshold. The results of image binarization are shown in Figure 8b. It can be seen that the OTSU algorithm effectively separates the cracks and background regions while suppressing noise interference.
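The exhaustive threshold search can be sketched as follows, assuming the standard two-class form of the inter-class variance, (Na/N)(Nb/N)(μa − μb)², and the convention that pixels with a value less than or equal to the threshold form one class. The function name and the tie-breaking rule (the first maximizing threshold is kept) are choices made for this illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximizes the inter-class variance.

    Pixels with value <= T form class a, the remaining pixels form class b.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    total_sum = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_a, sum_a = 0, 0
    for t in range(levels):
        n_a += hist[t]
        sum_a += t * hist[t]
        n_b = n - n_a
        if n_a == 0 or n_b == 0:
            continue  # one class is empty, so the variance is undefined
        mu_a = sum_a / n_a
        mu_b = (total_sum - sum_a) / n_b
        var = (n_a / n) * (n_b / n) * (mu_a - mu_b) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

Updating the class counts and sums incrementally keeps the search linear in the number of gray levels after a single pass over the pixels.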
In morphological processing, the closing operation can be described as performing a dilation followed by an erosion on the input image. This process fills small holes within objects in the image. In this paper, the closing operation is used to connect the broken parts within the crack region and smooth the object boundaries. The expression for the closing operation is shown in Equation (10):
where $I\varphi(x)$ represents the image after the closing operation, $I(x)$ represents the original binarized image, $\delta_B$ denotes the erosion operation, and $\varepsilon_B$ denotes the dilation operation. The results of the closing operation are shown in Figure 9b. It can be observed that the closing operation fills the gaps within the cracks, making them more continuous and complete, thereby achieving effective segmentation of the crack regions.
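How closing bridges a broken crack can be shown with a small binary example. The structuring element used for this step is not specified above, so the sketch assumes a square window of side 2r + 1 clipped at the image border; the function names are hypothetical.

```python
def _window(mask, y, x, radius):
    """Values of mask inside the (2r+1) x (2r+1) window, clipped to the image."""
    h, w = len(mask), len(mask[0])
    return [
        mask[j][i]
        for j in range(max(0, y - radius), min(h, y + radius + 1))
        for i in range(max(0, x - radius), min(w, x + radius + 1))
    ]


def dilate(mask, radius=1):
    """Binary dilation: a pixel becomes 1 if any pixel in its window is 1."""
    return [[int(any(_window(mask, y, x, radius))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask, radius=1):
    """Binary erosion: a pixel stays 1 only if every pixel in its window is 1."""
    return [[int(all(_window(mask, y, x, radius))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def close_mask(mask, radius=1):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask, radius), radius)
```

Applied to a one-pixel-high horizontal crack with a single missing pixel, the dilation first thickens the crack and covers the gap, and the erosion then restores the original thickness while leaving the gap filled.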
3.2. Monitoring Crack Development
By analyzing crack parameters, the structural integrity risk is predicted, enabling systematic monitoring of cracks. Monitoring data can provide a scientific basis for engineering management and maintenance decisions, helping to formulate rational maintenance plans to ensure the stability and safety of structures. In this section, crack parameters are obtained through two steps: crack region labeling and crack parameter extraction.
(1) Crack region labeling
In this study, the eight-connected region labeling method is employed to extract multiple crack regions from a segmented image. This method visualizes connected regions, allowing for clearer observation of the distribution, morphology, and topological structure of cracks. It helps identify potential structural features and regular patterns. In this paper, a threshold $T$ (set to 15) is used to eliminate small interference regions with pixel counts below $T$, as indicated by the blue regions in Figure 10b. The connected region labeling method can assign a unique label to each crack, enabling the counting and localization of cracks, and facilitating further quantitative analysis.
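The labeling and small-region filtering described above can be sketched with a breadth-first flood fill over the eight neighbors of each pixel. This is one common way to implement connected-component labeling and is not necessarily the implementation used in this study; the function name and return format are hypothetical.

```python
from collections import deque


def label_regions(mask, min_pixels=15):
    """Label 8-connected regions of a binary mask, dropping regions below min_pixels.

    Returns (labels, sizes): labels holds region ids (0 = background or removed),
    sizes maps each kept id to its pixel count.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    sizes = {}
    next_id = 1
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over the 8 neighbors.
            seen[y0][x0] = True
            queue, region = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(region) >= min_pixels:
                for y, x in region:
                    labels[y][x] = next_id
                sizes[next_id] = len(region)
                next_id += 1
    return labels, sizes
```

The number of entries in `sizes` gives the crack count, and the pixel counts it stores can be reused directly when computing area-based parameters.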
(2) Crack parameter extraction
In the crack evaluation system, different crack parameters represent various properties of cracks. Many previous studies have evaluated crack development by introducing geometric parameters such as crack area, length, and width. However, due to differences in the imaging field of view during image acquisition, the scale of cracks may vary among images. In addition, variations in shooting distance and angle may further cause differences in the size and scale of each acquired crack image, making it difficult to objectively evaluate the influence of crack area on the entire structure. To reduce the influence of these factors, this study adopts ratio-based indicators to describe crack characteristics. Specifically, the crack area reflects the scale of the crack, the crack length indicates the extent of crack propagation, and the crack width represents the degree of crack opening. Therefore, this study assesses the damage degree of the structure and determines its safety level by calculating three parameters: crack area ratio, crack length ratio, and crack width ratio.
(1) Crack area ratio
The crack area is a paramount indicator of structural deterioration, as it reflects a reduction in the effective cross-sectional area and bearing capacity. Generally, a larger crack area signifies more severe internal damage. Therefore, this study assesses the degree of structural damage by calculating the crack area ratio. The crack area ratio can be computed using Equation (11):
where $n$ represents the total number of independent wall crack regions detected in the image; this value is automatically determined by counting the isolated connected components in the binary segmentation map rather than being manually selected. $M$ represents the number of pixels in each crack region, and $W$ and $H$ denote the width and height of the original image, respectively.
(2) Crack length ratio
Crack length is a critical metric in crack detection, often associated with stress concentration zones. Long cracks typically occur at stress concentration points, such as structural joints or areas of concentrated load. In this study, morphological operations are used to iteratively refine the binary image, gradually removing edge pixels of cracks until obtaining a one-pixel-wide crack skeleton, as shown in
Figure 11. As expressed in Equation (12),
provides a scale-invariant metric by calculating the ratio of the total skeleton pixel length to the maximum diagonal length of the image.
where
n represents the number of skeletons, and
qi represents the number of pixels in the
i-th skeleton.
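The two ratios defined so far can be sketched together. The sketch assumes that the area ratio is the total crack pixel count divided by the image area W × H, and that the length ratio is the total skeleton pixel count divided by the image diagonal; the function names and the example values are hypothetical.

```python
import math


def crack_area_ratio(region_sizes, width, height):
    """Sum of crack-region pixel counts divided by the image area W * H."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return sum(region_sizes) / (width * height)


def crack_length_ratio(skeleton_sizes, width, height):
    """Total skeleton pixel count divided by the image diagonal sqrt(W^2 + H^2)."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return sum(skeleton_sizes) / math.hypot(width, height)
```

For example, two crack regions of 120 and 80 pixels in a 100 × 100 image give an area ratio of 0.02. Because both quantities are normalized by the image size, they can be compared across images of different resolutions, provided the field of view is comparable.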
(3) Crack width ratio
By measuring and recording crack widths, the structural safety is assessed, and appropriate repair and reinforcement measures are implemented. In this study, the average crack width is calculated using the obtained crack area and length. The formula for the crack width ratio is shown as Equation (13):
4. Experiments
To verify the effectiveness of the proposed CSA-BB algorithm, this section applies it to a total of 50 sample images selected as experimental data from four widely used datasets: CFD [29], DeepCrack [28], Cracktree200 [36], and CRACK500 [27]. These datasets collectively cover a wide range of diverse crack morphologies, varying illumination conditions, and complex backgrounds. Subsequently, this section sets up comparative experiments to compare the segmentation performance of this algorithm with four other segmentation algorithms. Additionally, this section includes modular ablation experiments to verify the effectiveness and efficiency of the bimodal image fusion and brightness piecewise linear enhancement modules.
All experiments are conducted using MATLAB R2024a on a Windows 10 platform, powered by an Intel Core i7-11800H processor and an NVIDIA GeForce RTX 3060 GPU with 6 GB of VRAM, supported by 16 GB of system memory.
4.1. Comparison of Crack Segmentation Results
In this section, four different types of structural crack image samples are selected to evaluate the segmentation performance of the proposed CSA-BB algorithm, which is compared with that of four other crack-segmentation algorithms (OTSU, Canny, LoG, and Prewitt), as shown in Figure 12.
As shown in Figure 12, the performance of different conventional methods varies significantly in the task of crack segmentation under complex background conditions. The OTSU thresholding method achieves relatively satisfactory results when the background is homogeneous and noise interference is limited, as illustrated in Sample 3 (c) and Sample 4 (c), where the crack regions can be largely separated. However, when severe grayscale overlap exists between cracks and the background, accompanied by strong local noise, its segmentation capability deteriorates considerably. In Sample 1 (c) and Sample 2 (c), problems such as crack discontinuity, adhesion to the background, and false segmentation are observed, making reliable crack extraction difficult.
Although the Canny edge detection algorithm exhibits a certain degree of noise suppression, it tends to produce unstable edge responses. In particular, blurred and fragmented crack boundaries can be observed in Sample 2 (d) and Sample 3 (d), preventing the formation of complete and precise crack contours and thus limiting subsequent geometric and morphological parameter analysis. The segmentation results obtained using the Laplacian of Gaussian (LoG) operator for all four samples are characterized by dense noise responses and poor discrimination between cracks and background, leading to unsatisfactory segmentation quality. Similarly, the Prewitt operator shows deficiencies in detail preservation; in Sample 4 (f), fine crack branches and local structural features are noticeably lost, compromising the integrity of the crack representation.
In contrast, the proposed method demonstrates consistently stable and superior segmentation performance across all four samples. Regardless of whether the background is uniform or characterized by severe grayscale overlap and strong noise interference, the proposed approach effectively suppresses background disturbances while preserving clear and continuous crack boundaries. As shown in Sample 1 (g) to Sample 4 (g), both the main crack structures and fine details are well maintained, with significantly reduced false detections. These results indicate that the proposed method outperforms OTSU, Canny, LoG, and Prewitt in terms of noise robustness and adaptability to complex scenes, enabling more reliable crack segmentation and providing a solid foundation for subsequent crack parameter extraction and evolution monitoring.
As shown in Table 2, the proposed method achieves the best overall performance, with an average Dice coefficient of 0.4511 and an average Jaccard index of 0.2981, which are clearly higher than those of Prewitt (0.2119/0.1258), OTSU (0.0894/0.0470), Canny (0.3556/0.2217), and LoG (0.2921/0.1745). These results indicate that the proposed method has a more pronounced advantage in terms of crack-region extraction accuracy and overlap with the manual annotations.
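For reference, the two overlap metrics can be computed from a predicted mask and a manual annotation as in the following minimal sketch. The function name and the toy masks are illustrative, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
def dice_and_jaccard(pred, truth):
    """Dice coefficient and Jaccard index of two equally sized binary masks.

    pred, truth: 2-D lists of 0/1 values (1 = crack pixel).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # crack pixel found in both masks
            elif p:
                fp += 1      # predicted crack, annotated background
            elif t:
                fn += 1      # annotated crack, missed by the prediction
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0      # ASSUMPTION: two empty masks count as a perfect match
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / union
    return dice, jaccard


# Toy masks: 2 shared crack pixels, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
dice, jaccard = dice_and_jaccard(pred, truth)
print(round(dice, 4), jaccard)  # 0.6667 0.5
```

For a single image the two metrics are linked by Dice = 2J / (1 + J), so they always rank methods identically on that image; averages taken over several images need not satisfy this identity exactly.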
A further analysis of individual samples shows that the proposed method performs particularly well on Sample 1 and Sample 2. For instance, on Sample 1, the proposed method achieves a Dice score of 0.4834, which is substantially higher than those of LoG (0.3281) and Canny (0.1675). On Sample 2, its Dice score reaches 0.5964, exceeding that of the second-best method, Canny (0.3993), by nearly 0.20. This demonstrates that the proposed method is capable of preserving the main crack structure and effectively suppressing background interference, even in the presence of relatively complex backgrounds and thin, elongated cracks. For Sample 3, however, the Dice score of the proposed method is 0.2794, which is lower than that of LoG (0.4030) and Canny (0.3836), suggesting that its response to locally weak or fine crack details is somewhat limited in this case. For the last sample, Canny achieves the best result, with a Dice score of 0.4717, slightly higher than that of the proposed method (0.4450), indicating that traditional edge-detection methods may still retain certain advantages when crack boundaries are clear and background interference is relatively low.
In summary, the proposed method yields the best average performance and the highest Dice score on Samples 1 and 2. Although it is outperformed by LoG and Canny on Sample 3 and by Canny on Sample 4, its overall segmentation performance indicates that it is better suited to crack segmentation tasks under complex background conditions.
As illustrated in Figure 13, the proposed CSA-BB algorithm demonstrates highly robust segmentation performance across a variety of challenging conditions. Row (a) presents seven distinct original structural crack images, deliberately selected to showcase significant individual variations in color, texture, and background brightness, including uneven illumination and complex surface materials. Row (b) displays the corresponding binary segmentation masks generated by the proposed algorithm. It is visually evident that despite the multi-feature disparities and complex backgrounds, the CSA-BB algorithm successfully overcomes these interferences to precisely segment the crack targets. The clear and continuous extraction of the crack networks confirms that the bimodal image fusion and brightness piecewise linear enhancement modules effectively adapt to varying environmental conditions, fulfilling the demand for robust identification in real-world engineering scenarios.
As illustrated in Figure 14, the proposed CSA-BB algorithm demonstrates stable crack segmentation performance under noise interference. Row (a) displays the original crack samples, while Row (b) shows the same images corrupted by salt-and-pepper noise. This noise introduces grayscale overlap between the cracks and the background, a condition that typically causes false segmentation in conventional methods. However, as shown in Sample 1 (c) through Sample 4 (c), the proposed approach effectively suppresses background noise while preserving continuous crack boundaries. The algorithm extracts both the main crack structures and local branches without producing fragmented contours or isolating noise artifacts. These results indicate that the bimodal image fusion and enhancement modules preserve target integrity, providing a reliable basis for subsequent geometric parameter extraction.
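A noise test of this kind can be reproduced with a simple corruption routine such as the sketch below. The noise density and the seed are hypothetical parameters, since the noise level used for Figure 14 is not stated here; the image is assumed to be an 8-bit grayscale array stored as a 2-D list.

```python
import random


def add_salt_and_pepper(image, density, seed=None):
    """Return a copy of an 8-bit grayscale image with salt-and-pepper noise.

    density: fraction of pixels to corrupt on average (assumed parameter);
             half become pepper (0) and half become salt (255).
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must lie in [0, 1]")
    rng = random.Random(seed)
    noisy = [row[:] for row in image]    # copy so the input stays untouched
    for r, row in enumerate(noisy):
        for c in range(len(row)):
            u = rng.random()
            if u < density / 2:
                noisy[r][c] = 0          # pepper
            elif u < density:
                noisy[r][c] = 255        # salt
    return noisy


# Hypothetical example: corrupt about 5% of a flat gray 100 x 100 image.
clean = [[128] * 100 for _ in range(100)]
noisy = add_salt_and_pepper(clean, density=0.05, seed=0)
changed = sum(p != 128 for row in noisy for p in row)
print(changed / 10000)  # close to 0.05
```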
4.2. Crack Monitoring
To assess the safety and stability of structures based on crack monitoring data, and to develop scientifically reasonable maintenance plans that delay structural aging, this paper establishes three levels of crack hazard according to crack parameters [37, 38]. The parameter thresholds can be customized according to specific structural types and engineering inspection requirements.
(2) Crack to be Monitored
Because shooting angles and scales vary among crack images, crack length and width cannot be expressed in a common physical unit. Therefore, this paper uses pixel count (pc) as the unit for evaluating crack length and width. Crack parameters and risk levels are shown in Table 3. For Sample 1, the crack area ratio reaches 1.7% and the normalized crack length ratio is 102%. Since both parameters exceed the established safe thresholds, Sample 1 is classified as a “Hazardous Crack”. Samples 2 and 4 partially exceed the safe criteria and are therefore classified as “Cracks to be Monitored”. Sample 3 satisfies all three criteria for a “Safe Crack” and is classified accordingly. Viewed together with Figure 12, the cracks in Sample 1 bifurcate in the middle section into two branches extending left and right, with a significant width, making the structure less stable. The cracks in Sample 4 also bifurcate, but with narrower widths and a smaller area. Samples 2 and 3 exhibit shorter cracks without bifurcation and thus indicate higher stability than Sample 1. The physical characteristics of the cracks in Figure 12 are therefore largely consistent with the parameter statistics in Table 3.
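A three-level rule of this kind could be sketched as follows. Everything specific in the snippet is an assumption for illustration: the threshold values are placeholders rather than the thresholds of this study, the width ratio in the example is invented, and the rule that two or more exceeded parameters mean a hazardous crack is one plausible reading of "partially exceed", not a stated definition.

```python
def classify_crack(area_ratio, length_ratio, width_ratio, thresholds, hazard_count=2):
    """Assign one of three hazard levels from three crack parameters.

    thresholds: dict with keys "area", "length", "width" (safe upper limits).
    hazard_count: number of exceeded parameters that triggers the top level
                  (ASSUMPTION, chosen for illustration).
    """
    measured = {"area": area_ratio, "length": length_ratio, "width": width_ratio}
    exceeded = [name for name, value in measured.items() if value > thresholds[name]]
    if not exceeded:
        level = "Safe Crack"
    elif len(exceeded) >= hazard_count:
        level = "Hazardous Crack"
    else:
        level = "Crack to be Monitored"
    return level, exceeded


# Placeholder thresholds, NOT the values used in this study.
PLACEHOLDER_THRESHOLDS = {"area": 0.01, "length": 1.0, "width": 0.01}

# Area and length ratios as reported for Sample 1; the width ratio is hypothetical.
level, exceeded = classify_crack(0.017, 1.02, 0.005, PLACEHOLDER_THRESHOLDS)
print(level, exceeded)  # Hazardous Crack ['area', 'length']
```

Keeping the thresholds in a separate dictionary reflects the point that they can be customized per structure type and inspection requirement.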
4.3. Ablation Studies
To validate the effectiveness of the bimodal image fusion module and the brightness piecewise linear enhancement module for crack segmentation in the proposed CSA-BB algorithm, this section conducts ablation experiments on these two modules separately. Comparative analysis of algorithm efficiency is also performed.
4.3.1. Ablation Study of the Image Fusion Module
To validate the image fusion module in the proposed CSA-BB algorithm, this section compares the performance of the algorithm under different image fusion strategies. The experiment is designed as follows:
(1) Removal of the proposed bimodal image fusion module in this study: Remove-Fusion Module (R-FM).
(2) Three randomly chosen sets of weights (α and β) for fusing the pseudo-color image with the original image: Fusion Weight-1 (FW-1), where α = 0.1, β = 0.9; Fusion Weight-2 (FW-2), where α = 0.3, β = 0.7; Fusion Weight-3 (FW-3), where α = 0.5, β = 0.5.
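The fixed-weight variants can be pictured as a plain weighted sum of the two images, as in the sketch below. It assumes that α weights the pseudo-color image and β weights the original image, and it works on a single 8-bit channel stored as a 2-D list; neither detail is confirmed by the description above, and the pixel values are invented.

```python
def weighted_fusion(pseudo, original, alpha, beta):
    """Pixel-wise alpha * pseudo + beta * original, clipped to [0, 255].

    ASSUMPTION: alpha belongs to the pseudo-color image, beta to the original.
    """
    fused = []
    for pseudo_row, original_row in zip(pseudo, original):
        fused.append([
            min(255, max(0, int(round(alpha * p + beta * o))))
            for p, o in zip(pseudo_row, original_row)
        ])
    return fused


# The three ablation weight settings listed above.
FUSION_WEIGHTS = {"FW-1": (0.1, 0.9), "FW-2": (0.3, 0.7), "FW-3": (0.5, 0.5)}

# Hypothetical single-pixel images to show the effect of each setting.
pseudo, original = [[200]], [[100]]
for name, (alpha, beta) in FUSION_WEIGHTS.items():
    print(name, weighted_fusion(pseudo, original, alpha, beta))
# FW-1 [[110]]
# FW-2 [[130]]
# FW-3 [[150]]
```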
The comparison results of the ablation experiment on the image fusion module are shown in Figure 15. When the R-FM method is applied to Samples 1, 3, and 4, the results show partially missing crack regions and significant fragmentation. For Sample 2, the crack regions detected by R-FM deviate significantly from the actual cracks. When the three sets of random fusion weights are applied to Samples 1 and 4, numerous breaks appear in the cracks. For Samples 2 and 3, the cracks are severely distorted and accompanied by a significant amount of noise. By comparison, the CSA-BB algorithm proposed in this paper demonstrates significant improvements in crack detection accuracy and background noise suppression, effectively reconstructing the true shape of the cracks.
4.3.2. Ablation Study of the Brightness Piecewise Linear Enhancement Module
This section evaluates the effectiveness of the brightness piecewise linear enhancement module by comparing the performance of the proposed algorithm with different image enhancement methods. The experiment is conducted as follows:
(1) Removal of the brightness piecewise linear enhancement module proposed in this study: Remove-Enhancement Module (R-EM).
(2) Image enhancement using piecewise linear transformation: Piecewise Linear Transformation (PLT).
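For context, a conventional piecewise linear transformation maps gray levels through three line segments joined at two breakpoints. The sketch below shows this textbook form, which is the kind of baseline PLT refers to and not the brightness piecewise linear enhancement module proposed in this study. The breakpoints (r1, s1) and (r2, s2) are hypothetical values.

```python
def piecewise_linear(value, r1, s1, r2, s2, max_level=255):
    """Three-segment contrast stretch of one gray level.

    (r1, s1) and (r2, s2) are the breakpoints, with 0 < r1 < r2 < max_level.
    """
    if not 0 < r1 < r2 < max_level:
        raise ValueError("breakpoints must satisfy 0 < r1 < r2 < max_level")
    if value < r1:
        return s1 * value / r1                                  # dark segment
    if value <= r2:
        return s1 + (s2 - s1) * (value - r1) / (r2 - r1)        # middle segment
    return s2 + (max_level - s2) * (value - r2) / (max_level - r2)  # bright segment


def enhance(image, r1, s1, r2, s2):
    """Apply the transformation to every pixel of a 2-D grayscale list."""
    return [[int(round(piecewise_linear(p, r1, s1, r2, s2))) for p in row]
            for row in image]


# Hypothetical breakpoints that stretch the mid-gray range 80..160 to 40..220.
print(enhance([[0, 80, 120, 160, 255]], 80, 40, 160, 220))
# [[0, 40, 130, 220, 255]]
```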
The comparison results of the ablation experiment on the brightness piecewise linear enhancement module are shown in Figure 16. The R-EM method achieves good segmentation results for Sample 1, but for Samples 2, 3, and 4 it produces significant fragmentation in the detected cracks. When the PLT method is applied to Sample 1, significant noise interference is present in the crack image. For Samples 2, 3, and 4, the PLT results show partial loss of crack content, making it difficult to accurately extract crack information. Both methods therefore demonstrate inferior crack-extraction performance compared to the algorithm proposed in this paper.
4.3.3. Efficiency Analysis of the Proposed Algorithm
To analyze the operational efficiency of the proposed algorithm, Table 4 lists the runtime of CSA-BB, Remove-Fusion and Enhancement Module (R-FEM), R-FM, and R-EM, together with the average runtime per sample (T_ave). The same data are plotted as a line graph in Figure 17.
From Table 4 and Figure 17, it can be seen that the polyline for the proposed algorithm almost overlaps with that of R-EM, and the polyline for R-FEM nearly overlaps with that of R-FM. Since the two methods in each pair differ only in whether the brightness piecewise linear enhancement module is included, the time cost of this module is negligible. The T_ave of the proposed algorithm is only 0.67 s longer than that of R-FM, indicating that the proposed algorithm incurs only a small efficiency cost despite its added complexity.
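A per-sample runtime and its average can be collected with a small timing harness like the one below. This is a generic sketch and not the measurement procedure used for Table 4; the function names and the stand-in workload are hypothetical.

```python
import time


def average_runtime(process, samples):
    """Time process(sample) for each sample and return (per-sample times, T_ave)."""
    per_sample = []
    for sample in samples:
        start = time.perf_counter()
        process(sample)
        per_sample.append(time.perf_counter() - start)
    t_ave = sum(per_sample) / len(per_sample)
    return per_sample, t_ave


def dummy_segmentation(image):
    """Stand-in workload (hypothetical); a real pipeline variant would go here."""
    return [[1 if p < 100 else 0 for p in row] for row in image]


# Hypothetical samples: four small gray images of different brightness.
samples = [[[v] * 64 for _ in range(64)] for v in (50, 90, 130, 170)]
times, t_ave = average_runtime(dummy_segmentation, samples)
print(len(times), t_ave >= 0)  # 4 True
```

Running each pipeline variant (CSA-BB, R-FEM, R-FM, R-EM) through the same harness on the same samples keeps the comparison between variants consistent.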