Bimodal Image Fusion and Brightness Piecewise Linear Enhancement for Crack Segmentation

Li, Yong; Ji, Nian; Zhao, Fuzhe; Zhang, Huaiwen; Liu, Zeqi; Rai, Laxmisha; Deng, Zhaopeng

doi:10.3390/math14071235


by Yong Li ¹, Nian Ji ², Fuzhe Zhao ², Huaiwen Zhang ¹, Zeqi Liu ², Laxmisha Rai ³ and Zhaopeng Deng ⁴,*

¹ CNPC Engineering Technology R&D Company Limited, Beijing 102206, China

² School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China

³ School of Electronic Information Engineering, Shandong University of Science and Technology, Qingdao 266590, China

⁴ School of Information Management, Qingdao University of Technology, Qingdao 266520, China

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(7), 1235; https://doi.org/10.3390/math14071235

Submission received: 5 February 2026 / Revised: 27 March 2026 / Accepted: 3 April 2026 / Published: 7 April 2026


Abstract

Accurate segmentation of structural cracks is a core prerequisite for quantifying crack parameters, assessing damage severity, and providing early warning of structural safety. However, different types of structures exhibit significant individual variations in features such as color, texture, and brightness. Consequently, commonly used image segmentation algorithms struggle to establish a universal mathematical model, making it challenging to robustly identify and precisely segment crack targets amidst multi-feature disparities. To address the issue, this paper proposes a crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement (CSA-BB), and further enables parameter extraction and crack monitoring. The algorithm utilizes the complementary properties of visible-light and pseudo-color images for bimodal image fusion, thereby enhancing the detailed features of cracks. Furthermore, a brightness piecewise linear function has been devised that automatically selects appropriate parameters for image enhancement of structural cracks across varying background brightness. Subsequently, the crack region is effectively segmented using the bottom-hat transform and the OTSU algorithm. Ultimately, the crack’s safety level is determined from the acquired crack parameters, thereby enabling effective monitoring and assessment of the crack development process. In this paper, the proposed method achieves the best segmentation performance with a Dice coefficient of 0.4511 and a Jaccard index of 0.2981. Compared to the second-best algorithm, it yields significant improvements of 26.9% and 34.5%, respectively, demonstrating higher consistency with the ground truth. Moreover, superior computational efficiency and robustness are achieved, fulfilling the operational demands of real-world engineering environments.

Keywords:

structural crack; crack segmentation; image fusion; image brightness; crack monitoring; machine vision

MSC:

68T45

1. Introduction

Structural cracks are typically caused by the stress within a structure reaching its limit, resulting in insufficient load-bearing capacity. These cracks may be small initially, but over time, they can expand, compromising structural integrity. Ultimately, these cracks may lead to localized or overall instability, potentially even causing the collapse of the structure. By regularly monitoring crack development [1], potential structural issues can be promptly identified, preventing potential instability or collapse. Therefore, it is necessary to conduct further analysis of these cracks to ensure the safety and stability of the structural framework.

Traditional crack detection methods primarily rely on manual operations. Inspectors use crack width gauges, ultrasonic detection devices, rulers, and measuring tapes to measure cracks directly on the structure’s surface at the engineering site. However, manual measurement is subjective, costly, and inefficient, and it is difficult to guarantee the accuracy of its results. In recent years, with the rapid advancement of computer science and technology, machine vision has demonstrated tremendous potential and advantages for crack detection and analysis. This technology enables high-precision segmentation of structural images, accurately extracts crack morphological features, and provides a solid data foundation for quantitative analysis of cracks.

Existing image processing methods for structural crack detection generally fall into two categories: Digital Image Processing (DIP) [2] and deep-learning-based large models [3,4,5,6]. While DIP techniques—such as global/adaptive thresholding, edge detection (e.g., Sobel, Canny), morphological operations, region growing, and histogram equalization—are widely used, they face significant challenges in real-world engineering environments. Crack images often suffer from complex background textures, uneven illumination, dynamic grayscale variations, and low foreground–background contrast. Specifically, global thresholding is overly sensitive to illumination, failing when grayscale distributions overlap; edge detection is prone to noise-induced boundary discontinuities; and morphological operations rely on rigid, manually defined structuring elements that cannot adapt to irregular crack widths. Additionally, contrast enhancement techniques, such as histogram equalization, often degrade segmentation accuracy by amplifying background noise alongside crack features.

Deep-learning-based methods rely on large-scale annotated datasets to automatically learn hierarchical feature representations and demonstrate strong robustness to complex backgrounds and illumination variations, achieving significant success in crack monitoring. However, the performance of these methods heavily depends on the availability of large and accurately labeled datasets [7,8]. The collection and annotation of crack data are time-consuming and labor-intensive, and some types of crack samples are relatively scarce. In addition, significant variations in the morphology and characteristic parameters of the structural cracks further complicate model generalization.

Even with sufficient training data, deep-learning models often require complex network architectures, substantial computational resources, and long training times. In contrast, the method proposed in this paper does not rely on model training. Therefore, it offers advantages in terms of computational efficiency, interpretability, and ease of engineering deployment, making it more suitable for practical engineering scenarios, especially in resource-constrained environments.

After reviewing the current status of structural crack databases, it becomes evident that there is a significant shortage of database resources specifically targeting certain types of structures, such as tunnel cracks or geological borehole cracks. This situation hampers the effective development and application of deep-learning models in this field. Moreover, existing image enhancement and segmentation techniques generally lack effective adaptive parameter-tuning mechanisms when faced with diverse crack images, making it difficult to accurately adapt to their complex, varied characteristics. Therefore, this study proposes a novel crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement, and further extends it to dynamic monitoring of crack development. In summary, the contributions of this work can be outlined as follows:

This paper proposes a CSA-BB module for crack segmentation based on bimodal image fusion and brightness piecewise linear enhancement. The algorithm enhances the detailed information of the crack region using a pseudo-color image. It adaptively selects appropriate enhancement strategies based on the image’s brightness level, thereby achieving effective crack segmentation.
An adaptive brightness piecewise linear function is designed to address the significant grayscale variations caused by diverse illumination intensities and complex backgrounds. This module dynamically extracts the global brightness value of the input image. It adaptively selects the optimal adjustment parameters for the piecewise linear function, effectively highlighting the target crack regions against severe environmental interference.
A complete automated monitoring framework is established by extracting three key physical parameters of the segmented cracks. This provides critical quantitative support for safety assessment and decision-making regarding reinforcement of engineering structures. Furthermore, extensive comparative and ablation experiments on diverse crack datasets comprehensively validate the superior segmentation performance of the proposed CSA-BB algorithm and the effectiveness of its core modules.

The remaining sections of this paper are organized as follows. Section 2 provides a brief overview of related work in crack analysis. Section 3 introduces the crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement, along with crack development monitoring. Section 4 analyzes the experimental results. Finally, Section 5 summarizes the work presented in this paper.

2. Related Work

Many scholars began researching structural cracks using image processing technology as early as the late 1980s. Early methods for crack analysis included image denoising, image enhancement, and image segmentation. With the continuous development of deep learning, Convolutional Neural Networks (CNNs) have gradually been applied to crack monitoring and classification tasks. This section reviews relevant research on crack analysis based on machine vision.

Image enhancement is an important research direction in image processing. By enhancing the crack region in an image, subsequent crack detection and analysis can be performed more accurately. As early as 2014, Dou et al. [9] proposed a crack enhancement method based on feature structure coherence histogram equalization. This method enhances the visibility of cracks by adjusting the image’s grayscale distribution, thereby improving overall contrast. Subsequently, Ma et al. [10] improved the traditional histogram equalization method and proposed a different color-space fusion algorithm based on Contrast Limited Adaptive Histogram Equalization (CLAHE). This algorithm enhances local contrast while preserving global contrast, thereby reducing noise introduced by excessive enhancement. As research continued to advance, Yang et al. [11] proposed an enhancement method for asphalt pavement crack images using guided filtering and Retinex. They employed a two-dimensional discrete wavelet transform for image denoising and compression, and processed the wavelet’s low-frequency coefficients using a combination of guided filtering and Multi-Scale Retinex with Color Restoration (MSRCR). These methods mitigate noise to some extent, but due to their lack of adaptability, they struggle to automatically adjust processing parameters across different types of structural crack images.

In image segmentation, early methods included threshold-based segmentation [12,13,14,15,16], edge-detection-based segmentation [12,17,18,19,20], and region-growing-based segmentation [21]. Although these methods are simple and effective, their performance is limited when applied to crack images with significant noise. To address the noise problem, Chen et al. [22] proposed a method for detecting potential crack regions based on multi-threshold image processing. This algorithm combines global and local thresholding to segment the image, removing noise and small regions to extract potential crack areas. Xue et al. [23] proposed a dynamic segmentation Gaussian crack detection algorithm for tunnel lining crack segmentation based on the distribution of projection curves. This algorithm introduces a novel Dynamic Segmentation Gaussian (DSG) model, which incorporates threshold factors corresponding to different crack widths and multi-scale Gaussian factors into the Gaussian model to achieve crack segmentation. It effectively mitigates the impact of uneven illumination on crack detection. These algorithms perform well in their specific application scenarios but may struggle with diverse, complex images.

Parameter extraction is an essential step in crack monitoring systems. Li et al. [24] proposed a method for extracting crack image features based on mathematical morphology and connected-domain thresholding. Through operations such as morphological filtering, skeletonization, and region labeling, the geometric features of cracks are extracted. Tang et al. [25] proposed a novel crack-skeleton refinement algorithm and a width-measurement scheme based on backbone dual-scale features. This algorithm simplifies the redundant data in crack images and improves the efficiency of crack shape estimation. Peng et al. [26] proposed a method for identifying bridge cracks and quantifying their widths using drones and a hybrid feature learning approach. Using corresponding distance data, the actual width of the bridge cracks was measured and quantified using a distance-measurement method. This approach effectively calculates the width of bridge cracks.

Traditional crack-detection methods exhibit both advantages and limitations. Grounded in explicit mathematical models and classical image processing theories, they offer clear algorithmic structures, easy implementation, and low computational costs, providing strong interpretability and engineering practicality. Using techniques like contrast enhancement, thresholding, edge detection, and morphological analysis, these methods achieve effective crack detection and feature extraction in environments with simple backgrounds and uniform lighting. Their strengths primarily lie in low cost, rapid deployment, and applicability to small-sample scenarios. Nevertheless, their reliance on handcrafted features and empirical parameters makes them highly susceptible to illumination changes, noise, and complex backgrounds, leading to frequent false positives and false negatives. Additionally, the step-by-step processing pipeline can lead to error accumulation, limiting its generalization across diverse, large-scale datasets. Ultimately, while traditional methods hold significant value in early engineering deployments, they fall short in complex environmental adaptability and intelligent automation.

With the continuous advancement of deep learning, Convolutional Neural Networks (CNNs) and their derivatives have driven significant progress in crack image segmentation. Recently, various models have shown robust segmentation performance on public datasets, namely Crack500 [27], DeepCrack [28], and CFD [29], as compared in Table 1. For example, GAF-Net [30] enhances multi-scale feature extraction to achieve a high recall rate for complex crack morphologies, reaching an F1-score of 0.931 on the DeepCrack [28] dataset, which indicates strong crack-capturing capabilities. However, its complex structure and high computational overhead limit its effectiveness in low-contrast environments. Meanwhile, CrackDiff [31] utilizes diffusion modeling for fine-grained crack reconstruction, achieving high IoU scores across multiple datasets, though at the cost of slow inference and high training complexity. PAFNet [32] employs a lightweight architecture that ensures stability in cross-dataset evaluations, yet it lacks robustness against illumination changes and struggles with boundary refinement. Conversely, CrackW-Net [33] offers a simpler structure with fewer parameters, but yields lower overall segmentation accuracy and struggles with complex backgrounds.

Despite these accuracy improvements, the performance of deep-learning methods remains highly dependent on massive, high-quality annotated datasets. Data for specific crack types, such as tunnel or unique structural cracks, is often scarce. Additionally, real-world engineering images are frequently plagued by uneven lighting, complex backgrounds, and significant distribution shifts. Ultimately, these challenges degrade model generalization and training efficacy, hindering the widespread deployment of deep-learning models in practical engineering applications.

To address the issue of crack analysis in different structures under complex backgrounds, this paper proposes a novel structural crack-segmentation algorithm. This algorithm can adjust enhancement parameters based on image brightness and reduce the impact of lighting variations on processing outcomes through image fusion, thereby enabling effective crack analysis and monitoring. Through this algorithm, post-earthquake structural damage can be accurately identified and assessed, aiding in the rapid formulation of repair and reinforcement plans. This effectively prevents secondary disasters, thereby ensuring the safety and stability of engineering structures after an earthquake.

3. Methodology

The crack development monitoring system presented in this paper is primarily implemented through two steps: crack segmentation and development monitoring. In the crack image segmentation phase, this paper proposes a crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement (CSA-BB) to achieve precise segmentation of the crack region. In the crack development monitoring phase, this paper measures various crack parameters. It determines the safety level of cracks using predefined evaluation criteria, enabling effective monitoring and assessment of crack development.

3.1. Crack Segmentation

3.1.1. Overview of the CSA-BB Algorithm

The flowchart of the crack-segmentation algorithm proposed in this paper is shown in Figure 1 and consists of three steps: bimodal image fusion, brightness piecewise linear enhancement, and crack region segmentation. Firstly, this paper employs the Jet colormap method to generate a pseudo-color image from the original image, and uses a fusion function constructed in this study to merge the two images, resulting in a fused image. Then, this paper calculates the global brightness of the original image based on the brightness values of its red, green, and blue channels, and a brightness piecewise linear function is constructed using this global brightness to perform image enhancement. Finally, this paper enhances the crack region using the bottom-hat [34] transform and achieves accurate segmentation of crack areas through OTSU binarization and morphological closing.

3.1.2. Bimodal Image Fusion

Multimodal image processing refers to the process of analyzing, fusing, or processing image data obtained from different sensors or modalities. In this process, image data from different modalities are regarded as complementary information sources, which are comprehensively utilized to enhance the performance of image analysis and understanding. This paper draws on the idea of multimodal image fusion, utilizing the fusion of pseudo-color images with the original images. Pseudo-color images typically have high color contrast and vivid colors. This diversity makes the images visually richer, allowing more detailed features to be presented. Therefore, this fusion method preserves the original information carried by the original image while incorporating the color information from the pseudo-color image. This highlights the crack region in the image, resulting in better crack segmentation performance.

The process of the bimodal image fusion strategy proposed in this paper is shown in Figure 2 and consists of three main steps as follows:

Generating a pseudo-color image: This paper converts the original image to grayscale and produces a pseudo-color image using the Jet colormap.
Calculating the Bhattacharyya Coefficient (BC): This paper obtains the grayscale histograms of the pseudo-color image and the original image, and then computes the BC based on these histograms.
Constructing the bimodal fusion image: This paper constructs a fusion function based on the BC and utilizes both the original image and the pseudo-color image to generate the fused image.

Figure 2. Flowchart of the bimodal image fusion method.

(1) Generating the pseudo-color image

This paper obtains the pseudo-color image of the crack grayscale image using the Jet colormap technique. This mapping is based on variations in hue and assigns a different color to each grayscale level, so different grayscale values show obvious color differences in the pseudo-color image. The calculation formulas, derived from the mathematical implementation of MATLAB’s built-in jet colormap function [35], are shown in Equations (1)–(3):

r(x) = \begin{cases} 0, & 0 \leq x \leq \frac{3}{8} \\ 4x - \frac{3}{2}, & \frac{3}{8} \leq x \leq \frac{5}{8} \\ 1, & \frac{5}{8} \leq x \leq \frac{7}{8} \\ \frac{9}{2} - 4x, & \frac{7}{8} \leq x \leq 1 \end{cases}

(1)

g(x) = \begin{cases} 0, & 0 \leq x \leq \frac{1}{8} \\ 4x - \frac{1}{2}, & \frac{1}{8} \leq x \leq \frac{3}{8} \\ 1, & \frac{3}{8} \leq x \leq \frac{5}{8} \\ \frac{7}{2} - 4x, & \frac{5}{8} \leq x \leq \frac{7}{8} \\ 0, & \frac{7}{8} \leq x \leq 1 \end{cases}

(2)

b(x) = \begin{cases} 4x + \frac{1}{2}, & 0 \leq x \leq \frac{1}{8} \\ 1, & \frac{1}{8} \leq x \leq \frac{3}{8} \\ \frac{5}{2} - 4x, & \frac{3}{8} \leq x \leq \frac{5}{8} \\ 0, & \frac{5}{8} \leq x \leq 1 \end{cases}

(3)

where x denotes the input grayscale value, and r(x), g(x), and b(x) represent the color values of the red, green, and blue channels, respectively. These three functions define how different grayscale levels are assigned values in the red, green, and blue channels, respectively. Once each channel is assigned specific values, the Jet colormap is achieved. The effects are shown in Figure 3b.
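As an illustration of Equations (1)–(3), the following minimal Python sketch evaluates the three channel functions for a single gray level. It assumes that x has been normalized to [0, 1] (for an 8-bit image, x = gray / 255); the function name jet_rgb is hypothetical and is not taken from the paper or from MATLAB.

```python
def jet_rgb(x):
    """Map a normalized gray level x in [0, 1] to (r, g, b) per Equations (1)-(3).

    Assumption: x is the gray level divided by 255; the paper only calls it
    "the input grayscale value".
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")

    # Red channel, Equation (1)
    if x <= 3 / 8:
        r = 0.0
    elif x <= 5 / 8:
        r = 4 * x - 1.5
    elif x <= 7 / 8:
        r = 1.0
    else:
        r = 4.5 - 4 * x

    # Green channel, Equation (2)
    if x <= 1 / 8:
        g = 0.0
    elif x <= 3 / 8:
        g = 4 * x - 0.5
    elif x <= 5 / 8:
        g = 1.0
    elif x <= 7 / 8:
        g = 3.5 - 4 * x
    else:
        g = 0.0

    # Blue channel, Equation (3)
    if x <= 1 / 8:
        b = 4 * x + 0.5
    elif x <= 3 / 8:
        b = 1.0
    elif x <= 5 / 8:
        b = 2.5 - 4 * x
    else:
        b = 0.0

    return r, g, b
```

Dark gray levels map toward blue and bright levels toward red, so a dark crack pixel and a slightly brighter background pixel can receive visibly different hues.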

(2) Calculating the BC

Crack images typically contain a certain amount of noise interference, which can affect the accuracy of image similarity [11] measurements. The BC, by statistically analyzing the overall distribution of image color and grayscale values, can effectively reduce the impact of noise on image similarity. Therefore, this paper uses the BC of the image histograms as a measure of the similarity between two images. The BC ranges from 0 to 1, where 0 indicates that the two images are completely dissimilar, and 1 indicates that the two images are completely identical. The calculation formula for the BC is shown in Equation (4):

BC = \sum_{i = 0}^{255} \sqrt{p_{i} q_{i}}

(4)

where i represents the grayscale level, and p_i and q_i denote the probability of the i-th grayscale level occurring in the histograms of the two images, respectively. BC refers to the Bhattacharyya coefficient.
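Equation (4) can be sketched in a few lines of Python. The helper names below are hypothetical, and the sketch assumes 8-bit gray levels supplied as a flat list of integers; it is an illustration of the formula rather than the authors' implementation.

```python
import math


def normalized_histogram(pixels, levels=256):
    """Return the probability of each gray level in a flat list of pixel values."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def bhattacharyya_coefficient(p, q):
    """Equation (4): sum over gray levels of sqrt(p_i * q_i)."""
    return sum(math.sqrt(p_i * q_i) for p_i, q_i in zip(p, q))
```

Two images with identical histograms give a coefficient of 1, and two images whose histograms share no gray level give 0, which matches the range stated above.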

(3) Constructing the bimodal fusion image

To achieve the fusion of the pseudo-color image with the original image, this paper constructs a fusion function. The goal of this function is to highlight the detailed features of the crack while preserving the original information of the image using appropriate fusion weights. Determining the fusion weights through the BC helps balance the information from the pseudo-color image and the original image, ensuring that the fusion result retains both the enhanced features from the pseudo-color image and the true details from the original image. The bimodal fusion function constructed in this paper is shown in Equation (5):

F(x, y) = α \cdot A(x, y) + β \cdot B(x, y) = \frac{BC_{1}}{BC_{1} + BC_{2}} \cdot A(x, y) + \frac{BC_{2}}{BC_{1} + BC_{2}} \cdot B(x, y)

(5)

where A(x, y) and B(x, y) represent the pixel values of the pseudo-color image and the original image at the coordinates (x, y), respectively, and α and β are their respective fusion weights. BC₁ is the Bhattacharyya coefficient between the pseudo-color image and the original image, and BC₂ is the Bhattacharyya coefficient between the two original images. F(x, y) denotes the pixel value of the fused image at the coordinates (x, y). If BC₁ < BC₂, the original image contains more original information. Because BC₁ is the numerator of the fusion weight for the pseudo-color image and BC₂ is the numerator of the fusion weight for the original image, the pseudo-color image then receives the lower weight and the original image the higher weight, so this information is preserved in the fused image. The results are shown in Figure 4. It can be observed that the bimodal image fusion strategy enhances the visual effect of the image and highlights more details and features of the crack.
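A minimal sketch of Equation (5) follows. It assumes that both images are given as nested lists of equal size holding one channel each (a color image would be fused channel by channel), and the function names are hypothetical.

```python
def fusion_weights(bc1, bc2):
    """Return (alpha, beta) = (BC1 / (BC1 + BC2), BC2 / (BC1 + BC2))."""
    total = bc1 + bc2
    if total <= 0:
        raise ValueError("BC1 + BC2 must be positive")
    return bc1 / total, bc2 / total


def fuse_images(pseudo, original, bc1, bc2):
    """Equation (5): weighted per-pixel sum of the pseudo-color and original images."""
    alpha, beta = fusion_weights(bc1, bc2)
    return [
        [alpha * a + beta * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pseudo, original)
    ]
```

For example, with illustrative values BC₁ = 0.6 and BC₂ = 1.0, the weights are 0.375 and 0.625, so a pseudo-color value of 200 and an original value of 40 fuse to 100. The two weights always sum to 1, which keeps the fused pixel within the range of its inputs.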

3.1.3. Brightness Piecewise Linear Enhancement

Crack images are acquired under differing conditions, such as illumination intensity, background color, and image resolution, so the grayscale values of different fused images vary significantly. As a result, fixed-parameter image enhancement techniques enhance the crack region poorly. To address this issue, this paper proposes a brightness-based piecewise linear function that adaptively selects adjustment parameters based on the image’s brightness, thereby enhancing the image to highlight the crack region. The function is shown in Equation (6):

g(x, y) = \begin{cases} \frac{10}{α T + β} f(x, y), & 0 \leq f(x, y) \leq α T + β \\ \frac{γ T + δ - 10}{200 - (α T + β)} [f(x, y) - (α T + β)] + 10, & α T + β \leq f(x, y) \leq 200 \\ \frac{255 - (γ T + δ)}{55} [f(x, y) - 200] + γ T + δ, & 200 < f(x, y) \leq 255 \end{cases}

(6)

where f(x, y) represents the original grayscale value of the pixel at coordinates (x, y) in the image, g(x, y) represents the transformed grayscale value, and T represents the global brightness of the image, i.e., the average of the brightness values of all pixels, which reflects the overall brightness of the image. The global brightness T of the image can be obtained according to Equation (7):

T = \frac{1}{W \times H} [\sum_{i = 1}^{n} R (x, y) + \sum_{i = 1}^{n} G (x, y) + \sum_{i = 1}^{n} B (x, y)]

(7)

where W and H are the width and height of the original image, n is the total number of pixels in the image, and R(x, y), G(x, y), and B(x, y) represent the brightness values of the red, green, and blue channels at the point (x, y) in the image, respectively. The function adaptively selects adjustment parameters based on the brightness T. In this paper, a threshold Br is set. If T < Br, the parameters are [α; β; γ; δ] = [1.2; −112; −0.6; 331]; if T > Br, the parameters are [α; β; γ; δ] = [0.2; 90; −1.2; 394]. The polyline graph of the brightness piecewise linear function is shown in Figure 5.
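The piecewise mapping of Equation (6) can be sketched as follows. This is an illustration under stated assumptions: the global brightness T and the threshold Br are passed in by the caller (this section does not give a value for Br), the case T = Br is not specified in the text and is assigned here to the second parameter set, and the function name is hypothetical.

```python
# Parameter sets [alpha, beta, gamma, delta] quoted in the text.
PARAMS_BELOW_BR = (1.2, -112.0, -0.6, 331.0)  # used when T < Br
PARAMS_ABOVE_BR = (0.2, 90.0, -1.2, 394.0)    # used when T > Br (and, by assumption, T == Br)


def enhance_pixel(f, T, Br):
    """Apply Equation (6) to one gray value f in [0, 255]."""
    alpha, beta, gamma, delta = PARAMS_BELOW_BR if T < Br else PARAMS_ABOVE_BR
    knee = alpha * T + beta         # first breakpoint on the input axis
    out_at_200 = gamma * T + delta  # output value at the second breakpoint, f = 200
    if not 0 < knee < 200:
        raise ValueError("alpha*T + beta must lie strictly between 0 and 200")
    if f <= knee:
        # Segment 1: compress the darkest values into [0, 10]
        return 10.0 * f / knee
    if f <= 200:
        # Segment 2: stretch the mid range from 10 up to gamma*T + delta
        return (out_at_200 - 10.0) / (200.0 - knee) * (f - knee) + 10.0
    # Segment 3: map (200, 255] onto the remaining output range
    return (255.0 - out_at_200) / 55.0 * (f - 200.0) + out_at_200
```

With the illustrative values T = 150 and Br = 170, the first breakpoint is 1.2 × 150 − 112 = 68 and the output at f = 200 is −0.6 × 150 + 331 = 241, so the three segments join continuously at (68, 10) and (200, 241) and end at (255, 255).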

The fused images are converted to grayscale, resulting in the grayscale images shown in Figure 6b. Then, the grayscale images are enhanced using the brightness piecewise linear function proposed in this study, as shown in Figure 6c. The comparison shows that, despite varying illumination intensities and noise interference in the original crack images, the brightness piecewise linear function enhances the crack details and highlights their contrast.

3.1.4. Crack Region Segmentation

To separate the crack region from the background in the images and further obtain statistical information and features about the crack, this paper performs a crack segmentation operation. This section mainly includes three steps: bottom-hat transformation, OTSU algorithm, and morphological closing operation. The bottom-hat transformation effectively extracts small features in the image that are darker than the background in the target area. The OTSU algorithm can automatically calculate an optimal threshold to effectively separate the crack from the background in an image, thus avoiding the subjectivity and uncertainty associated with manual threshold selection. Finally, this paper employs a morphological closing operation to connect intermittent cracks and smooth their boundaries.

The principle of the bottom-hat transformation involves subtracting the original image from the result of the closing operation, thereby obtaining the valley portions filled by the closing operation. In the crack image, these valley portions typically correspond to the darker crack region, known as the “black bottom-hat.” The definition of the bottom-hat transformation is shown in Equation (8):

I'' = (I \bullet B) - I

(8)

where I represents the original image, and B is the structuring element, which is a small, predefined binary matrix. Specifically, in this study, a disk-shaped structuring element with a radius of 5 pixels is employed as B to effectively enhance the dark crack features. I • B denotes the closing operation, and I″ is the image after the bottom-hat transformation. The results of the bottom-hat transformation are shown in Figure 7b. As depicted in the figures, the bottom-hat transformation effectively suppresses the background regions. It enhances the details of darker parts of images, making features such as crack morphology, orientation, and distribution patterns clearer and more visible.
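The bottom-hat transformation of Equation (8) can be illustrated with a small pure-Python sketch on nested lists. The border handling is an assumption (neighbors outside the image are simply ignored), the helper names are hypothetical, and a production implementation would normally rely on an image processing library instead.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of a disk-shaped structuring element of the given radius."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span if dy * dy + dx * dx <= radius * radius]


def _morph(img, offsets, pick):
    """Apply pick (max or min) over each pixel's in-bounds neighborhood."""
    h, w = len(img), len(img[0])
    return [
        [
            pick(img[y + dy][x + dx]
                 for dy, dx in offsets
                 if 0 <= y + dy < h and 0 <= x + dx < w)
            for x in range(w)
        ]
        for y in range(h)
    ]


def closing(img, offsets):
    """Grayscale closing: dilation (max) followed by erosion (min)."""
    return _morph(_morph(img, offsets, max), offsets, min)


def bottom_hat(img, offsets):
    """Equation (8): closing of the image minus the image itself."""
    closed = closing(img, offsets)
    return [[c - p for c, p in zip(c_row, p_row)] for c_row, p_row in zip(closed, img)]
```

On a uniform background of 100 crossed by a one-pixel-wide dark line of value 20, the closing fills the line back to 100, so the bottom-hat result is 80 on the line and 0 elsewhere. Calling disk_offsets(5) gives the radius-5 disk described above.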

The process of binarizing an image using the OTSU algorithm primarily involves finding a threshold T that maximizes the inter-class variance. By maximizing the inter-class variance, this study achieves optimal segmentation of the crack. The formula for inter-class variance is shown as Equation (9):

σ_{a b}^{2} (T) = \frac{N_{a} \cdot N_{b} \cdot {(μ_{a} - μ_{b})}^{2}}{N^{2}}

(9)

where N_a is the number of foreground pixels, N_b is the number of background pixels, μ_a is the average grayscale value of the foreground, μ_b is the average grayscale value of the background, N represents the total number of pixels in the image, and σ_ab²(T) represents the inter-class variance. By iterating through all possible thresholds T according to the calculation formula for inter-class variance, the threshold that maximizes the inter-class variance is identified as the optimal threshold. The results of image binarization are shown in Figure 8b. It can be seen that the OTSU algorithm effectively separates the cracks from the background regions while suppressing noise interference.
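The exhaustive search over thresholds can be sketched directly from Equation (9). The sketch assumes the input is a 256-bin histogram of pixel counts and treats gray levels at or below the threshold as one class and levels above it as the other; since the variance expression is symmetric in the two classes, which one is called foreground does not affect the result. The function name is hypothetical.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximizes the inter-class variance of Equation (9).

    hist[i] is the number of pixels with gray level i.
    Class a holds levels <= t, class b holds levels > t.
    """
    total = sum(hist)
    weighted_total = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    n_a = 0      # running pixel count of class a
    sum_a = 0    # running sum of gray levels in class a
    for t in range(len(hist) - 1):
        n_a += hist[t]
        sum_a += t * hist[t]
        n_b = total - n_a
        if n_a == 0 or n_b == 0:
            continue  # one class is empty, the variance is undefined
        mu_a = sum_a / n_a
        mu_b = (weighted_total - sum_a) / n_b
        variance = n_a * n_b * (mu_a - mu_b) ** 2 / total ** 2
        if variance > best_var:
            best_var, best_t = variance, t
    return best_t
```

For a histogram with two well-separated peaks, any threshold between the peaks yields the same maximal variance, and this sketch returns the lowest such threshold.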

In morphological processing, the closing operation can be described as performing a dilation followed by an erosion on the input image. This process fills small holes within objects in the image. In this paper, the closing operation is used to connect the broken parts within the crack region and smooth the object boundaries. The expression for the closing operation is shown in Equation (10).

I^{ϕ} (x) = ε_{B} [δ_{B} (I (x))]

(10)

where I^φ(x) represents the image after the closing operation, I(x) represents the original binarized image, δ_B denotes the dilation operation, and ε_B denotes the erosion operation, so that the dilation is applied first and the erosion second. The results of the closing operation are shown in Figure 9b. It can be observed that the closing operation fills the gaps within the cracks, making them more continuous and complete, thereby achieving effective segmentation of the crack regions.

3.2. Monitoring Crack Development

By analyzing crack parameters, the structural integrity risk is predicted, enabling systematic monitoring of cracks. Monitoring data can provide a scientific basis for engineering management and maintenance decisions, helping to formulate rational maintenance plans to ensure the stability and safety of structures. In this section, crack parameters are obtained through two steps: crack region labeling and crack parameter extraction.

(1) Crack region labeling

In this study, the eight-connected region labeling method is employed to extract multiple crack regions from a segmented image. This method visualizes connected regions, allowing for clearer observation of the distribution, morphology, and topological structure of cracks. It helps identify potential structural features and regular patterns. In this paper, a threshold T (set to 15) is used to eliminate small interference regions with pixel counts below T, as indicated by the blue regions in Figure 10b. The connected region labeling method can assign a unique label to each crack, enabling the counting and localization of cracks, and facilitating further quantitative analysis.
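The labeling step can be sketched with a breadth-first search over eight-connected neighbors. This is a minimal illustration that assumes the segmented image is a nested list of 0/1 values; the function name and the return format are hypothetical, and the default of 15 pixels follows the threshold quoted above.

```python
from collections import deque


def label_regions(mask, min_pixels=15):
    """Label eight-connected regions of a binary mask, dropping small ones.

    Returns (labels, sizes): labels is a grid holding 0 for background or
    discarded pixels and 1..k for kept regions; sizes maps label -> pixel count.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    visited = [[False] * w for _ in range(h)]
    sizes = {}
    next_label = 1
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or visited[y0][x0]:
                continue
            # Collect every pixel reachable from (y0, x0) through 8-neighbors.
            visited[y0][x0] = True
            region = [(y0, x0)]
            queue = deque(region)
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if ((dy or dx) and 0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not visited[ny][nx]):
                            visited[ny][nx] = True
                            region.append((ny, nx))
                            queue.append((ny, nx))
            # Regions with fewer pixels than the threshold are treated as interference.
            if len(region) >= min_pixels:
                for y, x in region:
                    labels[y][x] = next_label
                sizes[next_label] = len(region)
                next_label += 1
    return labels, sizes
```

Because diagonal neighbors count as connected, a thin crack running diagonally is kept as one region rather than being split into single pixels. The resulting region sizes are the per-region pixel counts needed for subsequent parameter extraction.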

(2) Crack parameter extraction

In the crack evaluation system, different crack parameters represent various properties of cracks. Many previous studies have evaluated crack development by introducing geometric parameters such as crack area, length, and width. However, due to differences in the imaging field of view during image acquisition, the scale of cracks may vary among images. In addition, variations in shooting distance and angle may further cause differences in the size and scale of each acquired crack image, making it difficult to objectively evaluate the influence of crack area on the entire structure. To reduce the influence of these factors, this study adopts ratio-based indicators to describe crack characteristics. Specifically, the crack area reflects the scale of the crack, the crack length indicates the extent of crack propagation, and the crack width represents the degree of crack opening. Therefore, this study assesses the damage degree of the structure and determines its safety level by calculating three parameters: crack area ratio, crack length ratio, and crack width ratio.

(1) Crack area ratio

The crack area is a paramount indicator of structural deterioration, as it reflects a reduction in the effective cross-sectional area and bearing capacity. Generally, a larger crack area signifies more severe internal damage. Therefore, this study assesses the degree of structural damage by calculating the crack area ratio. The crack area ratio can be computed using Formula (11):

A = \frac{\sum_{i = 1}^{n} m_{i}}{W \times H} \times 100 %

(11)

where n represents the total number of independent crack regions detected in the image; this value is determined automatically by counting the isolated connected components in the binary segmentation map rather than being selected manually. m_i represents the number of pixels in the i-th crack region, and W and H denote the width and height of the original image, respectively.

(2) Crack length ratio

Crack length is a critical metric in crack detection, often associated with stress concentration zones. Long cracks typically occur at stress concentration points, such as structural joints or areas of concentrated load. In this study, morphological operations are used to iteratively refine the binary image, gradually removing edge pixels of cracks until obtaining a one-pixel-wide crack skeleton, as shown in Figure 11. As expressed in Equation (12), L provides a scale-invariant metric by calculating the ratio of the total skeleton pixel length to the maximum diagonal length of the image.

L = \frac{\sum_{i = 1}^{n} q_{i}}{\sqrt{W^{2} + H^{2}}} \times 100 %

(12)

where n represents the number of skeletons, and q_i represents the number of pixels in the i-th skeleton.

(3) Crack width ratio

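Measuring and recording crack widths supports the assessment of structural safety and the selection of appropriate repair and reinforcement measures. In this study, the average crack width is calculated from the obtained crack area and length. The formula for the crack width ratio is shown as Equation (13), where W on the left-hand side denotes the crack width ratio, whereas W and H on the right-hand side denote the image width and height as in Equation (11):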

W = \frac{A}{\sum_{i = 1}^{n} q_{i} \times \sqrt{W^{2} + H^{2}}} = \frac{\sum_{i = 1}^{n} m_{i}}{\sum_{i = 1}^{n} q_{i} \times W \times H \times \sqrt{W^{2} + H^{2}}} \times 100 %

(13)
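To make Equations (11) to (13) concrete, the sketch below evaluates the three ratios from per-region pixel counts m_i, per-skeleton pixel counts q_i, and the image size. The function name and the input numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def crack_ratios(region_pixels, skeleton_pixels, img_width, img_height):
    """Return (area ratio, length ratio, width ratio) following Equations (11)-(13).

    region_pixels   -- pixel counts m_i of the labelled crack regions
    skeleton_pixels -- pixel counts q_i of the crack skeletons
    """
    diagonal = math.hypot(img_width, img_height)            # sqrt(W^2 + H^2)
    total_skeleton = sum(skeleton_pixels)
    area_ratio = sum(region_pixels) / (img_width * img_height) * 100.0   # Eq. (11)
    length_ratio = total_skeleton / diagonal * 100.0                     # Eq. (12)
    width_ratio = area_ratio / (total_skeleton * diagonal)               # Eq. (13)
    return area_ratio, length_ratio, width_ratio


# Hypothetical 400 x 300 image (diagonal 500 px) with two crack regions.
A, L, Wr = crack_ratios(region_pixels=[900, 300],
                        skeleton_pixels=[200, 50],
                        img_width=400, img_height=300)
print(A, L, Wr)  # area 1 %, length 50 %, width ratio 8e-6
```

Because the width ratio divides the area ratio by both the skeleton length and the image diagonal, it takes very small values, which is consistent with the order of magnitude of the width threshold used in Section 4.2.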

4. Experiments

To verify the effectiveness of the proposed CSA-BB algorithm, this section applies it to 50 sample images selected from four widely used datasets: CFD [29], DeepCrack [28], Cracktree200 [36], and CRACK500 [27]. These datasets collectively cover diverse crack morphologies, varying illumination conditions, and complex backgrounds. Comparative experiments then evaluate the segmentation performance of the algorithm against four other segmentation algorithms. In addition, modular ablation experiments verify the effectiveness and efficiency of the bimodal image fusion and brightness piecewise linear enhancement modules.

All experiments are conducted using MATLAB R2024a on a Windows 10 platform, powered by an Intel Core i7-11800H processor and an NVIDIA GeForce RTX 3060 GPU with 6 GB of VRAM, supported by 16 GB of system memory.

4.1. Comparison of Crack Segmentation Results

In this section, four different types of structural crack image samples are selected as experimental subjects to evaluate the segmentation performance of the proposed CSA-BB algorithm. Simultaneously, the segmentation performance of the proposed algorithm is compared with that of four other crack-segmentation algorithms, as shown in Figure 12.

As shown in Figure 12, the performance of different conventional methods varies significantly in the task of crack segmentation under complex background conditions. The OTSU thresholding method achieves relatively satisfactory results when the background is homogeneous and noise interference is limited, as illustrated in Sample 3 (c) and Sample 4 (c), where the crack regions can be largely separated. However, when severe grayscale overlap exists between cracks and the background, accompanied by strong local noise, its segmentation capability deteriorates considerably. In Sample 1 (c) and Sample 2 (c), problems such as crack discontinuity, adhesion to the background, and false segmentation are observed, making reliable crack extraction difficult.

Although the Canny edge detection algorithm exhibits a certain degree of noise suppression, it tends to produce unstable edge responses. In particular, blurred and fragmented crack boundaries can be observed in Sample 2 (d) and Sample 3 (d), preventing the formation of complete and precise crack contours and thus limiting subsequent geometric and morphological parameter analysis. The segmentation results obtained using the Laplacian of Gaussian (LoG) operator for all four samples are characterized by dense noise responses and poor discrimination between cracks and background, leading to unsatisfactory segmentation quality. Similarly, the Prewitt operator shows deficiencies in detail preservation; in Sample 4 (f), fine crack branches and local structural features are noticeably lost, compromising the integrity of the crack representation.

In contrast, the proposed method demonstrates consistently stable and superior segmentation performance across all four samples. Regardless of whether the background is uniform or characterized by severe grayscale overlap and strong noise interference, the proposed approach effectively suppresses background disturbances while preserving clear and continuous crack boundaries. As shown in Sample 1 (g) to Sample 4 (g), both the main crack structures and fine details are well maintained, with significantly reduced false detections. These results indicate that the proposed method outperforms OTSU, Canny, LoG, and Prewitt in terms of noise robustness and adaptability to complex scenes, enabling more reliable crack segmentation and providing a solid foundation for subsequent crack parameter extraction and evolution monitoring.

As shown in Table 2, the proposed method achieves the best overall performance, with an average Dice coefficient of 0.4511 and an average Jaccard index of 0.2981, which are clearly higher than those of Prewitt (0.2119/0.1258), OTSU (0.0894/0.0470), Canny (0.3556/0.2217), and LoG (0.2921/0.1745). These results indicate that the proposed method has a more pronounced advantage in terms of crack-region extraction accuracy and overlap with the manual annotations.

A further analysis of individual samples shows that the proposed method performs particularly well on Sample 1 and Sample 2. For instance, on Sample 1, the proposed method achieves a Dice score of 0.4834, which is substantially higher than those of LoG (0.3281) and Canny (0.1675). On Sample 2, its Dice score reaches 0.5964, exceeding that of the second-best method, Canny (0.3993), by nearly 0.20. This demonstrates that the proposed method is capable of preserving the main crack structure and effectively suppressing background interference, even in the presence of relatively complex backgrounds and thin, elongated cracks. For Sample 3, however, the Dice score of the proposed method is 0.2794, which is lower than that of LoG (0.4030) and Canny (0.3836), suggesting that its response to locally weak or fine crack details is somewhat limited in this case. For the last sample, Canny achieves the best result, with a Dice score of 0.4717, slightly higher than that of the proposed method (0.4450), indicating that traditional edge-detection methods may still retain certain advantages when crack boundaries are clear and background interference is relatively low.

In summary, the proposed method yields the best average performance and the highest scores on two of the four samples (Samples 1 and 2). Although it is outperformed by Canny or LoG on Samples 3 and 4, its overall segmentation performance and practical effectiveness indicate that it is more suitable for crack segmentation tasks under complex background conditions.
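For readers who wish to reproduce the scores in Table 2, the sketch below computes the Dice coefficient and the Jaccard index for a pair of binary masks, assuming the standard overlap definitions D = 2|P ∩ G| / (|P| + |G|) and J = |P ∩ G| / |P ∪ G|. The function names and the toy masks are hypothetical.

```python
def dice_jaccard(pred, truth):
    """Dice and Jaccard scores of two equally sized binary masks (lists of rows)."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    inter = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    union = total - inter
    dice = 2 * inter / total if total else 1.0      # two empty masks agree perfectly
    jaccard = inter / union if union else 1.0
    return dice, jaccard


def jaccard_from_dice(d):
    """For a single image the two scores are linked by J = D / (2 - D)."""
    return d / (2 - d)


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
d, j = dice_jaccard(pred, truth)
print(round(d, 4), round(j, 4))

# Per-sample Dice scores of the proposed method reported in Table 2.
print([round(jaccard_from_dice(x), 4) for x in (0.4834, 0.5964, 0.2794, 0.4450)])
```

The identity J = D / (2 - D) holds per image, which is why the per-sample columns of Table 2 are mutually consistent; it does not hold exactly for the averaged column, because the average of ratios is not the ratio of averages.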

As illustrated in Figure 13, the proposed CSA-BB algorithm demonstrates highly robust segmentation performance across a variety of challenging conditions. Row (a) presents seven distinct original structural crack images, deliberately selected to showcase significant individual variations in color, texture, and background brightness, including uneven illumination and complex surface materials. Row (b) displays the corresponding binary segmentation masks generated by the proposed algorithm. It is visually evident that despite the multi-feature disparities and complex backgrounds, the CSA-BB algorithm successfully overcomes these interferences to precisely segment the crack targets. The clear and continuous extraction of the crack networks confirms that the bimodal image fusion and brightness piecewise linear enhancement modules effectively adapt to varying environmental conditions, fulfilling the demand for robust identification in real-world engineering scenarios.

As illustrated in Figure 14, the proposed CSA-BB algorithm demonstrates stable crack segmentation performance under noise interference. Row (a) displays the original crack samples, while Row (b) shows the same images corrupted by salt-and-pepper noise. This noise introduces grayscale overlap between the cracks and the background, a condition that typically causes false segmentation in conventional methods. However, as shown in Sample 1 (c) through Sample 4 (c), the proposed approach effectively suppresses background noise while preserving continuous crack boundaries. The algorithm extracts both the main crack structures and local branches without producing fragmented contours or isolating noise artifacts. These results indicate that the bimodal image fusion and enhancement modules preserve target integrity, providing a reliable basis for subsequent geometric parameter extraction.

4.2. Crack Monitoring

To assess the safety and stability of structures based on crack monitoring data, and to develop scientifically reasonable maintenance plans that delay structural aging, this paper establishes three levels of crack hazard according to the crack parameters [37,38]. The parameter thresholds can be customized according to specific structural types and engineering inspection requirements.

(1) Safe Crack

A \leq 1 % \cap L \leq 50 \cap W \leq 10^{- 5}

(2) Crack to be Monitored

(A \leq 1 % \cap L \leq 50 \cap W > 10^{- 5}) \cup (A \leq 1 % \cap L > 50 \cap W \leq 10^{- 5}) \cup (A > 1 % \cap L \leq 50 \cap W \leq 10^{- 5})

(3) Hazardous Crack

(A \leq 1 % \cap L > 50 \cap W > 10^{- 5}) \cup (A > 1 % \cap L \leq 50 \cap W > 10^{- 5}) \cup (A > 1 % \cap L > 50 \cap W \leq 10^{- 5}) \cup (A > 1 % \cap L > 50 \cap W > 10^{- 5})
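The three set expressions above reduce to counting how many of the three thresholds are exceeded: none gives a safe crack, exactly one gives a crack to be monitored, and two or more give a hazardous crack. The sketch below encodes that rule; the function name is hypothetical, the default thresholds are the values stated above, and the width ratio is assumed to be supplied in the same unit as its threshold.

```python
def crack_level(area_pct, length_pct, width_ratio,
                area_max=1.0, length_max=50.0, width_max=1e-5):
    """Classify a crack by the number of exceeded thresholds (A, L, W)."""
    exceeded = sum([area_pct > area_max,
                    length_pct > length_max,
                    width_ratio > width_max])
    if exceeded == 0:
        return "Safe Crack"
    if exceeded == 1:
        return "Crack to be Monitored"
    return "Hazardous Crack"


print(crack_level(1.7, 102, 4.6e-6))   # area and length both exceed their limits
print(crack_level(0.8, 80, 1.6e-6))    # only the length exceeds its limit
print(crack_level(0.5, 20, 5.0e-6))    # hypothetical values, nothing exceeded
```

Keeping the thresholds as keyword arguments reflects the statement that they can be customized for specific structural types and inspection requirements.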

Due to variations in shooting angles and scales among different crack images, it is impossible to standardize the units for measuring crack length and width. Therefore, this paper uses pixel count (pc) as the parameter unit for evaluating crack length and width. Crack parameters and risk levels are shown in Table 3. For Sample 1, the crack area ratio reaches 1.7%, and the normalized crack length ratio is 102%. Since both parameters exceed the established safe thresholds, Sample 1 is classified as a “Hazardous Crack”. Similarly, Samples 2 and 4 partially exceed the safe criteria, meeting the conditions for “Cracks to be Monitored,” and are thus classified as such. Sample 3 satisfies all three criteria for a “Safe Crack,” and is classified accordingly. Seen in conjunction with Figure 12, in Sample 1, cracks bifurcate in the middle section into two branches extending left and right, with a significant width, rendering the entire structure more unstable. In contrast, cracks in Sample 4 also bifurcate, but with narrower widths and a smaller area. Samples 2 and 3 exhibit cracks without bifurcation, with shorter lengths, thereby demonstrating higher stability compared to Sample 1. It can be concluded that the physical characteristics analysis of the cracks in Figure 12 is largely consistent with the statistical results of the parameters in Table 3.

4.3. Ablation Studies

To validate the effectiveness of the bimodal image fusion module and the brightness piecewise linear enhancement module for crack segmentation in the proposed CSA-BB algorithm, this section conducts ablation experiments on these two modules separately. Comparative analysis of algorithm efficiency is also performed.

4.3.1. Ablation Study of the Image Fusion Module

To validate the effectiveness of the image fusion module in the proposed CSA-BB, this section compares the performance of the proposed algorithm under different image fusion strategies. The experiment is designed as follows:

(1) Removal of the proposed bimodal image fusion module in this study: Remove-Fusion Module (R-FM).

(2) Setting three arbitrarily chosen pairs of fusion weights (α and β) for combining the pseudo-color image with the original image: Fusion Weight-1 (FW-1), where α = 0.1, β = 0.9; Fusion Weight-2 (FW-2), where α = 0.3, β = 0.7; Fusion Weight-3 (FW-3), where α = 0.5, β = 0.5.
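The three weight settings can be illustrated with a simple per-pixel weighted sum. This is a sketch under explicit assumptions: the fusion is taken to be a linear blend, α is assumed to weight the pseudo-color image and β the original image, and a single 8-bit channel is used for brevity (a color image would be blended channel by channel). The exact fusion rule is the one defined in Section 3.1.2; the function name and pixel values here are hypothetical.

```python
def fuse(pseudo, original, alpha, beta):
    """Per-pixel weighted sum alpha * pseudo + beta * original, clipped to 0..255."""
    return [[min(255, max(0, round(alpha * p + beta * o)))
             for p, o in zip(prow, orow)]
            for prow, orow in zip(pseudo, original)]


# Weight pairs used in the ablation experiment.
WEIGHTS = {"FW-1": (0.1, 0.9), "FW-2": (0.3, 0.7), "FW-3": (0.5, 0.5)}

pseudo = [[200, 40]]     # hypothetical pseudo-color intensities (one channel)
original = [[100, 80]]   # hypothetical original intensities

fused = {name: fuse(pseudo, original, a, b) for name, (a, b) in WEIGHTS.items()}
for name, img in fused.items():
    print(name, img)
```

As α grows, the fused pixel moves from the original value toward the pseudo-color value, which is the trade-off the FW-1 to FW-3 settings probe.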

The comparison results of the ablation experiment on the image fusion module are shown in Figure 15. When applying the R-FM method to process Samples 1, 3, and 4, the results show partial missing of crack regions and significant fragmentation. Additionally, the method exhibits significant deviation between the detected crack regions in Sample 2 and the actual cracks. When using three sets of random fusion weights to process Samples 1 and 4, numerous fractures are observed in the cracks. For Samples 2 and 3, severe distortion of the cracks occurs, accompanied by a significant amount of noise. Comparatively, the CSA-BB algorithm proposed in this paper demonstrates significant improvements in crack detection accuracy and background noise suppression, effectively reconstructing the true conditions of cracks.

4.3.2. Ablation Study of the Brightness Piecewise Linear Enhancement Module

This section evaluates the effectiveness of the brightness piecewise linear enhancement module by comparing the performance of the proposed algorithm with different image enhancement methods. The experiment is conducted as follows:

(1) Removal of the brightness piecewise linear enhancement module proposed in this study: Remove-Enhancement Module (R-EM).

(2) Image enhancement using piecewise linear transformation: Piecewise Linear Transformation (PLT).

The comparison results of the ablation experiment on the crack enhancement module are shown in Figure 16. The R-EM method achieves good segmentation results for Sample 1, but for Samples 2, 3, and 4, it produces significant fragmentation in the detected cracks. When using the PLT method to process Sample 1, significant noise interference is present in the crack images. For Samples 2, 3, and 4, the results show partial loss of crack content, making it difficult to accurately extract crack information. It is evident that both methods demonstrate inferior crack-extraction performance compared to the algorithm proposed in this paper.

4.3.3. Efficiency Analysis of the Proposed Algorithm

To analyze the operational efficiency of the proposed algorithm, Table 4 lists the runtime of CSA-BB, Remove-Fusion and Enhancement Module (R-FEM), R-FM, and R-EM on each sample, together with the average runtime per sample (T_ave). The same data are plotted as a line graph in Figure 17.

From Table 4 and Figure 17, it can be seen that the polyline representing the proposed algorithm almost overlaps with that of R-EM, and the polyline for R-FEM nearly overlaps with that of R-FM. The two methods in each pair differ only in whether the brightness piecewise linear enhancement module is included, indicating that the time cost of this module is negligible. The T_ave of the proposed algorithm is only 0.67 s longer than that of R-FM. This indicates that the added modules have only a minor impact on efficiency despite the increased algorithmic complexity.

5. Conclusions

Due to factors such as lighting and acquisition conditions, structural cracks often exhibit complex, variable characteristics. As a result, existing crack-segmentation algorithms often fail to segment the crack regions accurately. To address this issue, this paper proposes a crack-segmentation algorithm based on bimodal image fusion and brightness piecewise linear enhancement. First, the algorithm suppresses the interference of uneven illumination by fusing pseudo-color images with the original images, producing output images with richer crack detail features. Second, the algorithm constructs a brightness piecewise linear function that selects adjustment parameters based on the brightness values of the image, achieving adaptive image enhancement. Finally, the algorithm employs the bottom-hat transform and OTSU's method to segment the crack regions. To monitor and assess crack development, this paper extracts multiple crack parameters and compares them with preset danger levels to determine the extent of crack development. Comparative experimental results demonstrate that the proposed algorithm achieves the best average segmentation performance among the compared methods (OTSU, Canny, LoG, and Prewitt). Two sets of ablation experiments confirm that each module contributes positively to the segmentation result. In future research, the integration of the proposed image segmentation algorithm with deep-learning models will be explored to enhance applicability and adaptability across diverse scenarios.

Author Contributions

Conceptualization, Y.L.; methodology, N.J.; software, N.J.; validation, L.R.; formal analysis, Z.L.; investigation, F.Z.; resources, H.Z.; data curation, L.R.; writing—original draft preparation, Z.L.; writing—review and editing, Z.D.; visualization, F.Z.; supervision, Z.D.; project administration, Y.L.; funding acquisition, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

National Natural Science Foundation of China (Grant Number: 62001263); Key research projects of Qingdao Science and Technology Plan (Grant Number: 22-3-3-hygg-30-hy); Natural Science Foundation of Shandong Province (Grant Number: ZR2021MF024).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Yong Li and Huaiwen Zhang were employed by the CNPC Engineering Technology R&D Company Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. El Hakea, A.H.; Fakhr, M.W. Recent computer vision applications for pavement distress and condition assessment. Autom. Constr. 2023, 146, 104664.
2. Chen, P.H.; Hsieh, J.W.; Hsieh, Y.K.; Chang, C.W.; Huang, D.Y. Cross-Scale Overlapping Patch-Based Attention Network for Road Crack Detection. IEEE Trans. Intell. Transp. Syst. 2025, 26, 7587–7599.
3. Zhuang, H.; Cheng, Y.; Zhou, M.; Yang, Z. Deep learning for surface crack detection in civil engineering: A comprehensive review. Measurement 2025, 248, 116908.
4. Tang, J.; Li, D.; Yang, J.; Chen, J.; Yuan, R. Leveraging large visual models for enhanced object detection: An improved SAM-YOLOv5 model. Knowl. Based Syst. 2025, 330, 114757.
5. Alkayem, N.F.; Mayya, A.; Shen, L.; Zhang, X.; Asteris, P.G.; Wang, Q.; Cao, M. Co-CrackSegment: A New Collaborative Deep Learning Framework for Pixel-Level Semantic Segmentation of Concrete Cracks. Mathematics 2024, 12, 3105.
6. Ju, X.; Zhao, X.; Qian, S. TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection. Mathematics 2022, 10, 2354.
7. Ma, N.; Fan, R.; Xie, L. UP-CrackNet: Unsupervised Pixel-Wise Road Crack Detection via Adversarial Image Restoration. IEEE Trans. Intell. Transp. Syst. 2024, 25, 13926–13936.
8. Arya, D.; Maeda, H.; Ghosh, S.K.; Toshniwal, D.; Sekimoto, Y. RDD2022: A multi-national image dataset for automatic road damage detection. Geosci. Data J. 2024, 11, 846–862.
9. Dou, X.-Y.; Han, L.-G.; Wang, E.-L.; Dong, X.-H.; Yang, Q.; Yan, G.-H. A fracture enhancement method based on the histogram equalization of eigenstructure-based coherence. Appl. Geophys. 2014, 11, 179–185.
10. Suharyanto; Hasibuan, Z.A.; Andono, P.N.; Pujiono, D.; Setiadi, R.I.M. Contrast Limited Adaptive Histogram Equalization for Underwater Image Matching Optimization use SURF. J. Phys. Conf. Ser. 2021, 1803, 012008.
11. Yang, Z.; Ni, C.; Li, L.; Luo, W.; Qin, Y. Three-Stage Pavement Crack Localization and Segmentation Algorithm Based on Digital Image Processing and Deep Learning Techniques. Sensors 2022, 22, 8459.
12. Lei, Q.; Zhong, J.; Wang, C. Joint Optimization of Crack Segmentation With an Adaptive Dynamic Threshold Module. IEEE Trans. Intell. Transp. Syst. 2024, 25, 6902–6916.
13. Wang, W.; Dai, J.; Chen, Z.; Huang, Z.; Li, Z.; Zhu, X.; Hu, X.; Lu, T.; Lu, L.; Li, H.; et al. InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 14408–14419.
14. Tao, H. Weakly-Supervised Pavement Surface Crack Segmentation Based on Dual Separation and Domain Generalization. IEEE Trans. Intell. Transp. Syst. 2024, 25, 19729–19743.
15. Lang, H.; Yuan, Y.; Chen, J.; Ding, S.; Lu, J.J.; Zhang, Y. Augmented Concrete Crack Segmentation: Learning Complete Representation to Defend Background Interference in Concrete Pavements. IEEE Trans. Instrum. Meas. 2024, 73, 2513413.
16. Shan, J.; Jiang, W.; Huang, Y.; Yuan, D.; Liu, Y. Unmanned Aerial Vehicle (UAV)-Based Pavement Image Stitching Without Occlusion, Crack Semantic Segmentation, and Quantification. IEEE Trans. Intell. Transp. Syst. 2024, 25, 17038–17053.
17. Liu, H.; Yang, J.; Miao, X.; Mertz, C.; Kong, H. CrackFormer Network for Pavement Crack Segmentation. IEEE Trans. Intell. Transp. Syst. 2023, 24, 9240–9252.
18. Zhou, W.; Huang, H.; Zhang, H.; Wang, C. Teaching Segment-Anything-Model Domain-Specific Knowledge for Road Crack Segmentation From On-Board Cameras. IEEE Trans. Intell. Transp. Syst. 2024, 25, 20588–20601.
19. Guo, J.M.; Markoni, H.; Lee, J.D. BARNet: Boundary Aware Refinement Network for Crack Detection. IEEE Trans. Intell. Transp. Syst. 2022, 23, 7343–7358.
20. Hu, Z.; Luo, J.; Hong, Z. Category relationship enhancement transformer for industrial defect segmentation. Knowl. Based Syst. 2025, 326, 114059.
21. Xu, J.; Zhang, C.; Zeng, W. A Conditional Diffusion-Based Crack Segmentation Model: ConditionCrack Segmentation. IEEE Trans. Intell. Transp. Syst. 2025, 26, 17041–17054.
22. Chen, C.; Seo, H.; Jun, C.; Zhao, Y. A potential crack region method to detect crack using image processing of multiple thresholding. Signal Image Video Process. 2022, 16, 1673–1681.
23. Xue, D.; Yuan, W. Dynamic Partition Gaussian Crack Detection Algorithm Based on Projection Curve Distribution. Sensors 2020, 20, 3973.
24. Li, Z.; Yin, C.; Zhang, X. Crack Segmentation Extraction and Parameter Calculation of Asphalt Pavement Based on Image Processing. Sensors 2023, 23, 9161.
25. Tang, Y.; Huang, Z.; Chen, Z.; Chen, M.; Zhou, H.; Zhang, H.; Sun, J. Novel visual crack width measurement based on backbone double-scale features for improved detection automation. Eng. Struct. 2023, 274, 115158.
26. Peng, X.; Zhong, X.; Zhao, C.; Chen, A.; Zhang, T. A UAV-based machine vision method for bridge crack recognition and width quantification through hybrid feature learning. Constr. Build. Mater. 2021, 299, 123896.
27. Yang, F.; Zhang, L.; Yu, S.; Prokhorov, D.; Mei, X.; Ling, H. Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection. IEEE Trans. Intell. Transp. Syst. 2020, 21, 1525–1535.
28. Zou, Q.; Zhang, Z.; Li, Q.; Qi, X.; Wang, Q.; Wang, S. DeepCrack: Learning Hierarchical Convolutional Features for Crack Detection. IEEE Trans. Image Process. 2019, 28, 1498–1512.
29. Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic Road Crack Detection Using Random Structured Forests. IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445.
30. Wen, L.; Ye, Y.; Zuo, L. GAF-Net: A new automated segmentation method based on multiscale feature fusion and feedback module. Pattern Recognit. Lett. 2025, 187, 86–92.
31. Zhang, H.; Chen, N.; Li, M.; Mao, S. The Crack Diffusion Model: An Innovative Diffusion-Based Method for Pavement Crack Detection. Remote Sens. 2024, 16, 986.
32. Yang, L.; Huang, H.; Kong, S.; Liu, Y.; Yu, H. PAF-Net: A Progressive and Adaptive Fusion Network for Pavement Crack Segmentation. IEEE Trans. Intell. Transp. Syst. 2023, 24, 12686–12700.
33. Han, C.; Ma, T.; Huyan, J.; Huang, X.; Zhang, Y. CrackW-Net: A Novel Pavement Crack Image Segmentation Convolutional Neural Network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22135–22144.
34. Zhou, H.; Shu, D.; Wu, C.; Wang, Q.; Wang, Q. Image Illumination Adaptive Correction Algorithm Based on a Combined Model of Bottom-Hat and Improved Gamma Transformation. Arab. J. Sci. Eng. 2022, 48, 3947–3960.
35. The MathWorks Inc. MATLAB, version 9.13.0 (R2022b); The MathWorks Inc.: Natick, MA, USA, 2022.
36. Zou, Q.; Cao, Y.; Li, Q.; Mao, Q.; Wang, S. CrackTree: Automatic crack detection from pavement images. Pattern Recognit. Lett. 2012, 33, 227–238.
37. Cao, Z.; Cao, D.; Chang, H.; Fu, Y.; Shen, X.; Huang, W.; Wang, H.; Bao, W.; Feng, C.; Tong, Z.; et al. Evaluation of Highway Pavement Structural Conditions Based on Measured Crack Morphology by 3D GPR and Finite Element Modeling. Materials 2025, 18, 3336.
38. Xie, M.; Wang, Z.; Yin, L.e. Research on Concrete Crack Damage Assessment Method Based on Pseudo-Label Semi-Supervised Learning. Buildings 2025, 15, 2726.

Figure 1. Crack-segmentation algorithm flowchart.

Figure 3. Jet pseudo color mapping results. (a) Original images. (b) Pseudo-color images.

Figure 4. Image Fusion Results.

Figure 5. Polyline graph of the brightness piecewise linear function.

Figure 6. Results of brightness piecewise linear enhancement. (a) Fused images. (b) Grayscale images. (c) Enhanced images.

Figure 7. Results of the bottom-hat transformation. (a) Enhanced images. (b) Bottom-hat transformed images.

Figure 8. Results of image binarization. (a) Bottom-hat-transformed images. (b) OTSU-binarized images.

Figure 9. Results of crack segmentation. (a) OTSU-binarized images. (b) Images after the closing operation.

Figure 10. Results of crack region labeling. (a) Segmented images. (b) Labeled images.

Figure 11. Crack skeleton image.

Figure 12. Comparison of crack segmentation results. (a) Original images. (b) Manually annotated images. (c) OTSU. (d) Canny. (e) LoG. (f) Prewitt. (g) Proposed.

Figure 13. Crack segmentation results across different backgrounds and illumination conditions. (a) Original images. (b) Proposed.

Figure 14. Crack segmentation results under noise conditions. (a) Original images. (b) Images with salt and pepper noise. (c) Segmentation results.

Figure 15. Comparison results of the image fusion module ablation experiment. (a) Original images. (b) R-FM. (c) FW-1. (d) FW-2. (e) FW-3. (f) Proposed.

Figure 16. Comparison results of the crack enhancement module ablation experiment. (a) Original images. (b) R-EM images. (c) PLT images. (d) Proposed images.

Figure 17. Runtime line graph.

Table 1. Evaluation of different crack-segmentation models on public datasets.

Algorithm	Year	Dataset	IoU (mIoU)	Precision	Recall	F1	Advantages	Disadvantages
GAF-Net [30]	2025	CFD [29]	0.536	0.740	0.944	0.811	Strong multi-scale feature extraction; high recall on complex crack patterns.	High computational cost; limited performance under low-contrast conditions.
GAF-Net [30]	2025	Crack500 [27]	0.652	0.865	0.919	0.889
GAF-Net [30]	2025	DeepCrack [28]	0.816	0.892	0.979	0.931
CrackDiff [31]	2024	Crack500 [27]	0.841	0.813	0.841	0.818	Fine-grained crack reconstruction via diffusion modeling; high IoU.	Slow inference speed; high training complexity.
CrackDiff [31]	2024	DeepCrack [28]	0.862	0.919	0.795	0.841
PAFNet [32]	2023	Crack500 [27]	0.770	0.757	0.729	0.869	Lightweight architecture; stable cross-dataset performance.	Limited robustness to illumination variation; boundary refinement insufficient.
PAFNet [32]	2023	DeepCrack [28]	0.844	0.865	0.855	0.929
PAFNet [32]	2023	CFD [29]	0.549	0.661	0.600	0.816
CrackW-Net [33]	2022	Crack500 [27]	0.754	0.682	0.710	0.706	Simple architecture; low parameter complexity.	Weak adaptability to heterogeneous backgrounds.
CrackW-Net [33]	2022	DeepCrack [28]	0.824	0.777	0.823	0.800
CrackW-Net [33]	2022	CFD [29]	0.668	0.494	0.554	0.522

Table 2. J and D scores for different segmentation algorithms.

Algorithm	Sample 1 J (%)	Sample 1 D (%)	Sample 2 J (%)	Sample 2 D (%)	Sample 3 J (%)	Sample 3 D (%)	Sample 4 J (%)	Sample 4 D (%)	Avg J (%)	Avg D (%)
OTSU	7.10	13.25	3.66	7.05	4.81	9.18	3.25	6.29	4.70	8.94
Canny	9.14	16.75	24.95	39.93	23.73	38.36	30.87	47.17	22.17	35.56
LoG	19.62	32.81	7.74	14.37	25.23	40.30	17.21	29.37	17.45	29.21
Prewitt	7.33	13.66	0.57	1.14	22.86	37.21	19.57	32.74	12.58	21.19
Proposed	31.87	48.34	42.49	59.64	16.24	27.94	28.62	44.50	29.81	45.11

Table 3. Statistical results of crack grades.

Samples	Crack Area Ratio (%)	Crack Length Ratio (%)	Crack Width Ratio (%)	Risk Level
Sample 1	1.7	102	4.6 × 10⁻⁶	Hazardous Crack
Sample 2	1.2	19	9.7 × 10⁻⁵	Crack to be Monitored
Sample 3	0.8	21	5.2 × 10⁻⁵	Safe Crack
Sample 4	0.8	80	1.6 × 10⁻⁶	Crack to be Monitored

Table 4. Runtime of different methods.

Methods	Sample 1 T (s)	Sample 2 T (s)	Sample 3 T (s)	Sample 4 T (s)	T_ave (s)
R-FEM	2.96	2.6	2.58	6.25	3.60
R-FM	3.07	2.74	2.73	6.37	3.73
R-EM	3.30	3.42	3.24	7.21	4.32
Proposed	3.38	3.50	3.38	7.42	4.40
