#### 2.1. Adaptive Threshold Segmentation

The adaptive threshold segmentation algorithm differs from traditional counterparts such as the OTSU algorithm [15], which apply a single threshold to the entire image. Instead, the proposed algorithm applies a smoothing operator to the image and then computes the difference between the original image and the smoothed one. A fixed threshold is then applied to this difference image, which measures how much brighter the target object is than its local background, to achieve binarization. Suppose that $f(x, y)$ represents the original hub X-ray image to be segmented, $g(x, y)$ represents the smoothed image, and $B(x, y)$ represents the final segmentation result. Let $T$ be the specified fixed threshold. Then, the adaptive threshold segmentation algorithm can be expressed as:

$$B(x, y) = \begin{cases} 1, & f(x, y) - g(x, y) > T \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

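Equation (1) has an equivalent expression as follows:

$$B(x, y) = \begin{cases} 1, & f(x, y) > g(x, y) + T \\ 0, & \text{otherwise} \end{cases} \tag{2}$$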

For the original image $f(x, y)$, the actual threshold at each pixel $(x, y)$ is the sum of the background gray value $g$, obtained by applying the smoothing operator at that point, and the specified threshold $T$; it therefore varies with the local background gray value. The gray value $g$ at each pixel of the smoothed image $g(x, y)$ is jointly determined by the gray levels of the corresponding pixel of the original image $f(x, y)$ and of its surrounding pixels. Assuming that the smoothing operator is $h(x, y)$, $g(x, y)$ is obtained from the following formula:

$$g(x, y) = f(x, y) * h(x, y) \tag{3}$$

The symbol “*” denotes the convolution operation of digital signal processing. The smoothing operator $h(x, y)$ takes the form of a matrix, usually a mean smoothing operator or a Gaussian smoothing operator. Taking the mean smoothing operator as an example, the expression of $h(x, y)$ for size $r = 3$ is

$$h(x, y) = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{4}$$

Applying the operator in Formula (4) to the original image $f(x, y)$ according to Formula (3) gives the image $g(x, y)$, in which the gray value $g$ at each point is the average gray level of the 9 pixels in the 3 × 3 square area centered on that point.
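The 3 × 3 averaging described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `mean_smooth` is hypothetical, and the border rule (replicating edge pixels) is an assumption, since the text does not say how borders are handled.

```python
import numpy as np

def mean_smooth(f, r=3):
    """Mean-smooth image f with an r x r box operator (r odd), as in Formulas (3)-(4).

    Assumption: borders are handled by replicating the edge pixels.
    """
    f = np.asarray(f, dtype=np.float64)
    rows, cols = f.shape
    padded = np.pad(f, r // 2, mode="edge")
    g = np.zeros_like(f)
    # Add up the r*r shifted copies of the image, then divide by r*r.
    # The box operator is symmetric, so this equals the convolution f * h.
    for dy in range(r):
        for dx in range(r):
            g += padded[dy:dy + rows, dx:dx + cols]
    return g / (r * r)

# Made-up 5 x 5 patch: flat background of 10 with one bright pixel of 100.
f = np.full((5, 5), 10.0)
f[2, 2] = 100.0

g = mean_smooth(f, r=3)
print(g[2, 2])  # mean of eight 10s and one 100 -> 20.0
print(g[0, 0])  # neighbourhood contains only 10s -> 10.0
```

The double loop keeps the sketch free of extra dependencies; a library convolution routine would normally be preferred for large windows.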

Figure 1a shows the gray values of part of a hub X-ray image, and the image $g(x, y)$ obtained through 3 × 3 mean smoothing is shown in Figure 1b. Assuming that the fixed threshold $T$ is set to 5, the actual threshold at each point in Figure 1a, when binarized according to Formula (2), is the gray value of the corresponding point in Figure 1b plus 5, as shown in Figure 1c. The binarization result of Figure 1a is therefore equivalent to selecting target pixels whose gray level exceeds the average of their 3 × 3 local area by more than 5, as shown in Figure 1d.
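The sequence from Figure 1a to Figure 1d can be mimicked on a small array. The sketch below uses made-up gray values rather than the data of Figure 1a, hypothetical function names, and assumes edge replication at the borders. It also checks that the difference form and the per-pixel threshold form give the same mask.

```python
import numpy as np

def mean_smooth(f, r):
    """r x r mean filter (r odd); edge replication at borders is an assumption."""
    f = np.asarray(f, dtype=np.float64)
    rows, cols = f.shape
    padded = np.pad(f, r // 2, mode="edge")
    g = np.zeros_like(f)
    for dy in range(r):
        for dx in range(r):
            g += padded[dy:dy + rows, dx:dx + cols]
    return g / (r * r)

def adaptive_threshold(f, r, T):
    """Binarize f with the per-pixel threshold g(x, y) + T."""
    f = np.asarray(f, dtype=np.float64)
    g = mean_smooth(f, r)
    return (f > g + T).astype(np.uint8)

# Illustrative 5 x 5 patch (made-up values, not the data of Figure 1a).
f = np.full((5, 5), 50.0)
f[2, 2] = 68.0

g = mean_smooth(f, 3)             # counterpart of Figure 1b
local_threshold = g + 5           # counterpart of Figure 1c
B = adaptive_threshold(f, 3, 5)   # counterpart of Figure 1d

print(local_threshold[2, 2])      # (8 * 50 + 68) / 9 + 5 = 57.0
print(B)                          # only the centre pixel is set

# The difference form f - g > T selects the same pixels.
same = np.array_equal(B, ((f - g) > 5).astype(np.uint8))
print(same)
```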

Figure 2a shows part of a hub X-ray image. It contains an obvious shrinkage cavity, which occupies only a small proportion of the image, and the background gray level varies widely across the image. Segmenting the image with the OTSU algorithm yields a threshold of 162, so each pixel with a gray value greater than 162 is taken as a target pixel and the remaining pixels are taken as background. The segmentation result is shown in Figure 2b. As can be seen, the target object, that is, the shrinkage cavity, has not been accurately extracted. The image obtained by smoothing Figure 2a with a 25 × 25 mean filter is shown in Figure 2c, in which the gray value of each pixel is the mean of the 25 × 25 square centered on that pixel in Figure 2a. Applying Equation (1) or (2) with the threshold $T$ set to 2 gives the segmentation shown in Figure 2d. The adaptive threshold segmentation clearly produces a much better result than the OTSU algorithm, because the defect in the image has been segmented accurately.

The reason is that the adaptive threshold segmentation algorithm exploits the fact that a target object is brighter than its local background. In regions without a target object, the gray level changes gradually, so the original gray value differs little from the smoothed one (by no more than $T$) and the pixel is regarded as background. In contrast, in a region containing a target object the gray value changes sharply, so the original gray value differs significantly from the smoothed one (by more than $T$) and the pixel is regarded as part of an object. Because it focuses on local gray variation, the adaptive threshold segmentation algorithm is more robust than global threshold segmentation algorithms when the target objects are small and the background gray level is complex. Its drawback is that noise points and the edges of light or dark regions, where the gray level also changes sharply, are segmented out as well. As shown in Figure 2d, some portions unrelated to defects are segmented out.
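The contrast between a single global threshold and the local one can be reproduced on a synthetic image. The sketch below is only an illustration under stated assumptions: the image is a made-up brightness ramp with a small bright spot, the global threshold is simply the image mean (a stand-in, not the OTSU algorithm), and the window size 15 and $T = 5$ are arbitrary choices for this toy image.

```python
import numpy as np

def mean_smooth(f, r):
    """r x r mean filter (r odd); edge replication at borders is an assumption."""
    f = np.asarray(f, dtype=np.float64)
    rows, cols = f.shape
    padded = np.pad(f, r // 2, mode="edge")
    g = np.zeros_like(f)
    for dy in range(r):
        for dx in range(r):
            g += padded[dy:dy + rows, dx:dx + cols]
    return g / (r * r)

# Synthetic image: background brightens from left to right,
# with a 5 x 5 spot that is 15 gray levels above its surroundings.
rows, cols = 60, 60
f = np.tile(50.0 + 2.0 * np.arange(cols), (rows, 1))
spot = np.zeros((rows, cols), dtype=bool)
spot[28:33, 10:15] = True
f[spot] += 15.0

# One threshold for the whole image (here the image mean, not OTSU).
global_mask = f > f.mean()

# Per-pixel threshold g(x, y) + T with a 15 x 15 window and T = 5.
adaptive_mask = f > mean_smooth(f, 15) + 5.0

print("spot pixels found globally:  ", int(global_mask[spot].sum()))
print("spot pixels found adaptively:", int(adaptive_mask[spot].sum()))
print("adaptive mask equals spot:   ", np.array_equal(adaptive_mask, spot))
```

On this toy image the global threshold selects the bright right half of the ramp and misses the spot, while the local threshold isolates the spot, because the ramp itself differs little from its own local mean.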

The above analysis shows that the smoothing filter size and the threshold value are the two determining factors of the adaptive threshold segmentation algorithm. Suitable smoothing operators are mean, Gaussian, or median filters, whose size defines the size of the local area and therefore the size of the objects that can be segmented. A filter that is too small cannot give an accurate estimate of the local background brightness at the center of the object, resulting in segmentation failure. The larger the filter, the better the filtered result represents the local background and the more likely the object is to be segmented accurately. However, too large a filter increases the computational load, and adjacent objects may also distort the filtering result. Experience suggests that when the filter is about the size of the object to be recognized, both an accurate estimate of the background gray level around the defect and an accurate segmentation of the defect are obtained.

The value of the threshold $T$ varies with the object to be segmented. A larger threshold suppresses noise better but may lose the edge pixels of the target object, resulting in incomplete segmentation. A smaller threshold ensures that the target object is segmented completely, but noise and light and dark edges may interfere. For Figure 2a, the segmentation results for a smaller and a larger filter size with $T$ kept at 2 are shown in Figure 3a,b, respectively, while the results for a smaller and a larger $T$ with the filter size unchanged are shown in Figure 3c,d, respectively.

Figure 3a shows the segmentation result when the smoothing filter is set to the smaller size of 9 × 9. Too small a filter window makes the local background estimate inaccurate, and compared with Figure 2d, the defect is no longer a single region but consists of discrete pieces.

Figure 3b shows the segmentation result when the filter size is increased to 51 × 51. Although the defect is segmented as a whole, as in Figure 2d, too large a filter window lets the hub geometry interfere with the defect area, so that the two merge and the segmentation fails.

Figure 3c shows the segmentation result when the threshold $T$ is set to 0. The noise interference and the effect of the light and dark edges are significantly stronger than in Figure 2d.

Figure 3d shows the segmentation result when the threshold $T$ is set to 20. Both noise and edge interference disappear, but the defect is only partially segmented.
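The trends seen in Figure 3 can be imitated with a small parameter sweep. The following sketch runs on a made-up image, not the hub image: a flat background with a faint checkerboard texture standing in for noise, and a graded "defect" with a dim rim and a brighter core. Function names and all numeric values are assumptions chosen for illustration. It does not reproduce the oversized-window failure of Figure 3b, because the toy image contains no hub geometry.

```python
import numpy as np

def mean_smooth(f, r):
    """r x r mean filter (r odd); edge replication at borders is an assumption."""
    f = np.asarray(f, dtype=np.float64)
    rows, cols = f.shape
    padded = np.pad(f, r // 2, mode="edge")
    g = np.zeros_like(f)
    for dy in range(r):
        for dx in range(r):
            g += padded[dy:dy + rows, dx:dx + cols]
    return g / (r * r)

def adaptive_threshold(f, r, T):
    """Binarize f with the per-pixel threshold g(x, y) + T."""
    return (f > mean_smooth(f, r) + T).astype(np.uint8)

# Made-up test image: background 100 with a checkerboard texture of amplitude 1.
size = 41
yy, xx = np.indices((size, size))
f = 100.0 + ((xx + yy) % 2 == 0)
defect = np.zeros((size, size), dtype=bool)
defect[15:26, 15:26] = True
f[defect] += 10.0          # dim rim of the defect (11 x 11 region)
f[18:23, 18:23] += 20.0    # brighter 5 x 5 core

# (window size, threshold): reference, small window, low T, high T.
for r, T in [(25, 2), (3, 2), (25, 0), (25, 20)]:
    B = adaptive_threshold(f, r, T)
    found = int(B[defect].sum())
    extra = int(B.sum()) - found
    print(f"r={r:2d} T={T:2d}: {found:3d} of {int(defect.sum())} defect pixels, "
          f"{extra} pixels outside the defect")
```

With these values the small window leaves the defect interior unmarked, $T = 0$ also marks texture pixels outside the defect, and $T = 20$ keeps only the bright core, which mirrors the qualitative behavior described for Figure 3a, c and d.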

It is now clear that a smoothing window or threshold that is too large or too small degrades the final segmentation result. Both parameters, the smoothing window size and the threshold value, must be chosen carefully to obtain a good segmentation. In practical applications, however, this is hard to achieve, especially in defect detection on wheel X-ray images.

#### 2.2. Morphological Reconstruction

Mathematical morphology originated from the geometric study of the permeability of porous media by French scholars in the 1960s. It was initially confined to the geometrical analysis of binary images, but gradually expanded to grayscale and color images. Mathematically, it is based on set theory, integral geometry, and lattice algebra. It has developed into a powerful image analysis technology that is widely used in industrial nondestructive testing [16,17]. Mathematical morphology probes an image with a small set called a structuring element, and its basic operations are dilation and erosion. Other transformations are built from combinations of these two.

Morphological reconstruction involves two images, a mask image and a marker image, where the marker image is smaller than or equal to the mask image at every point. Reconstruction is an iterative process in which the marker image is used to reconstruct the mask image. The process begins by dilating the marker image with a 3 × 3 square structuring element of all ones. The dilation result is then compared with the mask image point by point, and the lower of the two values at each point is taken as the intermediate result. This intermediate result replaces the marker image in the next round of dilation and point-by-point comparison with the mask image, where the lower values are again kept. The iteration continues until the intermediate result no longer changes, and this stable result is taken as the final reconstruction result. Let $m(x, y)$ represent the marker image, $f(x, y)$ represent the mask image, and $R$ represent the reconstruction process. Then, the morphological reconstruction operation can be expressed as:

$$f_{R}(x, y) = R_{f(x, y)}\big(m(x, y)\big)$$

where $f_{R}(x, y)$ is the result of the reconstruction operation. The reconstruction operation attempts to restore the mask image $f(x, y)$ from the marker image $m(x, y)$. Light regions of $f(x, y)$ that are completely absent from $m(x, y)$ are not restored in $f_{R}(x, y)$, whereas light regions that are only partially present in $m(x, y)$ are fully recovered in $f_{R}(x, y)$. The result of the reconstruction operation, with the image in Figure 3d as the marker image $m(x, y)$ and the image in Figure 2d as the mask image $f(x, y)$, is shown in Figure 4. The defect region is completely recovered, and the noise and edge interference is removed as well, yielding an accurate defect segmentation.
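The iterative procedure described above (dilate the marker with a 3 × 3 all-ones structuring element, take the point-wise minimum with the mask, repeat until nothing changes) can be written as a short sketch. The function names `dilate3x3` and `reconstruct` are hypothetical, and the tiny binary images are made up for illustration.

```python
import numpy as np

def dilate3x3(img):
    """Dilation with a 3 x 3 all-ones structuring element (local maximum)."""
    rows, cols = img.shape
    # Pad with the image minimum so that the border never raises a value.
    padded = np.pad(img, 1, mode="constant", constant_values=img.min())
    out = img.copy()
    for dy in range(3):
        for dx in range(3):
            out = np.maximum(out, padded[dy:dy + rows, dx:dx + cols])
    return out

def reconstruct(marker, mask):
    """Reconstruct mask from marker by repeated dilation and point-wise minimum."""
    current = np.minimum(marker, mask)      # enforce marker <= mask
    while True:
        nxt = np.minimum(dilate3x3(current), mask)
        if np.array_equal(nxt, current):    # stable: reconstruction finished
            return current
        current = nxt

# Mask: a 3 x 3 "defect" plus a separate two-pixel "noise" region.
mask = np.zeros((7, 7), dtype=np.uint8)
mask[1:4, 1:4] = 1
mask[3:5, 6] = 1

# Marker: a single pixel inside the defect; nothing inside the noise region.
marker = np.zeros_like(mask)
marker[2, 2] = 1

result = reconstruct(marker, mask)
print(result)
```

The defect, which is partially present in the marker, is recovered in full, while the noise region, which has no marker pixel, does not reappear in the result.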

If the high-threshold segmentation image is used as the marker to reconstruct the low-threshold segmentation image produced by the adaptive threshold segmentation algorithm, the reconstruction result completely recovers the low-threshold regions marked by the high-threshold segmentation and completely removes the low-threshold regions that are not marked by it. In other words, the morphological reconstruction operation combines the advantages of high-threshold segmentation (low interference) and low-threshold segmentation (complete defects). It is therefore a useful supplement to the adaptive threshold segmentation algorithm and reduces the difficulty of setting the segmentation parameters.
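The combination of a high-threshold marker and a low-threshold mask can be sketched end to end. This is an illustration under stated assumptions rather than the authors' pipeline: the image is synthetic (a graded defect plus four isolated bright pixels standing in for noise), the function names are hypothetical, borders are replicated, and the values 25, 2 and 20 merely echo the window size and thresholds quoted in the text.

```python
import numpy as np

def mean_smooth(f, r):
    """r x r mean filter (r odd); edge replication at borders is an assumption."""
    rows, cols = f.shape
    padded = np.pad(f, r // 2, mode="edge")
    g = np.zeros_like(f)
    for dy in range(r):
        for dx in range(r):
            g += padded[dy:dy + rows, dx:dx + cols]
    return g / (r * r)

def adaptive_threshold(f, r, T):
    """Binarize f with the per-pixel threshold g(x, y) + T."""
    return (f > mean_smooth(f, r) + T).astype(np.uint8)

def dilate3x3(img):
    """Dilation with a 3 x 3 all-ones structuring element."""
    rows, cols = img.shape
    padded = np.pad(img, 1, mode="constant", constant_values=img.min())
    out = img.copy()
    for dy in range(3):
        for dx in range(3):
            out = np.maximum(out, padded[dy:dy + rows, dx:dx + cols])
    return out

def reconstruct(marker, mask):
    """Repeat dilation and point-wise minimum with the mask until stable."""
    current = np.minimum(marker, mask)
    while True:
        nxt = np.minimum(dilate3x3(current), mask)
        if np.array_equal(nxt, current):
            return current
        current = nxt

# Synthetic image: background 100, defect with dim rim (+10) and bright core (+30).
f = np.full((41, 41), 100.0)
defect = np.zeros(f.shape, dtype=bool)
defect[15:26, 15:26] = True
f[defect] += 10.0
f[18:23, 18:23] += 20.0
for y, x in [(3, 3), (5, 30), (35, 8), (37, 37)]:
    f[y, x] += 6.0                        # isolated noise pixels

low = adaptive_threshold(f, 25, 2)        # complete defect, but with noise
high = adaptive_threshold(f, 25, 20)      # no noise, but only the defect core
result = reconstruct(high, low)           # marker = high, mask = low

print("low-threshold pixels: ", int(low.sum()))
print("high-threshold pixels:", int(high.sum()))
print("reconstructed pixels: ", int(result.sum()))
print("equals defect region: ", np.array_equal(result.astype(bool), defect))
```

Only regions of the low-threshold mask that touch a high-threshold pixel survive, so the result depends less on finding a single ideal value of $T$. Noise that is connected to the defect in the low-threshold mask would be recovered along with it, which this toy image does not exercise.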