Binarization Algorithm Based on Side Window Multidimensional Convolution Classification

Uneven illumination and space radiation can cause inhomogeneous grayscale distribution, low contrast, and noise in images from in-orbit cameras. A binarization algorithm based on morphological classification is proposed to solve the problem of inaccurate image binarization caused by space image degradation. Traditional local binarization algorithms generally calculate thresholds from statistical information in the gray dimension within the local window and often ignore the morphological distribution information, which leads to poor results on degraded images. The algorithm presented in this paper demonstrates the morphological clustering property of the side window filtering (SWF) kernel. First, the eight-dimensional SWF convolution kernel is used to describe the morphological properties of the pixels. Then, the positive and negative types of each pixel in the local window are identified, and the local threshold is calculated according to the difference between the two types. Finally, the positive pixels are used to filter the threshold of each pixel, so that the binarization threshold is morphologically smooth and continuous. A self-built dataset is used to evaluate the algorithm quantitatively, and the results are compared with three existing classical techniques using the quantitative measures FM, PSNR, and DRD. The experimental results show that the algorithm in this paper yields good binarization results for different degraded images, outperforms the comparison algorithms in terms of accuracy and robustness, and is insensitive to noise.


Introduction
The visual-perception camera is a fairly common piece of aerospace pose-measurement equipment, and is characterized by high accuracy, noncontact use, and low power consumption. It is widely used in tasks such as spacecraft rendezvous, care and maintenance of on-orbit load of space manipulators, and cleanup of space debris or abandoned satellites [1][2][3][4]. The visual-measurement camera uses the visual-positioning marker as the observed target, the relationship of the target feature in the image as a reference, and image preprocessing, target recognition, and attitude calculation to determine the position and attitude of the target. The visual-positioning markers are generally designed to be easily recognizable, high-contrast pattern features [5]. The general visual measurement framework is presented in Figure 1.
After image preprocessing, the binarized image can strengthen the target features and is a crucial part of identification and positioning. The binarization algorithm of the image can be expressed as:

$$I'(x, y) = \begin{cases} 1, & I(x, y) \geq T \\ 0, & I(x, y) < T \end{cases} \tag{1}$$

The main factor affecting the binarization is the calculation of threshold T. Inappropriate values lead to incorrect segmentation of the foreground and background of the image, resulting in incomplete or deviated features, directly affecting the accuracy of the pose solution. In-orbit image binarization faces the following three challenges as a result of the space environment:

Challenge 1:
The light shifts rapidly when working on-orbit, and there is no atmosphere or other media to reflect sunlight, resulting in a considerable difference in the brightness of the direct sunlight and shadow regions. Consequently, the target grayscale seen in the image will have an uneven grayscale distribution. The target identification and localization of such uneven lighting and contrast images is a difficult task in in-orbit image processing (Figure 2b). Solving this problem can enable the camera to work continuously without the influence of ambient lighting, and can also reduce the constraints of in-orbit mission scheduling.
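The thresholding rule in Equation (1) can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `binarize`, the nested-list image representation, and the sample values are assumptions made for illustration, and the threshold is passed in rather than computed.

```python
def binarize(image, threshold):
    """Apply Equation (1): output 1 where I(x, y) >= T, otherwise 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Hypothetical 2 x 3 grayscale patch (8-bit values).
image = [
    [12, 200, 37],
    [90, 128, 127],
]
binary = binarize(image, threshold=128)
```

Note that a pixel equal to the threshold is assigned to the foreground, following the "greater than or equal" branch of Equation (1).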

Challenge 2:
The space camera must function in orbit for an extended time period and is exposed to electromagnetic radiation and multiple energy particles [6]. Compared to the ground environment, this causes rapid deterioration of the device, an increase in the detector background noise, and a decrease in the imaging dynamic range. Figure 3 shows a considerable decrease in the image contrast of the Solar Dynamics Observatory (SDO) over more than ten years in orbit [7].

Challenge 3:
The impact of radiation particles on the detector [8], as well as the intense temperature variation, can cause random noise in images. Figure 4 shows an image taken by a star-sensitive instrument operating in orbit, containing random noise.
Extensive research has been conducted to increase the accuracy and robustness of the binarization algorithm. There are global and local image-binarization methods. The global method takes the full-image pixel statistical information as the reference and uses a single threshold to segment the image into the foreground and background. The most representative algorithm is the Otsu algorithm [9], which traverses each gray level of the image grayscale histogram and selects the gray level that creates the largest variance between the foreground and background classes as the segmentation threshold. Other representative global-threshold binarization methods include the Kapur method [10] and Kittler method [11]. The segmentation results from these methods are good when the image grayscale distribution shows obvious bimodal peaks. However, when the scene lighting is uneven, a lot of information will be lost. The local binarization method sets a threshold based on the grayscale relationship between each pixel and its neighboring pixels to binarize the image pixel by pixel, with representative algorithms being the Sauvola method [12], Niblack method [13], etc. Jia et al. incorporated the structural symmetric pixels (SSPs) to calculate the local threshold in the neighborhood and the vote result of multiple thresholds [14]. Vo et al. presented a Gaussian Mixture Markov Random Field (GMMRF) model that is effective for the binarization of images with complex backgrounds [15].
These algorithms and their improved versions can retain local feature information, achieving better results in the industry-recognized DIBCO text-binarization recognition competition [16]. However, the current local binarization algorithms depend heavily on hyperparameters and generally consider only the distribution in the grayscale dimension, such as grayscale mean, variance, and entropy, without considering the morphological distribution. Deep-learning-based binarization techniques have advanced significantly in recent years. Zhao et al. formulated binarization as an image-to-image generation task and introduced conditional generative adversarial networks (cGANs) to solve the core problem of multiscale information combination in the binarization task [17]. Westphal et al. proposed a recurrent neural network-based algorithm using Grid Long Short-Term Memory cells for image binarization, together with a pseudo F-Measure-based weighted loss function [18]. In particular, the algorithm that won first place in DIBCO2017 selected the U-Net neural-network framework and achieved a better segmentation effect using a data-expansion strategy [19]. Although the neural-network-based binarization methods achieved excellent results in the competition, serious limitations such as computing resources and dataset coverage remain unsolved in the space-application environment. There are still no neural-network-based binarization methods applied to in-orbit tasks.
To solve the binarization problem of in-orbit degraded images, a binarization algorithm based on side window filtering (SWF) multidimensional convolutional classification is proposed in this paper. SWF was proposed in 2019 by Yin et al. [20] to perform edge-preserving and denoising filtering on images. Yin et al. further showed that the SWF principle can be extended to other computer vision problems that involve a local operation window and a linear combination of the neighbors in this window, such as colorization by optimization. Chen et al. employed SWF for image-dehazing optimization, ensuring that texture and edge information was preserved [21]. Lu et al. suggested an improved SWF algorithm for edge-preserving denoising filtering of star maps collected by in-orbit star-sensitive sensors to improve the star point localization accuracy [22]. These studies revealed that SWF has both morphological and grayscale statistical properties. It was therefore used here as an operator for morphological clustering of local pixels, allowing the proposed binarization algorithm to consider both local grayscale distribution information and morphological information.
The contributions of this paper are:
1. The local binarization problem was transformed into a clustering problem. Additionally, images binarized by the SWF framework-based method were demonstrated to have higher local information than traditional methods.
2. An SWF-based binarization algorithm was designed for space images with uneven illumination, low contrast, and noise. The results showed the effectiveness of the method for degraded images.
3. A ground-test environment was designed using real cooperative targets and a test set was generated by changing illumination, shadows, and noise. The test set was then used to quantitatively evaluate the effect of binarization. The test set is openly available for further algorithm research.
The remainder of this paper is organized as follows: The motivation of the proposed work is discussed in Section 2. The implementation of the proposed method is described in Section 3. The experimental results are presented in Section 4. Finally, Section 5 includes the conclusions.

Motivation of the Proposed Method
The previous analysis of in-orbit image characteristics revealed the most important impact to be contrast reduction. Figure 5 depicts the histograms of a high-contrast image and a low-contrast image in the same scene. The difference between the foreground and background of the high-contrast image is large, and the binarization threshold is more tolerant of errors. Conversely, the difference is less than 20 in the low-contrast image, and a small threshold fluctuation can lead to segmentation errors. In this section, the improvement of the binarization accuracy by adding morphological-dimensional information is analyzed.
In a local window containing foreground and background, it is assumed that the binarization algorithm has the following properties:
1. If the local window contains both foreground and background information, then the current pixel must belong to one of the two categories, and conversely, the pixel that differs significantly from the current pixel belongs to the other category.
2. Except for single-point noise, each pixel in the local window, including foreground and background, should be locally continuous and smooth, and should have a threshold close to that of pixels of the same category.
According to the above properties, the local binarization problem can be transformed into a clustering problem based on the current pixel. According to property (1), the current pixel serves as the benchmark for clustering in the local window: the class consistent with the current pixel and the class that differs most from it are determined. The continuity of pixels of the same class is then taken into account according to property (2).
A local window containing both foreground and background can also be considered to contain a large grayscale variation. To facilitate the analysis, a typical step edge was used, and the 2D grayscale distribution is shown in Figure 6.
Figure 6. Step edge grayscale distribution map; pixels "a" and "b" are on the edge.
Points "a" and "b" in Figure 6 are the step points of grayscale at the edge. Symbols "a+" and "a−" were used to describe the left (x − ε, y) and right limits (x + ε, y) of point "a", respectively. The following conditions are true due to the grayscale step: I(x − ε, y) ≠ I(x + ε, y) and I′(x − ε, y) ≠ I′(x + ε, y). The functions are analyzed through a Taylor expansion about a point x_0 (Formula (2)). Assuming that x_0 = x + ε and that the image is differentiable at x_0, Formula (2) yields Formula (3); similarly, assuming that x_0 = x − ε, it yields Formula (4).
The "a+" class in Figure 6 is the same category as "a", and the "a−" class is a different category from "a". Formulas (3) and (4) show that if a pixel is on one side of the edge, the pixels that are more strongly correlated with it (i.e., pixels of the same class) must be morphologically distributed on the same side of the edge. Descriptors of pixel characteristics therefore need to reduce the impact of crossing the boundary during pixel clustering, and cannot gather statistical information with the pixel placed at the center of the window. Each pixel is treated as a potential edge pixel, and the main idea of SWF is that, when a pixel is at an edge position in the image, it is more appropriate to align the edge of the convolution window with the pixel than to align the center of the convolution window with it.
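The expansion that Formulas (2) to (4) refer to can be written out as a first-order Taylor expansion in x. The block below is a hedged reconstruction from the surrounding definitions, not a copy of the original equations: the little-o remainder notation and the use of I′ for the partial derivative with respect to x are assumptions.

```latex
\begin{align}
I(x, y) &= I(x_0, y) + I'(x_0, y)\,(x - x_0) + o(|x - x_0|) \\
x_0 = x + \varepsilon:\quad
I(x, y) &= I(x + \varepsilon, y) - \varepsilon\, I'(x + \varepsilon, y) + o(\varepsilon) \\
x_0 = x - \varepsilon:\quad
I(x, y) &= I(x - \varepsilon, y) + \varepsilon\, I'(x - \varepsilon, y) + o(\varepsilon)
\end{align}
```

Under this reading, the value at the current pixel is approximated from samples taken on a single side of the edge, which is consistent with the argument that a descriptor should not mix the two sides.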
Multiple weighted subtemplates were generated according to the aforementioned SWF idea, the edge or corner positions of these subwindows were aligned with the current binarized pixel points, and the convolution result of each subtemplate was obtained. According to the difference between the convolution result and the currently processed pixel, the pixels with the smallest difference from the current pixel in the convolution result are similar pixels, and the pixels with the largest difference are dissimilar pixels. Thus the local window binary classification is achieved. The regions of the two categories are more likely to contain pixels with larger grayscale variations; in other words, greater computational weights are assigned to these pixels.
A set of test images was generated to verify the improvement (Figure 7). The resolution of each image was 21 × 21, and the foreground pixels were gradually expanded in the diagonal direction. Each image was used as a local window. The Box form method was compared with the SW form method. The Box form output binarization threshold is the average value of the window, with all pixels participating in the threshold calculation. In contrast, the SW method uses the identified similar and dissimilar regions to calculate the threshold. The amount of local information is quantified by the within-class variance, which is computed from N_f, the number of foreground pixels, N_b, the number of background pixels, and t, the binarization threshold. The within-class variance of the two methods is shown in Figure 8, where it can be seen that the SW results are always larger than the Box results, indicating that SW is able to retain more information.

Implementation of the Proposed Method
The algorithm flow of this paper is shown in Figure 9. The eight-dimensional SWF convolution kernel was defined. Each convolution result was output to be compared with the current pixel for clustering. Positive- and negative-class pixels were obtained according to the difference. Side window information was used as reference to calculate the binarization threshold. The threshold of each pixel was smoothed with its similar pixels, and finally, thd_map_refine was used as the threshold to binarize the input image.

Figure 9. Flow chart of the proposed algorithm.

Definition of Side Window Core
According to the SWF idea described in Section 2, the current pixel must be placed at the edge or corner of each descriptor subtemplate so that the templates retain morphological continuity. The local window of a pixel is defined as a (2r + 1) × (2r + 1) square window, and eight-dimensional subtemplates were used to describe the local features. The pixel was aligned with the edge of the template to generate four convolution kernels of Left (L), Right (R), Up (U), and Down (D), and the current pixel was aligned at the corner of the template to generate the Southwest (SW), Southeast (SE), Northeast (NE), and Northwest (NW) convolution kernels, as shown in Figure 10. The image was convolved with each of the eight kernels, and the eight corresponding outputs were obtained. The radius r of the convolution kernel is a hyperparameter that can be flexibly defined according to the target.
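The eight side-window templates described above can be sketched in plain Python as follows. Several points are assumptions rather than details from the paper: the kernel weights are taken to be uniform (a box average over each side window), the index order `KERNEL_ORDER` is chosen so that NE is the sixth kernel as in the example given in Step 1, image borders are handled by replicating edge pixels, and the masks are applied without flipping. The names `side_window_kernels` and `side_window_outputs` are hypothetical.

```python
KERNEL_ORDER = ("L", "R", "U", "D", "NW", "NE", "SW", "SE")  # assumed index order


def side_window_kernels(r):
    """Return eight (2r+1) x (2r+1) masks, one per side window, each summing to 1."""
    size = 2 * r + 1
    low, high, full = range(0, r + 1), range(r, size), range(size)
    # (rows, cols) covered by each side window; row 0 is the top (north) row.
    regions = {
        "L": (full, low), "R": (full, high),
        "U": (low, full), "D": (high, full),
        "NW": (low, low), "NE": (low, high),
        "SW": (high, low), "SE": (high, high),
    }
    kernels = {}
    for name in KERNEL_ORDER:
        rows, cols = regions[name]
        weight = 1.0 / (len(rows) * len(cols))  # uniform weights: an assumption
        mask = [[0.0] * size for _ in range(size)]
        for i in rows:
            for j in cols:
                mask[i][j] = weight
        kernels[name] = mask
    return kernels


def side_window_outputs(image, y, x, r):
    """Apply the eight masks at pixel (y, x); borders replicate the edge pixels."""
    height, width = len(image), len(image[0])
    outputs = []
    for mask in side_window_kernels(r).values():
        total = 0.0
        for i, mask_row in enumerate(mask):
            for j, weight in enumerate(mask_row):
                yy = min(max(y + i - r, 0), height - 1)
                xx = min(max(x + j - r, 0), width - 1)
                total += weight * image[yy][xx]
        outputs.append(total)
    return outputs


# Vertical step edge: columns 0-2 are dark, columns 3-4 are bright.
step_edge = [[10, 10, 10, 200, 200] for _ in range(5)]
outputs = side_window_outputs(step_edge, y=2, x=2, r=1)
```

For the dark pixel next to the edge, the L window stays entirely on the dark side while the R window straddles the edge, which is the behavior the side-window alignment is meant to provide.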

Step 1: Coarse Threshold Calculation
Morphological clustering of the local window was performed on the SWF outputs as follows. The output s with the smallest difference from the current pixel is of the same type as that pixel and lies on the same side of the edge, whereas the output s with the largest difference is of the dissimilar type and lies on the opposite side of the edge. The differences were quantified using the L1 distance:
$$\Delta s_{ij} = \lvert s_{ij} - q_i \rvert, \quad j = 1, \dots, 8,$$
where $q_i$ is the grayscale of the current pixel and $s_{ij}$ is the output of the j-th kernel. At an edge, the output of the same type should be equal to or as close as possible to the input, whereas the output of the dissimilar type should be far from the input. Therefore, the side window outputs with the minimum and maximum L1 distance to the input intensity were chosen as the clustering outputs.
The descriptor {$s_N$, $s_P$, $\Delta s_N$, $\Delta s_P$, $w$} of each pixel is calculated from the convolution outputs, where $s_N$ is the SWF output with the largest difference from the current pixel; $s_P$ is the SWF output with the smallest difference; $\Delta s_N$ and $\Delta s_P$ are the corresponding differences from the current pixel; and $w$ is the index of the kernel of the same type. For example, if the NE output has the minimum difference from the current pixel, then $w = 6$. In Figure 11, the grayscale of pixel q1 is 93, and its SWF outputs are $s_{q1}$ = {95.8, 60.8, 71.0, 90.4, 68.8, 56.6, 93.2, 83.4}. The maximum difference (36.4) belongs to the NE output and the minimum difference (0.2) to the SW output, so the NE region is the dissimilar class and the SW region is the similar class. For pixel q2, D is the dissimilar class and NE is the similar class.
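The q1 example can be checked with a few lines of code. This is a sketch under two assumptions: the outputs are listed in the order L, R, U, D, NW, NE, SW, SE (the order of the side window set in Algorithm 1, consistent with w = 6 for NE), and w is 1-based. The function name `describe_pixel` is hypothetical.

```python
import numpy as np

# Assumed output order; 1-based position gives the kernel index w
KERNEL_ORDER = ["L", "R", "U", "D", "NW", "NE", "SW", "SE"]

def describe_pixel(q, s):
    """Return the descriptor {s_N, s_P, ds_N, ds_P, w} for one pixel."""
    s = np.asarray(s, dtype=float)
    d = np.abs(s - q)              # L1 distance of each output to the pixel
    p = int(np.argmin(d))          # same-type (positive) side window
    n = int(np.argmax(d))          # dissimilar (negative) side window
    return {
        "s_N": s[n], "s_P": s[p],
        "ds_N": d[n], "ds_P": d[p],
        "w": p + 1,                # 1-based index of the same-type kernel
        "positive": KERNEL_ORDER[p],
        "negative": KERNEL_ORDER[n],
    }

desc = describe_pixel(93, [95.8, 60.8, 71.0, 90.4, 68.8, 56.6, 93.2, 83.4])
print(desc["positive"], desc["negative"], desc["w"])  # SW NE 7
```

The result agrees with Figure 11: SW is the positive class of q1 and NE is its negative class.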
Figure 11. SWF clustering visualization. SW is the positive class of q1, NE is the negative class of q1, NE is the positive class of q2, and D is the negative class of q2.
If the current pixel lies in a smooth area, the differences among the SWF outputs are very small, indicating that the local window possibly contains only one class of pixels rather than both foreground and background classes. This is a common problem in local binarization algorithms and may lead to oversegmentation. To address it, the local contrast was calculated according to Equation (8), and a hyperparameter, the local contrast threshold h_c, was defined. When the local contrast is less than h_c, a preset threshold is used for the pixel. This can be a constant binarization threshold; the global Otsu threshold is recommended.
$$\mathit{local\_contrast} = \Delta s_N - \Delta s_P \quad (8)$$
If the contrast is high enough to satisfy the threshold, that is, foreground and background classes are both present locally, the threshold of the current pixel is
$$\mathit{thd\_map}_{ch=0,\,i} = \frac{s_N + s_P}{2},$$
and the index of the same-type SWF convolution kernel is recorded as $\mathit{thd\_map}_{ch=1,\,i} = w$, where i is the index of the pixel, ch = 0 denotes the threshold channel of thd_map, and ch = 1 denotes the channel that holds the index of the similar convolution kernel. All pixels were traversed to obtain the thd_map of the entire image.
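The coarse threshold step can be sketched in vectorized form as below. Several points are assumptions rather than statements from the paper: the function name `coarse_threshold` is hypothetical, the side window outputs are passed in as an array `s` of shape (8, H, W), the local contrast is taken as the difference between the largest and smallest L1 distance, and the kernel index is stored 0-based rather than 1-based.

```python
import numpy as np

def coarse_threshold(q, s, h_c, preset):
    """Coarse per-pixel threshold from side window outputs.

    q: (H, W) gray image; s: (8, H, W) side window outputs;
    h_c: local contrast threshold; preset: fallback threshold (e.g. Otsu).
    """
    d = np.abs(s - q[None])                 # L1 distance to the current pixel
    w = np.argmin(d, axis=0)                # same-type kernel index (0-based here)
    n = np.argmax(d, axis=0)                # dissimilar kernel index
    s_P = np.take_along_axis(s, w[None], axis=0)[0]
    s_N = np.take_along_axis(s, n[None], axis=0)[0]
    contrast = d.max(axis=0) - d.min(axis=0)  # assumed form of the local contrast
    low = contrast < h_c                      # single-class (smooth) pixels
    thd = np.where(low, preset, (s_N + s_P) / 2.0)
    return thd, w, low

# Tiny example: one edge pixel and one smooth pixel
q = np.array([[0.2, 0.8]])
s = np.full((8, 1, 2), 0.5)
s[6, 0, 0], s[5, 0, 0] = 0.2, 0.9   # pixel 0: SW output is close, NE output is far
s[:, 0, 1] = 0.8                    # pixel 1: all outputs equal the pixel
thd, w, low = coarse_threshold(q, s, h_c=0.05, preset=0.4)
print(thd, w, low)
```

Returning the low-contrast mask alongside the two channels keeps the information needed to exclude those pixels from the later smoothing.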

Step 2: Threshold Refinement
Refinement of the obtained thd_map was performed according to property (2) mentioned in Section 2, namely that locally similar pixels should have morphologically continuous and similar thresholds. From the previous step, the index of the same-type kernel of each pixel is already known, and the pixels covered by that template are regarded as pixels of the same type. The thresholds of these same-class pixels are used to mean-filter the threshold of the current pixel by Formula (9). Low-contrast pixels in the same template region do not participate in the smoothing calculation.
$$\mathit{thd\_map\_refine}_i = \frac{1}{N} \sum_{j \in S} \mathit{thd}_j \quad (9)$$
where thd_map_refine is the threshold after refinement, i is the pixel index, N is the number of pixels contained in the similar template, j is the pixel index within the similar template, S is the similar region, and $\mathit{thd}_j$ is the threshold value of each pixel in the similar region. The threshold thd_map_refine was used as the final threshold to binarize each pixel and obtain the entire binary_image.
Figure 12 shows the binarization-threshold heatmaps calculated by three local binarization methods, namely the Bernsen [23], Sauvola, and proposed methods, on the degraded-image test set. The three methods use the same local window size. As can be seen from Figure 12, compared with the Bernsen and Sauvola methods, the threshold distribution obtained by the proposed method is closer to the original image in morphology, indicating that it is more sensitive to changes in image morphology. The quantitative test results are further discussed in Section 4. Details of the procedure are described in Algorithm 1.
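The refinement step can be illustrated with a direct, unoptimized loop. This is a sketch under stated assumptions: the names `window_bounds` and `refine_threshold` are hypothetical, kernel indices are 0-based in the order L, R, U, D, NW, NE, SW, SE, windows are clipped at the image border, and low-contrast pixels keep their preset threshold unchanged (the text only states that they are excluded from the averages of other pixels).

```python
import numpy as np

def window_bounds(r):
    """Inclusive (row_min, row_max, col_min, col_max) offsets of each side
    window, in the order L, R, U, D, NW, NE, SW, SE."""
    return [(-r, r, -r, 0), (-r, r, 0, r), (-r, 0, -r, r), (0, r, -r, r),
            (-r, 0, -r, 0), (-r, 0, 0, r), (0, r, -r, 0), (0, r, 0, r)]

def refine_threshold(thd, w, low, r):
    """Mean-filter each threshold over its same-type side window.

    thd: (H, W) coarse thresholds; w: (H, W) same-type kernel index (0-based);
    low: (H, W) boolean mask of low-contrast pixels; r: window radius.
    """
    H, W = thd.shape
    bounds = window_bounds(r)
    out = thd.copy()
    for y in range(H):
        for x in range(W):
            if low[y, x]:
                continue  # assumption: low-contrast pixels keep the preset value
            r0, r1, c0, c1 = bounds[w[y, x]]
            ys = slice(max(y + r0, 0), min(y + r1, H - 1) + 1)
            xs = slice(max(x + c0, 0), min(x + c1, W - 1) + 1)
            valid = ~low[ys, xs]  # exclude low-contrast neighbors from the mean
            out[y, x] = thd[ys, xs][valid].mean()
    return out

thd = np.arange(1.0, 10.0).reshape(3, 3)
w = np.full((3, 3), 7)                 # every pixel uses the SE window
low = np.zeros((3, 3), dtype=bool)
print(refine_threshold(thd, w, low, r=1))
```

Because the current pixel always lies inside its own side window, the average is never taken over an empty set.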

Algorithm 1: Calculate threshold based on SWF
Input: $q_i$ is the grayscale of the target pixel i; $w^{S}_{ij}$ is the weight of pixel j, which is in the neighborhood of the target pixel i, based on kernel S; S = {L, R, U, D, NW, NE, SW, SE} is the set of side window indices; h_c and h_t are hyperparameters.
Output: binary_image.

Experiments
Extensive experiments were performed to evaluate the performance of the proposed method. This section first introduces the self-built test dataset, which simulates degraded in-orbit images. The proposed binarization method is then quantitatively compared with other classical algorithms. All the following work was implemented on a PC (I7-10710U at 4.7 GHz, 16 GB of RAM), and the simulation tool was MATLAB R2019a.

Datasets
A test system was designed to simulate the uneven on-orbit illumination environment, taking the Shenzhou spacecraft docking and cooperation marker as the target. The test system is shown in Figure 13. As the sunlight in outer space is intense and highly directional, a strong light was employed to simulate the sunlight, and the illuminance at the target exceeded 120,000 lx. Test images with different distributions of light and shadow were captured.
Figure 13. Dataset-generation platform.
The target background was made of antiatomic-oxygen flame-retardant cloth, whose appearance differs strongly under different illumination. To eliminate the influence of this difference on the quantitative assessment of the binarization effect, the mask area of the target was extracted, and the binarization results were quantitatively compared only within the mask area. The ground truth (GT) of the binarized image was obtained by manually fine-tuning the image captured under uniform illumination conditions, as shown in Figure 14. The images of the test set were captured by changing the lighting conditions while leaving the positional relationship between the marker and the camera unchanged, so the GT and each image in the test set can be considered aligned at the pixel level. To augment the test set, 1% salt and pepper noise was added to individual test-set images, and the final test set contained 7 uniformly illuminated images and 37 unevenly illuminated images.
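The noise augmentation can be reproduced approximately as follows. This is a sketch, not the authors' MATLAB code: the function name `add_salt_and_pepper` is hypothetical, and the even split between salt (255) and pepper (0) pixels is an assumption, since the text gives only the 1% density.

```python
import numpy as np

def add_salt_and_pepper(img, density=0.01, seed=None):
    """Corrupt roughly `density` of the pixels of an 8-bit image with
    salt (255) or pepper (0) noise, chosen with equal probability."""
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    corrupt = rng.random(img.shape) < density  # pixels to corrupt
    salt = rng.random(img.shape) < 0.5         # assumed 50/50 salt vs pepper
    noisy[corrupt & salt] = 255
    noisy[corrupt & ~salt] = 0
    return noisy

img = np.full((200, 200), 128, dtype=np.uint8)
noisy = add_salt_and_pepper(img, density=0.01, seed=0)
print(np.mean(noisy != img))  # fraction of corrupted pixels, close to 0.01
```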

Quantitative Evaluation
A total of 1 image with uniform illumination, 37 images with uneven illumination, and 1 image with noise were selected for testing. The proposed method was compared with three existing binarization techniques, namely Otsu, Bernsen, and Sauvola. Otsu is a global binarization method, while the other two are local.
Equation (10) gives the threshold of the Sauvola method, where $m(x, y)$ is the mean gray value of the local window, $s(x, y)$ is the standard deviation of the local window, $R$ is the dynamic range of the standard deviation, and $k$ is a positive scaling factor.
$$T(x, y) = m(x, y) \left[ 1 + k \left( \frac{s(x, y)}{R} - 1 \right) \right] \quad (10)$$
The Bernsen method computes the local threshold $T(x, y)$ from the local extrema $I_{max}$ and $I_{min}$ within the neighboring window, with parameter $c$ being the contrast threshold. A preset threshold is used in uniform regions, where the local contrast $I_{max} - I_{min}$ is below $c$.
$$T(x, y) = \begin{cases} (I_{max} + I_{min})/2, & I_{max} - I_{min} \geq c \\ \text{preset}, & \text{otherwise} \end{cases}$$
Both of the above local methods use two hyperparameters that strongly influence the binarization results. For convenience of comparison, the local window size of the three local binarization methods (Bernsen, Sauvola, and the proposed algorithm) was set to 21. Bernsen's local contrast threshold c was 15. Sauvola is more sensitive to local contrast, so two values, k = 0.5 and k = 0.8, were chosen. The h_c of the proposed algorithm was 0.05. Bernsen and the proposed algorithm both use the threshold calculated by the Otsu algorithm in uniform areas. Due to space limitations, only the renderings and quantitative evaluation results of part of the test set are presented in this paper. Figure 15 shows the comparison renderings of some images in the test set using the four algorithms.
The binarization results of the four algorithms were then qualitatively compared. The Otsu method loses more information when the image has an uneven gray distribution and achieves better results when the brightness is uniform. The Bernsen algorithm achieves better results for images with uneven grayscale, but the Sample 1 + Noise results show that it is affected by noise more than the other methods. The Sauvola method is greatly affected by local contrast, and it is difficult for a single set of parameters to handle images with relatively large contrast differences. The method proposed in this paper adapts better to different illuminations and retains most of the target features on uniformly illuminated, unevenly illuminated, and noisy images.
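For reference, the two baseline thresholds can be sketched with NumPy. The code uses the commonly cited textbook forms of the Sauvola and Bernsen thresholds; the function names, the edge padding at the image border, and the default R = 128 for 8-bit images are assumptions that are not stated in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_windows(img, size):
    """Return a (H, W, size, size) view of edge-padded local windows."""
    r = size // 2
    padded = np.pad(np.asarray(img, dtype=float), r, mode="edge")
    return sliding_window_view(padded, (size, size))

def sauvola_threshold(img, size=21, k=0.5, R=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1)), m and s are local mean and std."""
    win = local_windows(img, size)
    m = win.mean(axis=(-2, -1))
    s = win.std(axis=(-2, -1))
    return m * (1.0 + k * (s / R - 1.0))

def bernsen_threshold(img, size=21, c=15.0, preset=128.0):
    """Bernsen: mid-range of local extrema, preset value in uniform regions."""
    win = local_windows(img, size)
    i_max = win.max(axis=(-2, -1))
    i_min = win.min(axis=(-2, -1))
    return np.where(i_max - i_min < c, preset, (i_max + i_min) / 2.0)

# A vertical step edge: dark left half, bright right half
step = np.zeros((5, 6))
step[:, 3:] = 200.0
print(bernsen_threshold(step, size=3, c=15.0, preset=128.0))
print(sauvola_threshold(step, size=3, k=0.5, R=128.0))
```

In the comparison described above, the preset value for Bernsen would be the global Otsu threshold rather than a constant.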
The evaluation used an ensemble of measures that have been adopted in recent international binarization competitions: FM (F-measure), PSNR (peak signal-to-noise ratio), and DRD (distance reciprocal distortion). These metrics quantify the similarity between the resulting binarized image and the GT image.
FM is the harmonic mean of precision (P) and recall (R) and reflects the overall binarization accuracy. High values of these three measures indicate closer agreement between the binarized image $I_B$ and the ideal binary image $I_{GT}$. The best result is achieved when FM is 1.
$$FM = \frac{2 \times P \times R}{P + R}, \quad P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}$$
where TP, FP, and FN denote the numbers of true-positive, false-positive, and false-negative pixels, respectively.
PSNR measures how close a binary image is to the GT image, with higher values indicating better results.
$$PSNR = 10 \log_{10} \frac{C^2}{MSE}, \quad MSE = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left( I_B(x, y) - I_{GT}(x, y) \right)^2$$
Here M × N is the image size. Note that the difference between foreground and background equals C (C = 255).
DRD was introduced by Lu et al. and has been used to measure the visual distortion in binary images [24]. This measure focuses more on how the image is perceived by humans. The calculation formula is as follows:
$$DRD = \frac{\sum_{k} DRD_k}{NUBN}, \quad DRD_k = \sum_{i=-2}^{2} \sum_{j=-2}^{2} \left| I_{GT}(x + i, y + j) - B(x, y) \right| \times W_{Nm}(i, j)$$
where the sum runs over all flipped pixels, NUBN is the count of the 8 × 8 blocks that are not all black or all white pixels in the GT image, and $DRD_k$ is the distortion of the k-th flipped pixel at (x, y) in the binarization result image B, computed using a 5 × 5 normalized weight matrix $W_{Nm}$ as defined in [24].
In contrast to FM and PSNR, a better binarization yields a lower DRD value.
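The three measures can be sketched for boolean images as follows. The sketch follows the definitions commonly used in binarization competitions; the function names are hypothetical, True is assumed to mark foreground (positive) pixels, and the edge padding in DRD and the handling of partial 8 × 8 blocks are assumptions. Masking to the target area, as done in the experiments, is omitted.

```python
import numpy as np

def f_measure(pred, gt):
    """F-measure from true/false positives and false negatives."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    if tp == 0:
        return 0.0
    p, r = tp / (tp + fp), tp / (tp + fn)
    return float(2 * p * r / (p + r))

def psnr(pred, gt, C=255.0):
    """PSNR with foreground/background difference C."""
    diff = C * (np.asarray(pred, float) - np.asarray(gt, float))
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(C ** 2 / mse))

def drd(pred, gt):
    """Distance reciprocal distortion with a 5 x 5 normalized weight matrix."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    ii, jj = np.mgrid[-2:3, -2:3]
    dist = np.sqrt(ii ** 2 + jj ** 2)
    w = np.zeros((5, 5))
    w[dist > 0] = 1.0 / dist[dist > 0]   # inverse distance, zero at the center
    w /= w.sum()
    gt_pad = np.pad(gt.astype(float), 2, mode="edge")  # border handling: assumption
    total = 0.0
    for y, x in zip(*np.nonzero(pred != gt)):          # flipped pixels
        block = gt_pad[y:y + 5, x:x + 5]               # GT block centered at (y, x)
        total += np.sum(np.abs(block - float(pred[y, x])) * w)
    H, W = gt.shape
    nubn = 0                                           # non-uniform 8 x 8 GT blocks
    for y in range(0, H, 8):
        for x in range(0, W, 8):
            b = gt[y:y + 8, x:x + 8]
            nubn += int(b.any() and not b.all())
    return total / max(nubn, 1)  # guard against a uniform GT image

gt = np.zeros((8, 8), dtype=bool)
gt[:, :4] = True
pred = gt.copy()
pred[4, 4] = True   # one false-positive pixel next to the edge
print(f_measure(pred, gt), psnr(pred, gt), drd(pred, gt))
```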
The results in Tables 1-3 and Figures 16-18 show that on all test sets, the proposed algorithm outperformed the other algorithms in terms of F-measure, PSNR, and DRD. Compared with the other binarization algorithms, the quantitative metrics of the proposed algorithm fluctuate less across test images, which indicates that the proposed method is more adaptable to degraded images in addition to yielding higher binarization accuracy.

Running Time
The processing times of the Bernsen, Sauvola, and proposed methods were compared. Because the efficiency of a local binarization algorithm is mostly determined by the size of the local window, the efficiency for different window radii r was compared. A mono image with a resolution of 480 × 270 was used for testing. Although the proposed algorithm performs convolutions with multiple templates, SWF clustering can also be regarded as a kind of dimensionality reduction, which decreases the subsequent computational cost. In addition, the convolutions can be substantially accelerated, and the algorithm involves no time-consuming calculation such as the standard deviation. As shown in Figure 19, the proposed method is more efficient than the Bernsen and Sauvola methods for the different window sizes.
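One possible way to accelerate the side window convolutions is a summed-area table (integral image), which makes the cost of each side window mean independent of the radius. The paper does not say which acceleration the authors used, so this is only an illustration: the function name `side_window_means`, the uniform (box) kernel weights, and the edge padding are all assumptions.

```python
import numpy as np

def side_window_means(q, r):
    """Mean of each of the eight side windows for every pixel.

    Returns an (8, H, W) array in the order L, R, U, D, NW, NE, SW, SE.
    Uses a summed-area table, so the cost per pixel does not depend on r.
    """
    H, W = q.shape
    p = np.pad(np.asarray(q, dtype=float), r, mode="edge")
    ii = np.zeros((H + 2 * r + 1, W + 2 * r + 1))  # table with a zero border
    ii[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    # Inclusive (row_min, row_max, col_min, col_max) offsets per side window
    bounds = [(-r, r, -r, 0), (-r, r, 0, r), (-r, 0, -r, r), (0, r, -r, r),
              (-r, 0, -r, 0), (-r, 0, 0, r), (0, r, -r, 0), (0, r, 0, r)]
    out = np.empty((8, H, W))
    for k, (r0, r1, c0, c1) in enumerate(bounds):
        y0, y1 = r + r0, r + r1 + 1   # table rows for pixel row 0
        x0, x1 = r + c0, r + c1 + 1   # table columns for pixel column 0
        total = (ii[y1:y1 + H, x1:x1 + W] - ii[y0:y0 + H, x1:x1 + W]
                 - ii[y1:y1 + H, x0:x0 + W] + ii[y0:y0 + H, x0:x0 + W])
        out[k] = total / ((r1 - r0 + 1) * (c1 - c0 + 1))
    return out

rng = np.random.default_rng(0)
q = rng.random((6, 7))
s = side_window_means(q, r=2)
print(s.shape)  # (8, 6, 7)
```

Each window sum needs only four table lookups, whereas a direct convolution visits every pixel of the window.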

Conclusions
A binarization approach based on morphological clustering was proposed to solve the problem of binarizing degraded in-orbit images. The algorithm overcomes a shortcoming of traditional local binarization methods, which rarely consider morphological statistical information. The side window operator was used to extract the local morphological features of pixels for clustering, and the local threshold was calculated from the difference between the locally similar and dissimilar classes. Because the thresholds of similar pixels should be smooth and continuous, the threshold of each pixel was filtered with the thresholds of its similar pixels. The proposed algorithm was validated on a purpose-built test dataset that simulates degraded in-orbit images and allows binarization algorithms to be evaluated quantitatively. The experiments indicate that the algorithm is suitable for degraded-image binarization under in-orbit conditions and that, compared with the commonly used Otsu, Bernsen, and Sauvola methods, it achieves higher accuracy and greater robustness.

Data Availability Statement:
The data used to support the findings of this study are available from the corresponding author upon request.