Article

A Local Thresholding Algorithm for Image Segmentation by Using Gradient Orientation Histogram

College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(17), 9808; https://doi.org/10.3390/app15179808
Submission received: 17 July 2025 / Revised: 31 August 2025 / Accepted: 5 September 2025 / Published: 7 September 2025

Abstract

This paper proposes a new local thresholding method to further explore the relationship between gradients and image patterns. In most studies, the image gradient histogram is simply divided into K bins with equal intervals in angular space. Such empirical approaches may not fully capture the correlation information between pixels. In this paper, a variance-based criterion is applied to the gradient orientation histogram: it clusters pixels into subsets with different angular intervals. Analyzing each of these subsets, whose pixels share common patterns, helps to achieve the optimal thresholds for image segmentation. For the result assessments, the proposed algorithm is compared with other 1-D and 2-D histogram-based thresholding methods, as well as hybrid local–global thresholding methods. It is shown that the proposed algorithm can effectively recognize the common features of images that belong to the same category, and maintains stable performance as the number of thresholds increases. Furthermore, the processing time of the present algorithm is competitive with those of other algorithms, which shows its potential for application in real-time scenarios.

1. Introduction

Image processing has become an essential component in various applications, with image segmentation serving as a fundamental step that has been extensively applied in numerous fields [1,2,3,4,5]. Technically, threshold-based image segmentation has become the most frequently used method due to its simplicity, efficiency, and stability [6]. This classic segmentation technique separates an image into foreground and background based on pixel intensity. Although it has been studied for decades and image segmentation today is dominated by deep learning, it remains a critical preprocessing technique in many state-of-the-art deep learning-based image processing tasks [7,8,9,10,11,12,13]. Thresholding methods for pre-segmentation are widely used in various neural network-based image processing tasks, as exemplified by their use in (a) CT image segmentation to determine tumor regions [7], (b) dermoscopic image segmentation to determine infected lesion regions [8], (c) ultrasound image segmentation to detect the presence of follicles [10], (d) Geostationary Satellite (GOES) imagery for wildfire segmentation to calculate the ratio of fire pixels [12], and (e) cell segmentation, where they split microscopic images into segments [13]. In these applications, the quality of segmentation results directly impacts subsequent tasks. Therefore, further research on thresholding methods is essential.
Threshold-based segmentation can generally be categorized into global and local thresholding methods. The global methods apply a single or multiple thresholds to segment the entire image, offering computational efficiency and algorithmic simplicity. In 1985, Kapur [14] proposed the maximum Shannon entropy threshold method. This method selects an optimal threshold based on the image's one-dimensional (1-D) gray-level histogram, and then segments it into foreground and background regions. It was later extended to the Tsallis entropy by introducing the non-extensive parameter q [6]. Another typical global thresholding approach is the Otsu method [15], which determines the optimal threshold by maximizing the between-class variance of the foreground and background. Both approaches are automatic thresholding algorithms based on the 1-D gray-level histogram of an image. However, the 1-D histogram only counts the number of pixels that have the same gray-level values without considering the association information between different pixels, such as the spatial distribution information of pixels in the image. If the spatial correlation information between different pixels is nontrivial, the abovementioned global algorithms may not perform optimally [14]. To address the limitations of the 1-D histogram, Abutaleb [16] proposed a two-dimensional (2-D) histogram in 1989, which incorporates both the gray values of pixels and their local average gray values. This approach extended Kapur's method to the 2-D histogram, determining the optimal threshold vector by maximizing the sum of the 2-D Shannon entropies of the foreground and background regions, and performs particularly well on images with low signal-to-noise ratios (SNRs). Since then, many other 2-D histogram-based thresholding methods have been proposed by utilizing different objective functions [17,18,19,20,21]. However, the enhanced quality comes at the cost of increased computational complexity.
The computation time grows exponentially as the number of thresholds increases. Furthermore, traditional 2-D entropy methods tend to recognize the pixels located on the main diagonal of the 2-D histogram, which may result in the loss of edge information [22].
While global thresholding techniques are efficient and straightforward, local techniques offer higher segmentation quality [23,24]. The local thresholding methods first divide the image into several independent regions that satisfy a given homogeneity criterion [25,26], and these regions are subsequently segmented using region-specific thresholds. These methods incorporate pixel-wise correlation information [27,28,29], including backscatter and texture measures [30], statistical measures such as mean intensity and standard deviation [31], and color and distance [32]. The local thresholding methods employ feature-based preprocessing prior to segmentation, and this hierarchical approach effectively reduces segmentation error, enhances noise resistance, and achieves superior image quality.
Therefore, incorporating additional image feature information can significantly enhance segmentation quality. Among these features, texture information [33], which partially reflects edge characteristics and can be quantified through gradient-based measurements, is widely utilized in image processing applications [34,35,36,37,38,39,40,41,42,43,44,45,46]. Crucially, such edge information of an image captures the outlines, structures, and transitions of objects in the image. It typically corresponds to regions where the gray values change dramatically in the image, and plays an important role in separating different areas or objects. The extraction of texture information would be helpful to keep the predominant patterns of an image. Specifically, for texture-rich images, proper utilization of such information may play a crucial role in segmentation tasks. Therefore, a unified approach that can effectively act on different texture-rich image categories is important to reveal the relationships between texture patterns and pixel gradient clustering.
While grayscale gradient histograms have long been used in image segmentation, this practice is often based on empirical experience, and the implications of their distribution are rarely discussed. This paper analyzes the relationship between grayscale gradient orientation histogram and image patterns from a fundamental point of view, and verifies the reliability of this relationship using a simple method. This approach is expected to further understand the generating of digital images and provide beneficial support to prevailing image analysis algorithms, such as deep learning.
In this paper, we propose a local thresholding method that utilizes a gradient orientation histogram to extract local texture features from images. Different from separating the gradient orientation histogram into K bins [22,40], this method clusters pixels into local subsets by the variance of their orientations. Local thresholding is then applied to each subset. The remainder of this paper is organized as follows: Section 2 briefly introduces the pixel-wise gradient orientation and reviews the 1-D Otsu method. Section 3 presents the construction of the 2-D gradient orientation histogram and the process of the local thresholding method. Section 4 reports on the experimental results from two datasets and discussions. In Section 5, the conclusions are presented.

2. Related Works

2.1. Pixel-Wise Gradient Orientation

The pixel-wise gradient orientation [47,48,49,50] serves as an effective descriptor for local structural features, particularly in texture characterization. One way to generate it is as follows: for a pixel located at position $(x, y)$, its gradients along the vertical and horizontal axes can be respectively defined as
$$\frac{\partial f(x,y)}{\partial x} = f(x+1,y) - f(x-1,y), \tag{1}$$
$$\frac{\partial f(x,y)}{\partial y} = f(x,y+1) - f(x,y-1), \tag{2}$$
where $f(x,y)$ represents the gray-level value of the pixel at $(x,y)$.
Then the orientation of this pixel can be calculated by
$$\theta(x,y) = \arctan\frac{\partial f/\partial x}{\partial f/\partial y}, \tag{3}$$
where the domain of $\theta(x,y)$ is $[-\pi/2, \pi/2]$.
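As an illustration, Equations (1)–(3) can be evaluated for all interior pixels at once. The following NumPy sketch is our own vectorized reading of these definitions (border handling and the treatment of a zero denominator are implementation choices not specified here); it also rounds the angle to whole degrees, as done later in Equation (10):

```python
import numpy as np

def gradient_orientation(f):
    """Pixel-wise gradient orientation in degrees, for interior pixels.

    Central differences follow Eqs. (1)-(2); border pixels are omitted
    here for brevity (border handling is an implementation choice).
    """
    f = np.asarray(f, dtype=np.float64)
    gx = f[2:, 1:-1] - f[:-2, 1:-1]   # df/dx = f(x+1,y) - f(x-1,y)
    gy = f[1:-1, 2:] - f[1:-1, :-2]   # df/dy = f(x,y+1) - f(x,y-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        theta = np.degrees(np.arctan(gx / gy))
    # Where df/dy = 0, assign +/-90 degrees by the sign of df/dx; 0/0 maps to 0.
    theta = np.where(gy == 0,
                     np.where(gx > 0, 90.0, np.where(gx < 0, -90.0, 0.0)),
                     theta)
    return np.rint(theta).astype(int)
```

For a linear gray-level ramp, every interior pixel receives the same orientation, which is a quick sanity check on the vectorization.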

2.2. 1-D Otsu Method

Define the range of gray levels in a given image of size $M \times N$ as $i = 0, 1, 2, \ldots, L-1$, where $L$ represents the number of gray levels of the image, such as 256. Then the gray-level histogram of the image can be computed as the probability distribution
$$p_i = \frac{n_i}{M \times N}, \qquad p_i \ge 0, \qquad \sum_{i=0}^{L-1} p_i = 1, \tag{4}$$
where $n_i$ is the number of pixels whose gray-level value equals $i$.
Assuming that the gray-level histogram of the image is divided into two classes, $C_0$ and $C_1$, by a threshold at level $t$, the probabilities of class occurrence and the class mean gray-level values are given by
$$w_0 = \sum_{i=0}^{t} p_i, \qquad w_1 = \sum_{i=t+1}^{L-1} p_i, \tag{5}$$
and
$$\mu_0 = \frac{\sum_{i=0}^{t} i\,p_i}{w_0}, \qquad \mu_1 = \frac{\sum_{i=t+1}^{L-1} i\,p_i}{w_1}. \tag{6}$$
Thus, the mean gray-level value of the entire image is
$$\mu_T = \sum_{i=0}^{L-1} i\,p_i. \tag{7}$$
The between-class variance can be represented by
$$\sigma_B^2 = w_0(\mu_0 - \mu_T)^2 + w_1(\mu_1 - \mu_T)^2 = w_0 w_1(\mu_0 - \mu_1)^2. \tag{8}$$
The optimal threshold is obtained by maximizing Equation (8), i.e.,
$$t^* = \arg\max_{0 \le t \le L-1} \sigma_B^2(t). \tag{9}$$
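The search in Equation (9) can be carried out over cumulative sums of the histogram. Below is a minimal sketch of the 1-D Otsu method; the closed form $(\mu_T w_0 - \mu)^2/(w_0 w_1)$ used in the code is algebraically identical to Equation (8), and the vectorization is our own choice:

```python
import numpy as np

def otsu_threshold(image, L=256):
    """1-D Otsu: return the threshold t* maximizing sigma_B^2(t), Eq. (9)."""
    image = np.asarray(image).astype(np.int64)
    p = np.bincount(image.ravel(), minlength=L).astype(np.float64)
    p /= p.sum()                       # gray-level distribution, Eq. (4)
    i = np.arange(L, dtype=np.float64)
    w0 = np.cumsum(p)                  # class probability w0(t), Eq. (5)
    w1 = 1.0 - w0
    mu = np.cumsum(i * p)              # cumulative first moment
    mu_T = mu[-1]                      # global mean, Eq. (7)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Closed form equivalent to Eq. (8): (mu_T*w0 - mu)^2 / (w0*w1)
        sigma_b2 = (mu_T * w0 - mu) ** 2 / (w0 * w1)
    # Empty classes (w0 = 0 or w1 = 0) yield NaN; treat them as variance 0.
    return int(np.argmax(np.nan_to_num(sigma_b2)))
```

For a strictly bimodal image, any threshold between the two modes maximizes the variance, and the function returns the first such level.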

3. Proposed Method

In order to enhance stability, the 1-D thresholding method can be extended to 2-D cases [18,51,52]. However, traditional 2-D thresholding methods may discard a considerable number of pixels, leading to the loss of nontrivial edge information. Consequently, a novel algorithm is proposed to avoid such cases so that the segmentation quality can be improved. For a texture-rich image, the edge pattern is deeply related to the gradient of the pixels' gray levels. Therefore, a 2-D gradient orientation histogram is adopted to cluster the image pixels into distinct regions; applying local thresholding to each region then demonstrates clear advantages.

3.1. Construction of 2-D Gradient Orientation Histogram

For a pixel located at position $(x, y)$ in a given image of size $M \times N$, its gray-level gradient orientation with respect to the horizontal axis is
$$\theta(x,y) = \mathrm{Round}\left(\mathrm{DEG}\left(\arctan\frac{\partial f/\partial x}{\partial f/\partial y}\right)\right), \tag{10}$$
where $\mathrm{DEG}(\mu)$ denotes the conversion of the radian angle $\mu$ to degrees, $\partial f/\partial x$ and $\partial f/\partial y$ can be calculated by Equations (1) and (2), and the domain of $\theta(x,y)$ is $[-90, 90]$. The pixel's gray-level value $f(x,y)$ and the gradient orientation $\theta(x,y)$ can be adopted to construct the 2-D gradient orientation histogram $h(m,n)$ by
$$h(m,n) = \mathrm{Count}\big(f(x,y) = m \ \&\ \theta(x,y) = n\big). \tag{11}$$
The 2-D probability distribution is determined by normalizing Equation (11), as follows:
$$p(m,n) = \hat{h}(m,n) = \frac{h(m,n)}{M \times N}. \tag{12}$$
Equation (12) represents the frequency of occurrence at gray level $m$ and gradient orientation $n$, where $m \in \{0, 1, 2, \ldots, L-1\}$ and $n \in \{-90, -89, \ldots, 89, 90\}$. Figure 1 shows the gradient orientation information for the image ‘board’.
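A possible construction of $h(m,n)$ and $p(m,n)$ is sketched below. For simplicity, this sketch uses only interior pixels and normalizes by the actual pixel count rather than by $M \times N$ (an assumption on our part; the two differ only by the border pixels):

```python
import numpy as np

def orientation_histogram_2d(f, L=256):
    """2-D gradient orientation histogram h(m, n) of Eq. (11), returned
    normalized as p(m, n) in the spirit of Eq. (12).

    Interior pixels only; normalization uses the counted pixels (an
    assumption -- Eq. (12) divides by M x N).
    """
    f = np.asarray(f, dtype=np.float64)
    gx = f[2:, 1:-1] - f[:-2, 1:-1]              # Eq. (1)
    gy = f[1:-1, 2:] - f[1:-1, :-2]              # Eq. (2)
    with np.errstate(divide="ignore", invalid="ignore"):
        theta = np.degrees(np.arctan(gx / gy))
    theta = np.where(gy == 0,
                     np.where(gx > 0, 90.0, np.where(gx < 0, -90.0, 0.0)),
                     theta)
    n = np.rint(theta).astype(int) + 90          # shift [-90, 90] to [0, 180]
    m = f[1:-1, 1:-1].astype(int)                # gray level of each pixel
    h = np.zeros((L, 181), dtype=np.float64)
    np.add.at(h, (m.ravel(), n.ravel()), 1.0)    # the Count(.) of Eq. (11)
    return h / h.sum()
```

The resulting array has one row per gray level and one column per integer orientation angle, and sums to one.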
It is important to note that the algorithm in this study was primarily designed to simulate real human visual perception, which has default horizontal and vertical reference directions. This approach is used to investigate the relationship between pixel gradient clustering and image texture patterns. Therefore, the impact of geometric image transformations on the algorithm was not comprehensively considered. The gradient orientation histogram would cyclically shift with the rotation of the image. Nevertheless, a global optimal clustering still exists under the Otsu method. It is of interest to further explore this issue for relevant complex scenes in future research.

3.2. Main Step of Segmentation

For a digital image of size $M \times N$, the normalized 2-D gradient orientation histogram $p(m,n)$ can be obtained via Equation (12). Different from the traditional 2-D histogram, the distribution of $p(m,n)$ is not concentrated at the main diagonal of the 2-D histogram, as shown in Figure 1. The probability distribution of each gradient orientation angle in the image can be obtained by
$$p_k^{\mathrm{ori}} = \sum_{i=0}^{L-1} p(i, k), \tag{13}$$
where $k$ ranges over $[-90, 90]$. Additionally, Figure 1d shows the 1-D distribution of gradient orientation angles for the image ‘board’. Using the Otsu method with a set of thresholds $t = (t_1, t_2, \ldots, t_j)$, the distribution of Equation (13) can be segmented into $j+1$ distinct parts, denoted as $C = (C_1, C_2, \ldots, C_{j+1})$, with each part comprising pixels that have similar gradient orientation angles. After normalization, the gray-level probability distributions of the $j+1$ classes are respectively defined as
$$C_1: \frac{p_{-90}^{\mathrm{ori}}}{P_1^{\mathrm{ori}}}, \frac{p_{-89}^{\mathrm{ori}}}{P_1^{\mathrm{ori}}}, \ldots, \frac{p_{t_1}^{\mathrm{ori}}}{P_1^{\mathrm{ori}}}; \quad C_2: \frac{p_{t_1+1}^{\mathrm{ori}}}{P_2^{\mathrm{ori}}}, \frac{p_{t_1+2}^{\mathrm{ori}}}{P_2^{\mathrm{ori}}}, \ldots, \frac{p_{t_2}^{\mathrm{ori}}}{P_2^{\mathrm{ori}}}; \quad \ldots; \quad C_{j+1}: \frac{p_{t_j+1}^{\mathrm{ori}}}{P_{j+1}^{\mathrm{ori}}}, \frac{p_{t_j+2}^{\mathrm{ori}}}{P_{j+1}^{\mathrm{ori}}}, \ldots, \frac{p_{90}^{\mathrm{ori}}}{P_{j+1}^{\mathrm{ori}}}, \tag{14}$$
where the cumulative probabilities of the $j+1$ classes are defined as
$$P_1^{\mathrm{ori}} = \sum_{k=-90}^{t_1} p_k^{\mathrm{ori}}, \quad P_2^{\mathrm{ori}} = \sum_{k=t_1+1}^{t_2} p_k^{\mathrm{ori}}, \quad \ldots, \quad P_{j+1}^{\mathrm{ori}} = \sum_{k=t_j+1}^{90} p_k^{\mathrm{ori}}. \tag{15}$$
The mean gradient orientation angles of the $j+1$ classes are as follows:
$$\mu_1^{\mathrm{ori}} = \frac{1}{P_1^{\mathrm{ori}}}\sum_{k=-90}^{t_1} k\, p_k^{\mathrm{ori}}, \quad \mu_2^{\mathrm{ori}} = \frac{1}{P_2^{\mathrm{ori}}}\sum_{k=t_1+1}^{t_2} k\, p_k^{\mathrm{ori}}, \quad \ldots, \quad \mu_{j+1}^{\mathrm{ori}} = \frac{1}{P_{j+1}^{\mathrm{ori}}}\sum_{k=t_j+1}^{90} k\, p_k^{\mathrm{ori}}. \tag{16}$$
The entire image's mean gradient orientation angle can be obtained by
$$\mu_T^{\mathrm{ori}} = \sum_{k=-90}^{90} k\, p_k^{\mathrm{ori}}. \tag{17}$$
The between-class variance of orientation can be represented by
$$\hat{\sigma}_B^2(t_1, t_2, \ldots, t_j) = P_1^{\mathrm{ori}}(\mu_1^{\mathrm{ori}} - \mu_T^{\mathrm{ori}})^2 + P_2^{\mathrm{ori}}(\mu_2^{\mathrm{ori}} - \mu_T^{\mathrm{ori}})^2 + \cdots + P_{j+1}^{\mathrm{ori}}(\mu_{j+1}^{\mathrm{ori}} - \mu_T^{\mathrm{ori}})^2. \tag{18}$$
Maximizing the objective function $\hat{\sigma}_B^2(t_1, t_2, \ldots, t_j)$ yields the optimal set of thresholds as follows:
$$t^* = \arg\max_{-90 \le t_1 < t_2 < \cdots < t_j \le 90} \hat{\sigma}_B^2(t_1, t_2, \ldots, t_j). \tag{19}$$
Using the optimal threshold vector $t^*$, the 2-D gradient orientation histogram $p(m,n)$ is clustered into $j+1$ parts with the largest between-class variance. Each part contains pixels with similar gradient orientation information. Figure 2 shows an example; with the optimal vector $t^* = [-35, 34]$, the pixels of each gradient orientation part can be sorted by gray-level value to yield the corresponding histogram. It indicates that pixels within these clustered regions exhibit similar texture patterns. Processing these regions separately may enhance feature recognition and improve segmentation quality.
After clustering the image, the Otsu method mentioned in Section 2.2 is used to perform local thresholding segmentation on these j + 1 regions. These j + 1 segmentation results are then combined to obtain the final segmentation result. Figure 3 illustrates the overall segmentation process.
The pseudocode for the proposed algorithm is presented in Algorithm 1.
Algorithm 1 Image Segmentation based on Gradient Orientation Histogram
   Input: A grayscale image f.
   Output: A final segmented image S_final.
 1: procedure SEGMENTATION BASED ON PIXEL GRADIENT CLUSTERING(f)
 2:     ▹ Step 1: Construct and normalize the 2-D gradient orientation histogram
 3:     Calculate the 2-D gradient orientation histogram h(m, n) of the image f, where m and n respectively represent the gray-level value and the gradient orientation of the pixel located at position (x, y).
 4:     Obtain the probability distribution p(m, n) by normalizing the histogram: p(m, n) = h(m, n) / Σ_{m,n} h(m, n).
 5:     ▹ Step 2: Calculate the 1-D gradient orientation histogram
 6:     Calculate the 1-D gradient orientation histogram p_k^ori by accumulating the 2-D histogram along the gray-level dimension: p_k^ori ← Σ_m p(m, k).
 7:     ▹ Step 3: Cluster the image into regions based on gradient orientation
 8:     Segment the distribution p_k^ori into j + 1 distinct parts by Equation (19).
 9:     Cluster the image into j + 1 regions R = {R_1, …, R_{j+1}}, where each region contains pixels with similar gradient orientation angles.
10:     ▹ Step 4: Segment each region using thresholds
11:     Initialize a set of local segmentation results S_local.
12:     for all regions R_i ∈ R do
13:         Obtain the gray-level histogram h_i for region R_i.
14:         Use the Otsu method to segment R_i based on h_i to obtain a local segmentation result S_i.
15:         Add S_i to S_local.
16:     end for
17:     ▹ Step 5: Splice local segmentation results to obtain the final result
18:     Splice all local segmentation results in S_local to obtain the final segmentation image S_final.
19:     return S_final
20: end procedure
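Under our own assumptions (exhaustive search over the orientation thresholds, binary Otsu within each region, interior pixels only, and rendering each class by its mean gray level), Algorithm 1 can be sketched end-to-end as follows; the paper's actual implementation may differ in these details:

```python
import numpy as np
from itertools import combinations

def multi_otsu(p, values, j):
    """Exhaustive Otsu over a 1-D distribution p on `values`: return the j
    cut indices maximizing the between-class variance (Eqs. (18)-(19))."""
    n = len(p)
    mu_T = float(np.dot(values, p))
    best, best_cuts = -1.0, None
    for cuts in combinations(range(n - 1), j):
        edges = (-1,) + cuts + (n - 1,)
        var = 0.0
        for a, b in zip(edges[:-1], edges[1:]):
            P = float(p[a + 1:b + 1].sum())
            if P > 0:
                mu = float(np.dot(values[a + 1:b + 1], p[a + 1:b + 1])) / P
                var += P * (mu - mu_T) ** 2
        if var > best:
            best, best_cuts = var, cuts
    return best_cuts

def segment(f, j=2, L=256):
    """End-to-end sketch of Algorithm 1 (interior pixels only)."""
    f = np.asarray(f, dtype=np.float64)
    # Steps 1-2: gradient orientation (degrees) and its 1-D distribution.
    gx = f[2:, 1:-1] - f[:-2, 1:-1]
    gy = f[1:-1, 2:] - f[1:-1, :-2]
    with np.errstate(divide="ignore", invalid="ignore"):
        theta = np.degrees(np.arctan(gx / gy))
    theta = np.where(gy == 0,
                     np.where(gx > 0, 90.0, np.where(gx < 0, -90.0, 0.0)),
                     theta)
    theta = np.rint(theta).astype(int)
    gray = f[1:-1, 1:-1].astype(int)
    angles = np.arange(-90, 91, dtype=float)
    p_ori = np.bincount(theta.ravel() + 90, minlength=181).astype(float) / theta.size

    # Step 3: cluster pixels into j + 1 orientation classes via Eq. (19).
    cuts = multi_otsu(p_ori, angles, j)
    labels = np.digitize(theta, angles[list(cuts)], right=True)

    # Steps 4-5: binary Otsu inside each cluster, spliced into one image.
    out = np.zeros_like(gray, dtype=np.float64)
    for r in range(j + 1):
        mask = labels == r
        if not mask.any():
            continue
        pg = np.bincount(gray[mask], minlength=L).astype(float) / mask.sum()
        (t,) = multi_otsu(pg, np.arange(L, dtype=float), 1)
        # Render each class by its mean gray level (one common convention
        # for producing a thresholded image for PSNR/FSIM evaluation).
        for cls in (gray <= t, gray > t):
            sel = mask & cls
            if sel.any():
                out[sel] = gray[sel].mean()
    return out
```

The exhaustive search is affordable here because the orientation axis has only 181 bins; the paper notes that this cost grows exponentially with j.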

3.3. Comparison of Existing Similar Methods

Although the definitions of the pixels' gray-level gradient orientation in Table 1 are similar, the applications of such information in image processing are quite different. More specifically, the HOG is typically divided into several bins with equal intervals in angular space [22,38,40], while in the present work, the pixels are clustered into distinct subsets with different angular intervals by using the Otsu criterion. The relationships between images' visual patterns and the variance of gradient orientation can thus be further studied.

3.4. Image Test Sets and Quality Evaluation Parameters

Image texture not only characterizes local structural patterns but also effectively captures inter-object boundary information. Due to its important role in object recognition, texture features have been widely employed in image analysis. To further show the relationship between texture information and image segmentation, images with diverse texture characteristics are adopted in the experiments. Figure 4 displays three images with obvious texture features from the image library of MATLAB2015(8.6.0.267246), with resolutions of (a) 306 × 648 , (b) 189 × 250 , and (c) 232 × 205 .
Besides Figure 4, more real-world images from various scenes are essential to test the proposed algorithm. The Describable Textures Dataset (DTD, R1.0.1) [53] is an image dataset consisting of 47 different categories of real-world texture images, with each category named by an adjective. It is designed as a public benchmark and can be used to study the problem of extracting semantic properties of textures and patterns. The DTD dataset is available at [54]. In this paper, the segmentation experiment employs 81 ‘woven’, 119 ‘lacelike’, 107 ‘pitted’, and 119 ‘blotchy’ images. Here are a few examples from this dataset (Figure 5).
To evaluate the effectiveness of the local thresholding method proposed above, we adopt PSNR (Peak Signal-to-Noise Ratio) and FSIM (Feature Similarity Index) as quality indices. PSNR [55] represents the ratio of the peak signal to the noise. As it can precisely measure the difference between the input image and the output image, it is currently the most frequently used objective metric for evaluating image quality. A higher PSNR value indicates less distortion in the output image and better segmentation quality. It is defined as
$$PSNR = 10 \log_{10}\frac{(L-1)^2}{MSE}, \tag{20}$$
where $MSE$ is the mean squared error between the input image and the output image, and $L$ is the number of gray levels, so that $L-1$ is the maximum gray-level value, usually 255. $MSE$ can be written as
$$MSE = \frac{1}{M \times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\big[f(i,j) - g(i,j)\big]^2, \tag{21}$$
where $M \times N$ is the size of the image, and $f(i,j)$ and $g(i,j)$ represent the input image and the output image, respectively.
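A direct transcription of Equations (20) and (21), assuming 8-bit images so that $L - 1 = 255$:

```python
import numpy as np

def psnr(f, g, L=256):
    """PSNR (Eq. (20)) between input image f and output image g,
    with MSE as in Eq. (21); L - 1 = 255 for 8-bit images."""
    f = np.asarray(f, dtype=np.float64)
    g = np.asarray(g, dtype=np.float64)
    mse = np.mean((f - g) ** 2)
    if mse == 0:
        return float("inf")   # identical images: no distortion
    return 10.0 * np.log10((L - 1) ** 2 / mse)
```

For instance, a uniform error of 10 gray levels gives $MSE = 100$ and a PSNR of about 28.13 dB.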
FSIM [56] is proposed based on the principle that the human visual system understands an image mainly according to its low-level features. It is utilized to measure the similarity between two images based on phase congruency (PC) and gradient magnitude (GM). Here, PC serves as a dimensionless indicator of the importance of local structures and is used as the primary feature in FSIM; GM acts as the secondary feature. Both features play complementary roles in characterizing the local quality of an image. FSIM is defined as
$$FSIM = \frac{\sum_{x \in \Omega} S_L(x) \cdot PC_m(x)}{\sum_{x \in \Omega} PC_m(x)}, \tag{22}$$
where $\Omega$ denotes the whole image spatial domain, $S_L(x)$ is the local similarity at pixel $x$, and $PC_m(x)$ is the maximum PC value at pixel $x$. Please refer to [56] for more details about $S_L(x)$ and $PC_m(x)$.

4. Experimental Results and Discussions

In Figure 2, the image ‘board’ is clustered into three parts by the optimized vector $t^* = [-35, 34]$. It is worth noting that the segmentation results of the proposed method are quite stable under variation of the clustering. According to Equation (19), the image can be clustered into an arbitrary number of parts, i.e., $j = 2, 3, 4, \ldots$. Each part is segmented by using the Otsu method. Table 2 lists the segmentation results of the images ‘board’, ‘bag’, ‘tire’, ‘woven_0118’, ‘lacelike_0053’, ‘pitted_0009’, and ‘blotchy_0085’ with different numbers of clustering parts.
It is shown that the variations of PSNR and FSIM for the abovementioned images are all very small when j increases. Without loss of generality, all images are clustered into three parts, i.e., j = 2 , in the following experiments.
To demonstrate the performance of the proposed method, we tested images from a MATLAB image library (Figure 4) and the DTD dataset (Figure 5). The Kapur method [57], 1-D Otsu [58], 2-D Otsu [51], QOMVO-based Tsallis entropy ( q = 0.8 ) [59], self-adaptive Tsallis entropy [6], ACM-Tsallis [27], and the proposed method were applied to segment these images, with the numbers of thresholds set to 2, 3, 4, and 5.

4.1. Algorithm Performance and Computational Cost

Figure 6 demonstrates the two-level thresholding performances for the three images of Figure 4. It is found that the proposed method keeps more details than the others and avoids many mis-segmentations. Notably, this advantage persists as the number of thresholds increases and can be quantitatively illustrated. Table 3 presents the PSNR and FSIM results at different threshold levels for these three images. In each row, the maximum PSNR and FSIM values (in bold font) indicate that the corresponding method is the most suitable one for the image listed at the beginning of the row. The proposed method achieved the best PSNR and FSIM values for all three texture-rich images across all threshold levels. It performs better than 1-D histogram-based methods because it incorporates more pixel-wise correlation information for segmentation. Meanwhile, traditional 2-D methods neglect the contributions of off-diagonal pixels, leading to image information loss. In contrast, the proposed method utilizes gradient information from all pixels for segmentation, thereby achieving superior performance. This also indicates that, compared with the other two 2-D histogram-based methods, the gradient orientation histogram is more effective in analyzing the texture features of these images.
Except for QOMVO (1-D), ACM-Tsallis (2-D), and Adaptive (1-D), the other four algorithms are all performed with exhaustive search to ensure accuracy. Table 4 presents the comparative computational time costs across these four methods. All of them were implemented in Python and executed on a workstation equipped with dual Intel Xeon Gold 6268CL processors (Santa Clara, CA, USA; 2.80 GHz, 28 cores total) and 256 GB DDR4 RAM (2666 MHz). To ensure fair comparison, all methods were restricted to single-thread execution.
It is shown that the proposed method obtained better segmentation results while requiring less computational time than the 2-D Otsu method. It should be noted that the proposed method can adopt multiprocessing computation for segmenting the three clustered regions. Therefore, compared with traditional 1-D methods, it only incurs additional computational cost for constructing the 2-D gradient orientation histogram and the corresponding pixel clustering. It is worth mentioning that the computation time of the proposed method in Table 4 is based on $j = 2$, and this value will increase exponentially as $j$ increases. Nevertheless, as indicated by the results in Table 2, increasing $j$ contributes trivially to the quality indices. Therefore, $j = 2$ achieves both stable segmentation quality improvement and low computational consumption, showing the potential for application in real-time scenarios.

4.2. Consistency of Algorithm Performance with Increasing Threshold

To further validate the effectiveness of the proposed method, we applied the above seven methods to perform two-level, three-level, four-level, and five-level thresholding on images from the DTD dataset mentioned in Section 3.4.
Figure 7 shows parts of the two-level thresholding results of some images. It is evident that, for these four images, both 1-D histogram-based methods and 2-D methods yielded unsatisfactory segmentation results. Specifically, the 1-D Kapur method and three Tsallis-entropy-based methods failed to capture the texture details, leading to difficulties in segmenting and preserving complex and diverse texture lines in the original images. The traditional 2-D method only considers pixels along the main diagonal of the histogram, which may result in the loss of important texture information. In comparison, the proposed method effectively preserved the texture lines and features from the original images.
In order to show the performances of these seven algorithms objectively, different categories of the DTD dataset are adopted for the comparisons. For example, the category named ‘woven’ contains 81 images with similar texture structures. At a given number of thresholds, each image yields seven PSNR values from the above seven algorithms. The algorithm corresponding to the maximum PSNR value is considered the optimal processing method for the given image, and the count of this optimal algorithm increases by 1 accordingly. After all the test images are processed, the distribution of optimal PSNR among the seven algorithms can be obtained. It should be mentioned that two algorithms may occasionally reach the same maximum results; therefore, the total count over the seven algorithms may slightly exceed the total number of images in the category. In the same way, the distributions of FSIM can be obtained. Table 5 presents the distributions of optimal PSNR and FSIM among the seven algorithms for the different image categories. For each image category, the values in bold represent the highest frequencies of reaching the optimal segmentation quality, which shows the superiority of the corresponding algorithm.
For the ‘woven’ category, the proposed method achieved optimal PSNR values over all 81 images at threshold levels 2 and 3, and obtained optimal FSIM values in approximately 63 % of cases. This demonstrates that the method maintains both high segmentation quality and effective preservation of texture details. With the increasing number of thresholds, such as four and five, the optimal rate of PSNR for the proposed algorithm is still dominant, and the corresponding optimal rate of FSIM slightly increases. This suggests that the proposed algorithm is stable in both segmentation quality and feature preservation for this image category with increasing threshold numbers.
In order to show the effectiveness of the proposed algorithm, more image categories can be involved. For the ‘lacelike’ category, the optimal PSNR rate of the present algorithm is nearly 100% at threshold levels 2, 3, and 5, demonstrating exceptional overall performance. For the ‘pitted’ category, the optimal PSNR rates remain above 92%, and those yielded by different threshold levels fluctuate only within a small interval. For the ‘blotchy’ category, the proposed method still outperforms the other methods. From Figure 5, we can see that the texture patterns of the four image categories are quite different from each other. The proposed method reached an average optimal PSNR rate of up to 96%, showing its powerful ability in texture pattern recognition. Regarding the optimal FSIM rate, the proposed method also remains far ahead of the other typical algorithms. This superiority is unchanged across the different image categories and threshold levels.
Actually, the superiority of the proposed algorithm benefits from the pre-segmentation clustering based on local texture features, which closely depends on the pixels’ gradient orientation distribution. Although the four texture categories exhibit distinct visual characteristics, the proposed method consistently obtains satisfactory PSNR values across all categories. This shows that it effectively identifies texture patterns across diverse image categories, demonstrating strong scalability. FSIM evaluation shows that the proposed method has superior capability in preserving original image features across all categories. This validates that pixel gradient clustering contributes to analyzing complex texture distributions in different images. Furthermore, the method retains relatively stable performance with overall small fluctuations across different threshold levels, indicating good robustness and stability.

4.3. Comparative Analysis of Method Performance

The two 2-D methods yield unsatisfactory performance across all texture categories, with near-zero counts of images achieving maximum PSNR values at all threshold levels. This likely stems from their threshold determination mechanism disregarding certain pixels, leading to significant loss of edge and feature information. Interestingly, while the 1-D Kapur method showed similarly poor PSNR results, it outperforms the 2-D Otsu method in the FSIM index, though it remains below the other 2-D method (ACM-Tsallis). This can be attributed to its global threshold selection based on all image pixels, which prevents substantial feature loss and consequently preserves more texture characteristics in some images. Additionally, the ACM-Tsallis method performs better than the two traditional classical algorithms. However, both 1-D Tsallis-entropy-based methods yield non-ideal performance, indicating that non-extensive entropy is not suitable for texture-rich images. The 1-D Otsu method achieved relatively better results for a small subset of images in both PSNR and FSIM. This may reflect better compatibility between the between-class variance maximization criterion and these texture images.
Clearly, traditional 1-D methods show limited performance for these texture-rich images: they fail to account for spatial correlations between pixels and utilize insufficient image information during threshold selection. However, 2-D methods do not perform significantly better, since traditional 2-D histogram-based thresholding ignores pixels with high contrast to their surroundings and is more suitable for low-SNR scenarios. Consequently, its ability to extract texture information is weak, leading to suboptimal segmentation results.
The superior performance of the proposed method underlines the importance of texture information in segmentation tasks. It also confirms that the gradient orientation histogram serves as an effective descriptor of image texture patterns, and strongly validates the necessity of clustering pixels with similar gradient orientations for multi-level image segmentation.
In general, clustering pixels by their gradient orientations is beneficial for recognizing an image in computer vision. For example, a traditional 1-D gray-level histogram cannot distinguish two pixels with the same gray-level value, yet their gradient orientations should be very different if they belong to two different image patterns. To verify the generality of the proposed method, images other than texture-rich ones, such as medical images [60], high-resolution overhead baseball diamond images [61], BSDS500 images [62], and TID2008 images [63], are also included in our experiments. Based on the statistical results of more than 100 arbitrarily selected samples, the proposed method still outperforms the other six methods mentioned above.
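As a minimal sketch of this clustering idea (not the authors' exact implementation), the code below splits the pixels of an image into angular subsets by gradient orientation and then computes an Otsu threshold within each subset. The function names, the central-difference gradient operator, and the interval bounds (borrowed from the j = 2 example of Figure 2) are illustrative assumptions; the paper selects intervals with a variance-based criterion.

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Standard 1-D Otsu: maximize the between-class variance of gray levels."""
    hist, _ = np.histogram(values, bins=nbins, range=(0, nbins))
    p = hist / max(hist.sum(), 1)
    w = np.cumsum(p)                             # class-0 probability
    mu = np.cumsum(p * np.arange(nbins))         # class-0 cumulative mean
    sigma_b = (mu[-1] * w - mu) ** 2 / (w * (1.0 - w) + 1e-12)
    return int(np.argmax(sigma_b))

def cluster_thresholds(img, intervals=((-90, -35), (-34, 34), (35, 90))):
    """Cluster pixels by gradient orientation (degrees), then threshold each
    angular subset separately.  Interval bounds follow the j = 2 example."""
    g = np.asarray(img, dtype=np.float64)
    fy, fx = np.gradient(g)                      # simple central differences
    fx_safe = np.where(fx == 0, 1e-12, fx)
    theta = np.degrees(np.arctan(fy / fx_safe))  # theta(x, y) = arctan(fy / fx)
    out = []
    for lo, hi in intervals:
        mask = (theta >= lo) & (theta <= hi)
        out.append(otsu_threshold(g[mask]) if mask.any() else None)
    return out
```

Each subset's gray-level histogram is analyzed independently, which is the "local" aspect of the method: pixels sharing a common orientation pattern receive their own threshold.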

5. Conclusions

In image segmentation, 2-D thresholding methods extract additional pixel-wise features, thereby overcoming the limitations of 1-D approaches in certain scenarios. This demonstrates that proper utilization of additional image features can effectively improve segmentation quality. However, traditional 2-D entropy thresholding methods may inadvertently discard a significant proportion of pixels, thereby undermining the preservation of nontrivial edge details.
In this paper, we propose a new local thresholding algorithm to further study the importance of texture features in improving multi-level segmentation quality. To evaluate its performance, we compare our method with the Kapur method, the Otsu method, and the Tsallis-entropy-based methods, assessing segmentation results with the PSNR and FSIM indices. Experimental results demonstrate that our algorithm can precisely identify common patterns in texture-rich images. Specifically, when segmenting four texture categories with distinct characteristics from the DTD dataset, our method achieves significantly higher segmentation quality than the other algorithms. This advantage stems from clustering pixels with similar gray-level gradient orientations before segmentation, which helps the algorithm analyze the feature distribution patterns in images. Additionally, the segmentation quality remains stable as the threshold level increases. Our method also shows superior performance in the FSIM index, proving that the gradient orientation histogram effectively captures texture information across different images: it can identify diverse texture patterns and periodic structures, preserving complex texture features in the segmentation results. This performance remains stable even as threshold levels increase, highlighting strong scalability. Notably, the improved performance comes at negligible computational cost, demonstrating potential for real-time applications.
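For reference, the PSNR index used in the assessments [55] follows the standard definition; the sketch below is a generic implementation of that definition, not the exact evaluation script used in the experiments.

```python
import numpy as np

def psnr(original, segmented, peak=255.0):
    """Peak signal-to-noise ratio (dB) between an image and its multi-level
    segmented version; higher values indicate less distortion."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(segmented, dtype=np.float64)
    mse = np.mean((a - b) ** 2)          # mean squared error
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```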
A multi-level image segmentation technique plays an important role in image preprocessing and further affects the performance of deep-learning-based image processing tasks. Therefore, a threshold segmentation method with superior performance and stability can greatly benefit these related tasks. In this study, the common features of images from the same category are effectively recognized by gradient orientation clustering, and the quality of multi-level thresholding is substantially improved. Nevertheless, this paper only explores the relationship between gradients and image patterns from a fundamental perspective. For pattern recognition tasks in more complex scenarios, it is necessary to incorporate more modern techniques such as deep learning. Therefore, in future work, it is of interest to combine the present method with a suitable deep learning model and examine its effectiveness across general image categories.

Author Contributions

Conceptualization, C.O. and L.D.; methodology, C.O.; software, L.D.; validation, L.D. and K.Z.; formal analysis, C.O. and L.D.; investigation, L.D. and S.Z.; resources, M.H.; data curation, L.D.; writing—original draft preparation, L.D.; writing—review and editing, C.O.; visualization, L.D.; supervision, C.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 11775084), the Program for Prominent Talents in Fujian Province, and the Scientific Research Foundation for the Returned Overseas Chinese Scholars.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank http://www.robots.ox.ac.uk/vgg/data/dtd/ (accessed on 4 August 2024) for providing source images.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gharehchopogh, F.S.; Ibrikci, T. An improved African vultures optimization algorithm using different fitness functions for multi-level thresholding image segmentation. Multimed. Tools Appl. 2024, 83, 16929–16975. [Google Scholar] [CrossRef]
  2. Su, H.; Zhao, D.; Elmannai, H.; Heidari, A.A.; Bourouis, S.; Wu, Z.; Cai, Z.; Gui, W.; Chen, M. Multilevel threshold image segmentation for COVID-19 chest radiography: A framework using horizontal and vertical multiverse optimization. Comput. Biol. Med. 2022, 146, 105618. [Google Scholar] [CrossRef]
  3. Houssein, E.H.; Abdelkareem, D.A.; Emam, M.M.; Hameed, M.A.; Younan, M. An efficient image segmentation method for skin cancer imaging using improved golden jackal optimization algorithm. Comput. Biol. Med. 2022, 149, 106075. [Google Scholar] [CrossRef]
  4. Liu, Q.; Li, N.; Jia, H.; Qi, Q.; Abualigah, L. Modified remora optimization algorithm for global optimization and multilevel thresholding image segmentation. Mathematics 2022, 10, 1014. [Google Scholar] [CrossRef]
  5. Zarate, O.; Hinojosa, S.; Ortiz-Joachin, D. Improving Prostate Image Segmentation Based on Equilibrium Optimizer and Cross-Entropy. Appl. Sci. 2024, 14, 9785. [Google Scholar] [CrossRef]
  6. Zhang, K.; He, M.; Dong, L.; Ou, C. The Application of Tsallis Entropy Based Self-Adaptive Algorithm for Multi-Threshold Image Segmentation. Entropy 2024, 26, 777. [Google Scholar] [CrossRef]
  7. Vaiyapuri, T.; Dutta, A.K.; Punithavathi, I.S.H.; Duraipandy, P.; Alotaibi, S.S.; Alsolai, H.; Mohamed, A.; Mahgoub, H. Intelligent Deep-Learning-Enabled Decision-Making Medical System for Pancreatic Tumor Classification on CT Images. Healthcare 2022, 10, 677. [Google Scholar] [CrossRef] [PubMed]
  8. Reshma, G.; Al-Atroshi, C.; Nassa, V.K.; Geetha, B.; Sunitha, G.; Galety, M.G.; Neelakandan, S. Deep Learning-Based Skin Lesion Diagnosis Model Using Dermoscopic Images. Intell. Autom. Soft Comput. 2022, 31, 621–634. [Google Scholar] [CrossRef]
  9. Razmjooy, N.; Arshaghi, A. Application of Multilevel Thresholding and CNN for the Diagnosis of Skin Cancer Utilizing a Multi-Agent Fuzzy Buzzard Algorithm. Biomed. Signal Process. Control 2023, 84, 104984. [Google Scholar] [CrossRef]
  10. Gopalakrishnan, C.; Iyapparaja, M. Multilevel thresholding based follicle detection and classification of polycystic ovary syndrome from the ultrasound images using machine learning. Int. J. Syst. Assur. Eng. Manag. 2021, 1–8. [Google Scholar] [CrossRef]
  11. Kavitha, T.; Mathai, P.P.; Karthikeyan, C.; Ashok, M.; Kohar, R.; Avanija, J.; Neelakandan, S. Deep learning based capsule neural network model for breast cancer diagnosis using mammogram images. Interdiscip. Sci. Comput. Life Sci. 2022, 14, 113–129. [Google Scholar] [CrossRef]
  12. Badhan, M.; Shamsaei, K.; Ebrahimian, H.; Bebis, G.; Lareau, N.P.; Rowell, E. Deep Learning Approach to Improve Spatial Resolution of GOES-17 Wildfire Boundaries Using VIIRS Satellite Data. Remote Sens. 2024, 16, 715. [Google Scholar] [CrossRef]
  13. Ramaswamy, R.K.; Naresh, P.; Nagesh, C.; Balan, S.K. Multilevel thresholding technique with Archery Gold Rush Optimization and PCNN-based childhood medulloblastoma classification using microscopic images. Biomed. Signal Process. Control 2025, 107, 107801. [Google Scholar] [CrossRef]
  14. Kapur, J.N.; Sahoo, P.K.; Wong, A.K. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 1985, 29, 273–285. [Google Scholar] [CrossRef]
  15. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  16. Abutaleb, A.S. Automatic thresholding of gray-level pictures using two-dimensional entropy. Comput. Vis. Graph. Image Process. 1989, 47, 22–32. [Google Scholar] [CrossRef]
  17. Pal, N.R.; Pal, S.K. Entropic thresholding. Signal Process. 1989, 16, 97–108. [Google Scholar] [CrossRef]
  18. Ning, G. Two-dimensional Otsu multi-threshold image segmentation based on hybrid whale optimization algorithm. Multimed. Tools Appl. 2023, 82, 15007–15026. [Google Scholar] [CrossRef]
  19. Sahoo, P.K.; Arora, G. A thresholding method based on two-dimensional Renyi’s entropy. Pattern Recognit. 2004, 37, 1149–1161. [Google Scholar] [CrossRef]
  20. Wang, Q.; Chi, Z.; Zhao, R. Image thresholding by maximizing the index of nonfuzziness of the 2-D grayscale histogram. Comput. Vis. Image Underst. 2002, 85, 100–116. [Google Scholar] [CrossRef]
  21. Naik, M.K.; Panda, R.; Wunnava, A.; Jena, B.; Abraham, A. A leader Harris hawks optimization for 2-D Masi entropy-based multilevel image thresholding. Multimed. Tools Appl. 2021, 80, 35543–35583. [Google Scholar] [CrossRef]
  22. Yimit, A.; Hagihara, Y.; Miyoshi, T.; Hagihara, Y. 2-D direction histogram based entropic thresholding. Neurocomputing 2013, 120, 287–297. [Google Scholar] [CrossRef]
  23. Senthilkumaran, N.; Vaithegi, S. Image segmentation by using thresholding techniques for medical images. Comput. Sci. Eng. Int. J. 2016, 6, 1–13. [Google Scholar] [CrossRef]
  24. Jiang, X.; Mojon, D. Adaptive local thresholding by verification-based multithreshold probing with application to vessel detection in retinal images. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 131–137. [Google Scholar] [CrossRef]
  25. Li, Y.; Li, Z.; Guo, Z.; Siddique, A.; Liu, Y.; Yu, K. Infrared Small Target Detection Based on Adaptive Region Growing Algorithm with Iterative Threshold Analysis. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5003715. [Google Scholar] [CrossRef]
  26. Jardim, S.; António, J.; Mora, C. Image thresholding approaches for medical image segmentation-short literature review. Procedia Comput. Sci. 2023, 219, 1485–1492. [Google Scholar] [CrossRef]
  27. Kandhway, P. A novel adaptive contextual information-based 2D-histogram for image thresholding. Expert Syst. Appl. 2024, 238, 122026. [Google Scholar] [CrossRef]
  28. Vijayalakshmi, D.; Nath, M.K. A strategic approach towards contrast enhancement by two-dimensional histogram equalization based on total variational decomposition. Multimed. Tools Appl. 2023, 82, 19247–19274. [Google Scholar] [CrossRef]
  29. Yang, W.; Cai, L.; Wu, F. Image segmentation based on gray level and local relative entropy two dimensional histogram. PLoS ONE 2020, 15, e0229651. [Google Scholar] [CrossRef]
  30. Liang, J.; Liu, D. A local thresholding approach to flood water delineation using Sentinel-1 SAR imagery. ISPRS J. Photogramm. Remote Sens. 2020, 159, 53–62. [Google Scholar] [CrossRef]
  31. Zhang, M.; Wang, J.; Cao, X.; Xu, X.; Zhou, J.; Chen, H. An integrated global and local thresholding method for segmenting blood vessels in angiography. Heliyon 2024, 10, e38579. [Google Scholar] [CrossRef]
  32. Niu, Y.; Song, J.; Zou, L.; Yan, Z.; Lin, X. Cloud detection method using ground-based sky images based on clear sky library and superpixel local threshold. Renew. Energy 2024, 226, 120452. [Google Scholar] [CrossRef]
  33. Tan, L.; Liu, Y.; Zhou, K.; Zhang, R.; Li, J.; Yan, R. Optimization of DG-LRG Water Extraction Algorithm Considering Polarization and Texture Information. Appl. Sci. 2025, 15, 4434. [Google Scholar] [CrossRef]
  34. Cao, X.; Zuo, M.; Chen, G.; Wu, X.; Wang, P.; Liu, Y. Visual Localization Method for Fastener-Nut Disassembly and Assembly Robot Based on Improved Canny and HOG-SED. Appl. Sci. 2025, 15, 1645. [Google Scholar] [CrossRef]
  35. Hong, X.; Chen, G.; Chen, Y.; Cai, R. Research on Abnormal Ship Brightness Temperature Detection Based on Infrared Image Edge-Enhanced Segmentation Network. Appl. Sci. 2025, 15, 3551. [Google Scholar] [CrossRef]
  36. Chung, C.T.; Ying, J.J.C. Seg-Eigen-CAM: Eigen-Value-Based Visual Explanations for Semantic Segmentation Models. Appl. Sci. 2025, 15, 7562. [Google Scholar] [CrossRef]
  37. Wang, W.; Chen, J.; Hong, Z. Multiscale Eight Direction Descriptor-Based Improved SAR–SIFT Method for Along-Track and Cross-Track SAR Images. Appl. Sci. 2025, 15, 7721. [Google Scholar] [CrossRef]
  38. Hosseini-Fard, E.; Roshandel-Kahoo, A.; Soleimani-Monfared, M.; Khayer, K.; Ahmadi-Fard, A.R. Automatic seismic image segmentation by introducing a novel strategy in histogram of oriented gradients. J. Pet. Sci. Eng. 2022, 209, 109971. [Google Scholar] [CrossRef]
  39. Bhattarai, B.; Subedi, R.; Gaire, R.R.; Vazquez, E.; Stoyanov, D. Histogram of oriented gradients meet deep learning: A novel multi-task deep network for 2D surgical image semantic segmentation. Med. Image Anal. 2023, 85, 102747. [Google Scholar] [CrossRef]
  40. Sun, Z.; Caetano, E.; Pereira, S.; Moutinho, C. Employing histogram of oriented gradient to enhance concrete crack detection performance with classification algorithm and Bayesian optimization. Eng. Fail. Anal. 2023, 150, 107351. [Google Scholar] [CrossRef]
  41. Wang, B.; Si, S.; Zhao, H.; Zhu, H.; Dou, S. False positive reduction in pulmonary nodule classification using 3D texture and edge feature in CT images. Technol. Health Care 2021, 29, 1071–1088. [Google Scholar] [CrossRef] [PubMed]
  42. Wang, Q.; Gao, X.; Wang, F.; Ji, Z.; Hu, X. Feature point matching method based on consistent edge structures for infrared and visible images. Appl. Sci. 2020, 10, 2302. [Google Scholar] [CrossRef]
  43. Zhao, Z.; Wang, F.; You, H. Robust region feature extraction with salient mser and segment distance-weighted gloh for remote sensing image registration. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 2475–2488. [Google Scholar] [CrossRef]
  44. Tao, H.; Lu, X. Smoke vehicle detection based on multi-feature fusion and hidden Markov model. J. Real-Time Image Process. 2020, 17, 745–758. [Google Scholar] [CrossRef]
  45. Liu, Y.; Fan, Y.; Feng, H.; Chen, R.; Bian, M.; Ma, Y.; Yue, J.; Yang, G. Estimating potato above-ground biomass based on vegetation indices and texture features constructed from sensitive bands of UAV hyperspectral imagery. Comput. Electron. Agric. 2024, 220, 108918. [Google Scholar] [CrossRef]
  46. Chai, X.; Song, S.; Gan, Z.; Long, G.; Tian, Y.; He, X. CSENMT: A deep image compressed sensing encryption network via multi-color space and texture feature. Expert Syst. Appl. 2024, 241, 122562. [Google Scholar] [CrossRef]
  47. Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
  48. Liang, D.; Ding, J.; Zhang, Y. Efficient Multisource Remote Sensing Image Matching Using Dominant Orientation of Gradient. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2194–2205. [Google Scholar] [CrossRef]
  49. Xu, W.; Zhong, S.; Yan, W.L. A New Orientation Estimation Method Based on Rotation Invariant Gradient for Feature Points. IEEE Geosci. Remote Sens. Lett. 2021, 18, 791–795. [Google Scholar] [CrossRef]
  50. Wan, G.; Ye, Z.; Xu, Y.; Huang, R.; Zhou, Y.; Xie, H.; Tong, X. Multimodal Remote Sensing Image Matching Based on Weighted Structure Saliency Feature. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4700816. [Google Scholar] [CrossRef]
  51. Zhang, J.; Hu, J. Image Segmentation Based on 2D Otsu Method with Histogram Analysis. In Proceedings of the 2008 International Conference on Computer Science and Software Engineering, Wuhan, China, 12–14 December 2008; Volume 6, pp. 105–108. [Google Scholar]
  52. Du, Y.; Yuan, H.; Jia, K.; Li, F. Research on Threshold Segmentation Method of Two-Dimensional Otsu Image Based on Improved Sparrow Search Algorithm. IEEE Access 2023, 11, 70459–70469. [Google Scholar] [CrossRef]
  53. Cimpoi, M.; Maji, S.; Kokkinos, I.; Mohamed, S.; Vedaldi, A. Describing Textures in the Wild. arXiv 2013, arXiv:1311.3618. [Google Scholar] [CrossRef]
  54. Available online: https://www.robots.ox.ac.uk/~vgg/data/dtd/ (accessed on 4 August 2024).
  55. Horé, A.; Ziou, D. Image Quality Metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar]
  56. Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A Feature Similarity Index for Image Quality Assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386. [Google Scholar] [CrossRef]
  57. Seyyedabbasi, A. A Hybrid Multi-Strategy Optimization Metaheuristic Algorithm for Multi-Level Thresholding Color Image Segmentation. Appl. Sci. 2025, 15, 7255. [Google Scholar] [CrossRef]
  58. Liao, P.S.; Chen, T.S.; Chung, P. A Fast Algorithm for Multilevel Thresholding. J. Inf. Sci. Eng. 2001, 17, 713–727. [Google Scholar]
  59. Chouksey, M.; Jha, R.K. A joint entropy for image segmentation based on quasi opposite multiverse optimization. Multimed. Tools Appl. 2021, 80, 10037–10074. [Google Scholar] [CrossRef]
  60. Codella, N.C.F.; Gutman, D.; Celebi, M.E.; Helba, B.; Marchetti, M.A.; Dusza, S.W.; Kalloo, A.; Liopyris, K.; Mishra, N.; Kittler, H.; et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; pp. 168–172. [Google Scholar] [CrossRef]
  61. Yang, Y.; Newsam, S. Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’10, New York, NY, USA, 2–5 November 2010; pp. 270–279. [Google Scholar] [CrossRef]
  62. Arbeláez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour Detection and Hierarchical Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 898–916. [Google Scholar] [CrossRef]
  63. Ponomarenko, N.; Lukin, V.; Zelensky, A.; Egiazarian, K.; Carli, M.; Battisti, F. TID2008-a database for evaluation of full-reference visual quality assessment metrics. Adv. Mod. Radioelectron. 2009, 10, 30–45. [Google Scholar]
Figure 1. (a) The original image of ‘board’, (b) the gray-level histogram of (a), (c) the 2-D gradient orientation histogram of (a), and (d) the 1-D gradient orientation histogram of (a).
Figure 2. Clustering results of the ‘board’ image when j = 2: (a) pixels of Figure 1a when −90 ≤ k ≤ −35, (d) pixels of Figure 1a when −34 ≤ k ≤ 34, (g) pixels of Figure 1a when 35 ≤ k ≤ 90. (b,e,h) are the corresponding gray-level histograms of (a,d,g), respectively. (c,f,i) are the corresponding 1-D gradient orientation histograms of (a,d,g), respectively.
Figure 3. The flow chart of local multi-thresholding based on the 2-D gradient orientation histogram.
Figure 4. Sample images from the MATLAB image library: (a) ‘board’, (b) ‘bag’, and (c) ‘tire’.
Figure 5. Example images from 4 categories of the DTD image dataset.
Figure 6. The two-level thresholding results of Figure 4 by using 7 different algorithms.
Figure 7. The two-level thresholding results of the typical images from Figure 5 by using 7 different algorithms.
Table 1. Summary of methods utilizing gradient orientation in recent works.
| Reference | Definition of Gradient Orientation | Usage of Gradient Orientation |
|---|---|---|
| [22] | θ(x, y) = arctan(f_x / f_y) | Constructs the orientation histogram h_Θ(x, y) = Σ_{j,i} ω(x + i, y + j) · δ[Θ, θ(x + i, y + j)] for image segmentation. |
| [38] | θ(x, y) = arctan((I ∗ d_y) / (I ∗ d_x)), where I is an image window, d_x = [−1, 0, 1] and d_y = [−1, 0, 1]^T | Calculates the histogram vector v_t = {b_t^0, b_t^1, …, b_t^{N−1}} to interpret geological objects in seismic images. |
| [40] | θ(x, y) = arctan(f_y / f_x) | Yields the histogram of oriented gradients (HOG) to enhance crack detection. |
| Proposed | θ(x, y) = arctan(f_y / f_x) | Constructs the gradient orientation histogram h(m, n) = Count{f(x, y) = m and θ(x, y) = n}, which is used to cluster the pixels. |
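As an illustration of the proposed definition in the last row of Table 1, a sketch of building h(m, n) might look as follows. The central-difference gradient operator and the 1° orientation binning over (−90°, 90°) are assumptions for illustration; the paper's exact discretization may differ.

```python
import numpy as np

def gradient_orientation_histogram(img):
    """2-D histogram h(m, n): count of pixels with gray level m and
    (rounded) gradient orientation n degrees, per h(m, n) =
    Count{f(x, y) = m and theta(x, y) = n}."""
    g = np.asarray(img, dtype=np.float64)
    fy, fx = np.gradient(g)                       # central differences
    fx_safe = np.where(fx == 0, 1e-12, fx)        # avoid division by zero
    theta = np.degrees(np.arctan(fy / fx_safe))   # arctan(fy / fx), in (-90, 90)
    m = np.clip(g, 0, 255).astype(int)            # gray level index m
    n = np.clip(np.rint(theta).astype(int), -90, 90) + 90  # orientation bin 0..180
    h = np.zeros((256, 181), dtype=np.int64)
    np.add.at(h, (m.ravel(), n.ravel()), 1)       # accumulate counts at (m, n)
    return h
```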
Table 2. PSNR and FSIM values with different j at threshold level 2.
(Each cell gives PSNR/FSIM.)

| j | Board | Bag | Tire | Woven_0118 | Lacelike_0053 | Pitted_0009 | Blotchy_0085 |
|---|---|---|---|---|---|---|---|
| 2 | 20.5403/0.8654 | 21.7597/0.8037 | 22.1534/0.7417 | 24.0976/0.7335 | 27.1553/0.8219 | 22.9091/0.8141 | 24.6104/0.8408 |
| 3 | 20.5560/0.8755 | 21.7905/0.8198 | 22.1416/0.7381 | 24.1442/0.7331 | 27.1465/0.8197 | 22.9227/0.8236 | 24.6096/0.8393 |
| 4 | 20.5724/0.8817 | 21.7939/0.8200 | 22.1688/0.7423 | 24.1498/0.7366 | 27.1712/0.8299 | 22.9255/0.8271 | 24.6108/0.8400 |
| 5 | 20.5847/0.8828 | 21.8023/0.8210 | 22.1710/0.7438 | 24.1769/0.7420 | 27.1732/0.8292 | 22.9509/0.8372 | 24.6119/0.8409 |
Table 3. Comparisons of PSNR and FSIM values for multi-level thresholding on images of Figure 4 by using 7 different algorithms.
(Each cell gives PSNR/FSIM.)

| Image | Thresholds | Kapur (1-D) | Otsu (1-D) | Otsu (2-D) | QOMVO (1-D) | Adaptive (1-D) | ACM-Tsallis (2-D) | Proposed |
|---|---|---|---|---|---|---|---|---|
| board | 2 | 19.8486/0.8417 | 20.5365/0.8594 | 20.1805/0.8603 | 20.0574/0.8462 | 20.2337/0.8499 | 20.3342/0.8559 | 20.5403/0.8654 |
| board | 3 | 22.0879/0.8925 | 23.0899/0.9254 | 20.9382/0.8703 | 22.2054/0.8947 | 22.4471/0.9001 | 22.5171/0.9196 | 23.0959/0.9304 |
| board | 4 | 24.1754/0.9367 | 25.1001/0.9489 | 21.0960/0.8721 | 24.5454/0.9419 | 24.6850/0.9429 | 24.7531/0.9461 | 25.1076/0.9532 |
| board | 5 | 25.9686/0.9539 | 26.7874/0.9635 | 21.0735/0.8718 | 26.1119/0.9548 | 26.1548/0.9552 | 26.3624/0.9592 | 26.7947/0.9665 |
| bag | 2 | 21.0917/0.7064 | 21.7318/0.7291 | 21.6042/0.6936 | 21.0939/0.7054 | 21.1872/0.7132 | 21.6345/0.7457 | 21.7597/0.8037 |
| bag | 3 | 23.9386/0.8154 | 24.1852/0.8388 | 22.1903/0.7096 | 23.9386/0.8154 | 23.9274/0.8146 | 24.0522/0.8301 | 24.2273/0.8677 |
| bag | 4 | 25.8303/0.8759 | 25.9751/0.8839 | 22.3528/0.7113 | 25.8304/0.8759 | 25.8208/0.8797 | 25.9087/0.8853 | 26.0317/0.9124 |
| bag | 5 | 27.2505/0.9106 | 27.5939/0.9154 | 22.3593/0.7113 | 27.3375/0.9114 | 27.2353/0.9101 | 27.4899/0.9119 | 27.6188/0.9226 |
| tire | 2 | 20.9682/0.6295 | 22.0913/0.6762 | 21.0869/0.6406 | 20.9682/0.6295 | 20.9206/0.6290 | 21.0608/0.6408 | 22.1534/0.7417 |
| tire | 3 | 24.7329/0.7427 | 24.7887/0.7439 | 24.2287/0.7270 | 24.7018/0.7433 | 24.6246/0.7421 | 24.6331/0.7433 | 24.8545/0.8013 |
| tire | 4 | 26.5611/0.7991 | 26.6432/0.8020 | 26.2494/0.7919 | 26.5435/0.7977 | 26.4583/0.7963 | 26.4832/0.7959 | 26.6680/0.8362 |
| tire | 5 | 27.4821/0.8282 | 28.2698/0.8414 | 26.6968/0.8075 | 27.4828/0.8289 | 27.4711/0.8282 | 27.6303/0.8296 | 28.3058/0.8686 |
Table 4. Computation time (seconds) of two-level thresholding of Figure 4 by using different algorithms.
| Image | Kapur (1-D) | Otsu (1-D) | Otsu (2-D) | Proposed |
|---|---|---|---|---|
| board | 0.32 | 0.27 | 147,103.22 | 2.35 |
| bag | 0.32 | 0.27 | 135,890.45 | 1.25 |
| tire | 0.32 | 0.27 | 142,732.34 | 1.22 |
Table 5. The optimal PSNR/FSIM distributions among 7 algorithms by using a DTD image dataset at a given number of thresholds.
ImageThresholdsKapur (1-D)Otsu (1-D)Otsu (2-D)QOMVO (1-D)Adaptive (1-D)ACM-Tsallis (2-D)Proposed
PSNRFSIMPSNRFSIMPSNRFSIMPSNRFSIMPSNRFSIMPSNRFSIMPSNRFSIM
woven
(81 images)
200260700040138151
300010002020228154
402610001040187555
503120001020218053
lacelike
(119 images)
2031405000202611979
3011500000303111879
40316600000302610381
50211100010201711886
pitted
(107 images)
207101002070403710446
3045700000102310373
40261400020102710564
50312110002000269968
blotchy
(119 images)
202151024030007113100
30113120102010911299
4038901010101311592
50114800000001710995

Share and Cite

Dong, L.; Zhang, K.; He, M.; Zhong, S.; Ou, C. A Local Thresholding Algorithm for Image Segmentation by Using Gradient Orientation Histogram. Appl. Sci. 2025, 15, 9808. https://doi.org/10.3390/app15179808
