Normalized-Gradient-Entropy-Guided Dynamic-Window Error Function Fitting for Subpixel Edge Localization in Monocular Distance Measurement

Liu, Yuhao; Pan, Yuzhe; Zhang, Hui

doi:10.3390/app16125843


by Yuhao Liu, Yuzhe Pan and Hui Zhang *

School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(12), 5843; https://doi.org/10.3390/app16125843 (registering DOI)

Submission received: 19 May 2026 / Revised: 4 June 2026 / Accepted: 8 June 2026 / Published: 10 June 2026

(This article belongs to the Section Computing and Artificial Intelligence)


Featured Application

The proposed method can be used in low-cost monocular visual measurement systems that require accurate and interpretable subpixel localization of artificial targets or geometric edges.

Abstract

Monocular distance measurement estimates the target distance from the apparent size of a known target, and its accuracy depends strongly on edge localization accuracy. Conventional pixel-level edge detectors provide only integer-pixel positions, which may introduce considerable errors when the target is small or far from the camera. To improve subpixel localization accuracy for blurred edges and to overcome the limited adaptability of fixed sampling windows, this study proposes a normalized-gradient-entropy-guided dynamic-window error-function (ERF) fitting method. An ERF model is used to describe the gray-level transition of a Gaussian-blurred step edge, and gray-level samples are collected along the local gradient direction of each Canny edge candidate. Normalized gradient entropy is introduced to characterize the local gradient distribution and adaptively select 5-, 7-, 9-, or 11-point fitting windows. Synthetic experiments show that the proposed four-parameter dynamic ERF method achieves the lowest overall RMSE and MAE among the compared localization methods, namely 0.163 pixel and 0.124 pixel, respectively. Real monocular distance-measurement experiments show that the proposed method achieves the lowest mean absolute error (0.976 cm) and the lowest mean relative error (0.504%), demonstrating improved target-edge size extraction and ranging stability.

Keywords:

monocular distance measurement; subpixel edge localization; error function; normalized gradient entropy; adaptive sampling window; ERF fitting; industrial vision measurement

1. Introduction

Monocular distance measurement has been widely used in computer vision and industrial metrology because of its low cost and simple system structure. Typical applications include robot navigation, industrial measurement, and intelligent transportation. This type of method estimates the target distance by analyzing two-dimensional images acquired by a monocular camera. In many target-based ranging systems, the distance is calculated from the number of pixels occupied by a target of known physical size. Therefore, edge detection is a critical step in estimating the target image size. Conventional edge detection algorithms can be broadly classified into gradient-based methods, such as the Canny and Sobel operators [1,2]; statistical methods, such as SUSAN [3]; and structured or learning-based edge detection methods [2,4]. However, ranging methods that rely on pixel-level edge detection still have intrinsic limitations. Pixel-level edge detectors can only localize edges at integer-pixel positions. When the target is small or located far from the camera, the target occupies only a limited number of pixels, and even a small pixel-level localization error can be amplified into a noticeable distance-estimation error. This limits the achievable accuracy and stability of monocular distance measurement in certain industrial measurement scenarios.

To overcome the localization limit of pixel-level edge detection, subpixel edge localization has been widely adopted in high-precision visual measurement tasks, including camera calibration, dimensional measurement, and industrial inspection [5]. In recent visual measurement studies, Li et al. introduced an improved subpixel edge localization method for valve opening-area measurement and improved boundary extraction and area-measurement accuracy, demonstrating the continued importance of subpixel edge localization in visual metrology [6]. Recent industrial visual measurement studies have also used subpixel boundary or line-segment information for ring-diameter measurement and shaft-part dimensional measurement [7,8]. Li et al. further proposed a subpixel anomaly-prediction method for precision workpiece edge-defect and dimensional inspection, indicating that stable subpixel edge information remains important for precision industrial inspection and dimensional measurement [9]. Existing subpixel edge localization methods are usually divided into interpolation-based, fitting-based, and moment-based approaches [10]. Interpolation-based methods estimate edge-peak positions using local gray-level or gradient-response interpolation and are relatively simple to implement. Fitting-based methods establish a continuous edge model and estimate subpixel position parameters through optimization. Moment-based methods estimate edge position and orientation from local image moments. Among these categories, interpolation-based methods are sensitive to neighborhood noise and gradient perturbation; moment-based and certain analytical models usually rely on ideal step-edge assumptions or local geometric constraints and may suffer from model mismatch under blurred edges, asymmetric gray-level transitions, or complex backgrounds. In contrast, fitting-based methods can describe gray-level transitions through suitable continuous edge models and therefore are well suited to blurred-edge localization and high-precision measurement.

Tabatabai and Mitchell proposed a gray-level moment method that estimates edge position from the first three moments of the edge gray-level sequence and has the advantages of simple computation and high efficiency for near-sharp edges [11]. Nalwa and Binford established an early fitting-based edge detection framework using a hyperbolic tangent model [12]. Hagara and Kulla approximated the blurred edge profile with an error function and obtained subpixel edge positions through approximation-based ERF fitting, showing that the error-function model can be used to approximate continuous gray-level transitions of blurred edges [13]. Later, Zernike-moment-based methods [14] and partial-area-effect methods [15] further improved subpixel edge localization accuracy. Nevertheless, moment-based methods usually depend on ideal step-edge or local geometric assumptions, and their model adaptability may decrease when edges are affected by optical blur, defocus, noise, or asymmetric transitions. Fitting-based methods typically require parameter optimization and have higher computational costs, but they can describe gray-level transitions using physically consistent edge models, making them suitable for industrial imaging scenarios involving optical blur. The ERF fitting used in this paper is not intended to replace all moment-based methods. Instead, it targets blurred edges commonly observed in monocular distance measurement and uses the physical relationship that a step edge convolved with a Gaussian point-spread function forms an ERF-like gray-level profile, thereby improving the match between the model and the measured edge transition. For clarity, the baseline ERF model used in this study follows the blurred-edge model, which can be written as

$$f(x) = \frac{k}{2}\left[\operatorname{erf}\left(\frac{x - l}{\sqrt{2}\,\sigma}\right) + 1\right] + h, \tag{1}$$

where $h$ is the background intensity, $k$ is the edge contrast, $l$ denotes the subpixel edge location, and $\sigma$ is the edge-blurring parameter. The error function is defined as

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^{2}}\,dt. \tag{2}$$

This model describes the gray-level transition of a blurred step edge and provides the basis for estimating the continuous subpixel edge position from discrete gray-level samples.

Conventional ERF-based and fitting-based methods usually extract a one-dimensional gray-level profile along the edge-normal direction and perform fitting within a manually fixed or empirically selected sampling range. In contrast, the main novelty of this study lies in the normalized-gradient-entropy-guided adaptive selection of the ERF fitting window. In the proposed method, the gray-level samples are still collected along the local edge-normal direction, but the sampling-window size is no longer fixed. Instead, the normalized gradient entropy is used to quantify the local gradient dispersion and determine the appropriate sampling-window size. In this way, conventional ERF fitting is extended from a manually selected sampling strategy to an entropy-guided adaptive-window framework for subpixel edge localization.

The main contributions of this paper are as follows:

1. The ERF gray-level transition model is used as the baseline fitting model for blurred step edges in monocular distance measurement, and its relationship with Gaussian point-spread-function-based imaging is clarified, providing a physical basis for subpixel edge localization.
2. A normalized-gradient-entropy-guided dynamic-window ERF fitting method is proposed. The normalized gradient entropy is used to quantify the spatial dispersion of the local edge-gradient distribution, thereby enabling the ERF fitting window to adaptively switch among 5-, 7-, 9-, and 11-point windows according to the local edge-transition width.
3. Two ERF fitting schemes, namely a lightweight two-parameter version and an enhanced four-parameter version, are constructed. The effectiveness of the dynamic-window strategy is verified through fixed-window ablation experiments, and the stability of the entropy thresholds is further evaluated through threshold sensitivity analysis.
4. The localization accuracy, robustness, and practical applicability of the proposed method are evaluated using multiple synthetic edge scenarios and real monocular distance-measurement experiments, and the results are compared with representative pixel-level, interpolation-based, moment-based, and fitting-based methods.

2. Edge Gray-Level Distribution Model and Applicability of the ERF Function

2.1. Optical Imaging Model of Edge Gray-Level Transitions

In the ideal case, the gray-level distribution of an edge can be represented by a step function, as shown in Equation (3):

$$I_{0}(x) = \begin{cases} I_{\min}, & x < x_{e}, \\ I_{\max}, & x \geq x_{e}, \end{cases} \tag{3}$$

where $I_{0}(x)$ denotes the ideal gray-level function, $x_{e}$ is the ideal edge location, and $I_{\min}$ and $I_{\max}$ are the gray levels on the two sides of the edge. An ideal step edge is illustrated in Figure 1a.

In a real optical imaging system, however, the optical transfer process is not ideal. The intensity recorded by the sensor is not determined only by the ideal signal at the same spatial position, but by the combined contributions of signals from neighboring positions after spatial diffusion and smoothing. As a result, the edge transition in an actual image is gradual rather than discontinuous. Figure 1b shows a local real edge profile captured from a checkerboard target.

Assume that the ideal intensity at a spatial position $\xi$ is $I_{0}(\xi)$. Owing to the blur of the imaging system, its contribution to position $x$ can be described by a weighting function: the closer $\xi$ is to $x$, the larger the contribution, and the contribution decays gradually as the distance increases. This spatial decay is commonly approximated by a Gaussian point-spread function. Therefore, the observed edge gray-level distribution can be regarded as the convolution of the ideal step edge and the optical point-spread function [16]. If the point-spread function is approximated by a zero-mean Gaussian function, the Gaussian kernel is

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma_{b}} \exp\left(-\frac{x^{2}}{2\sigma_{b}^{2}}\right), \tag{4}$$

where $\sigma_{b}$ is the standard deviation of the Gaussian point-spread function and characterizes the blur scale of the edge transition. The observed gray-level profile can then be written as [17]

$$I(x) = I_{0}(x) * G(x) = \int_{-\infty}^{+\infty} I_{0}(\xi)\,G(x - \xi)\,d\xi. \tag{5}$$

For an ideal step edge, the convolution with a Gaussian point-spread function produces an error-function-shaped gray-level transition. Therefore, the blurred edge profile can be approximately described by

$$I(x) \approx B + A \operatorname{erf}\left(\frac{x - x_{e}}{\sqrt{2}\,\sigma_{b}}\right), \tag{6}$$

where $x_{e}$ is the edge position, $A$ controls the gray-level amplitude, and $B$ denotes the gray-level center. For the step edge of Equation (3) blurred by the kernel of Equation (4), these correspond to $A = (I_{\max} - I_{\min})/2$ and $B = (I_{\max} + I_{\min})/2$. This physical relationship provides the basis for using the ERF model to estimate the subpixel edge position. Under the imaging condition in which a step edge is blurred by an approximately Gaussian point-spread function, the ERF model provides a physically consistent description of the edge gray-level transition. The edge position can be estimated from the inflection point of the fitted profile. An example of gray-level variation and an S-shaped fitted curve is shown in Figure 2.
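The relationship between Equations (3) to (6) can be checked numerically. The following is a minimal sketch, not the authors' code: it approximates the convolution in Equation (5) with a midpoint rule and compares it with the closed-form ERF profile. The function names, the truncation of the integral to plus or minus 8 blur scales, the step count, and the test values (edge at 0.3, gray levels 50 and 200, blur scale 1.2) are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, sigma_b):
    """Zero-mean Gaussian point-spread function, Equation (4)."""
    return math.exp(-x * x / (2.0 * sigma_b ** 2)) / (math.sqrt(2.0 * math.pi) * sigma_b)


def ideal_step(x, x_e, i_min, i_max):
    """Ideal step edge, Equation (3)."""
    return i_min if x < x_e else i_max


def blurred_edge_numeric(x, x_e, i_min, i_max, sigma_b, n_steps=20000):
    """Midpoint-rule approximation of the convolution in Equation (5).

    Assumption of this sketch: the integral is truncated to +/- 8 sigma_b
    around x, where the Gaussian tail is negligible.
    """
    half_range = 8.0 * sigma_b
    d_xi = 2.0 * half_range / n_steps
    total = 0.0
    for j in range(n_steps):
        xi = x - half_range + (j + 0.5) * d_xi
        total += ideal_step(xi, x_e, i_min, i_max) * gaussian_kernel(x - xi, sigma_b)
    return total * d_xi


def blurred_edge_erf(x, x_e, i_min, i_max, sigma_b):
    """Closed form of Equation (6) with A = (i_max - i_min)/2 and B = (i_max + i_min)/2."""
    a = 0.5 * (i_max - i_min)
    b = 0.5 * (i_max + i_min)
    return b + a * math.erf((x - x_e) / (math.sqrt(2.0) * sigma_b))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.3, 1.0, 3.0):
        num = blurred_edge_numeric(x, 0.3, 50.0, 200.0, 1.2)
        ref = blurred_edge_erf(x, 0.3, 50.0, 200.0, 1.2)
        print(f"x={x:+.1f}  numeric={num:8.3f}  erf model={ref:8.3f}")
```

The two columns should agree to within the discretization error of the midpoint rule, which is largest near the edge where the integrand is discontinuous.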

2.2. Common Edge Fitting Functions and Their Applicability

Common edge fitting functions include the hyperbolic tangent function [18], the arctangent function [19], and the Gaussian function [20]. These models fit gray-level or gradient profiles around edge pixels and obtain subpixel edge coordinates by minimizing a least-squares objective. The common fitting models are summarized below.

The hyperbolic tangent model is

$$I(t) = A \tanh\left(\frac{t - \mu}{\sigma}\right) + B, \tag{7}$$

where $t$ is the sampling coordinate along the gradient direction, $A$ is the gray-level amplitude, $B$ is the gray-level center, $\mu$ is the subpixel edge offset, and $\sigma$ is the blur-scale parameter. The hyperbolic tangent function is defined as

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{8}$$

The arctangent model is

$$I(t) = A \arctan\left(\frac{t - \mu}{\sigma}\right) + B. \tag{9}$$

The Gaussian model, which is often used to model the gradient profile, can be expressed as

$$G(t) = A \exp\left[-\frac{(t - \mu)^{2}}{2\sigma^{2}}\right] + B. \tag{10}$$

The hyperbolic tangent function can describe smooth gray-level transitions and is a common model in early fitting-based subpixel localization, but its fixed functional form may lead to model mismatch under noise, asymmetric transitions, or complex backgrounds. The Gaussian model is suitable for fitting bell-shaped gradient profiles and can directly describe the gradient response of a blurred edge, but its localization result is sensitive to the quality of gradient estimation and local noise. The arctangent model can describe relatively gradual gray-level transitions and offers a trade-off between accuracy and complexity, but its parameter estimation remains affected by the local gray-level range and transition shape. Because the ERF profile follows naturally from the Gaussian-blurred step-edge model, it is more physically consistent for the blurred-edge localization problem considered in this work.

3. Normalized-Gradient-Entropy-Guided Dynamic-Window ERF Subpixel Edge Localization

3.1. Localization Pipeline

The proposed subpixel edge localization pipeline consists of four steps:

1. Canny edge detection is first applied to obtain pixel-level candidate edge points.
2. At each candidate point, the local gradient direction is estimated, and normalized gradient entropy is calculated within an initial 11-point window along the gradient direction.
3. According to the mapping between normalized gradient entropy and sampling-window size, the final ERF fitting window is determined.
4. Gray-level samples in the selected window are fitted using either a two-parameter or a four-parameter ERF model, and the subpixel edge position is obtained.

3.2. ERF Model Fitting and Parameter Initialization

The ERF model used for gray-level profile fitting is

$$I(t) = A \operatorname{erf}\left(\frac{t - \mu}{\sqrt{2}\,\sigma}\right) + B, \tag{11}$$

where $I(t)$ is the gray value at sampling coordinate $t$, $A$ is the gray-level amplitude, $B$ is the gray-level center, $\mu$ is the fitted subpixel edge offset along the sampling direction, and $\sigma$ is the blur-scale parameter. The Gaussian error function follows the definition given in Equation (2).

Reasonable initialization is required to improve the convergence stability of nonlinear ERF fitting. For a given dynamic sampling window, let the one-dimensional gray-level sampling sequence along the local gradient direction be $g(t_{i})$, where $t_{i} = -N, \dots, N$. The gray-level amplitude and center are initialized from the local maximum and minimum gray values in the window:

$$A_{0} = \frac{g_{\max} - g_{\min}}{2}, \qquad B_{0} = \frac{g_{\max} + g_{\min}}{2}, \tag{12}$$

where $g_{\max}$ and $g_{\min}$ are the maximum and minimum gray values in the current sampling window. When the sampled gray-level profile decreases along the gradient direction, $A_{0}$ is assigned a negative sign to accommodate both bright-to-dark and dark-to-bright edge transitions.

The initial edge position is estimated from the intersection between the sampled gray-level sequence and the half-amplitude gray level $B_{0}$. If two adjacent samples satisfy

$$[g(t_{i}) - B_{0}]\,[g(t_{i+1}) - B_{0}] \leq 0, \tag{13}$$

then linear interpolation gives

$$\mu_{0} = t_{i} + \frac{B_{0} - g(t_{i})}{g(t_{i+1}) - g(t_{i})}\,(t_{i+1} - t_{i}). \tag{14}$$

If no valid intersection is detected, $\mu_{0} = 0$ is used, i.e., the Canny edge point is taken as the initial edge center. The initial blur-scale parameter is set to

$$\sigma_{0} = \max\left(0.5, \frac{N}{3}\right), \tag{15}$$

where $N$ is the half-window size of the current ERF fitting window. This setting assigns a larger initial blur scale to a larger sampling window, which helps adapt to different gray-level transition widths. To ensure $\sigma > 0$, the optimization is implemented using $\log \sigma$ as the iterative variable.
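The initialization rules in Equations (12) to (15) can be sketched in a few lines. This is an illustrative Python version, not the authors' MATLAB implementation. The function name `init_erf_params` is hypothetical, and two details are assumptions because the text does not specify them: the profile direction is judged from the two end samples, and the first half-level crossing found is used.

```python
import math


def init_erf_params(g):
    """Return (A0, B0, mu0, sigma0) for samples g taken at t = -N, ..., N."""
    n_half = (len(g) - 1) // 2
    t = list(range(-n_half, n_half + 1))

    g_max, g_min = max(g), min(g)
    a0 = 0.5 * (g_max - g_min)          # Equation (12), amplitude
    b0 = 0.5 * (g_max + g_min)          # Equation (12), center

    # Assumption: a decreasing profile is detected from its end samples.
    if g[-1] < g[0]:
        a0 = -a0

    mu0 = 0.0                           # fallback: Canny point as edge center
    for i in range(len(g) - 1):
        crosses = (g[i] - b0) * (g[i + 1] - b0) <= 0.0   # Equation (13)
        if crosses and g[i + 1] != g[i]:
            # Equation (14): linear interpolation of the half-level crossing.
            mu0 = t[i] + (b0 - g[i]) / (g[i + 1] - g[i]) * (t[i + 1] - t[i])
            break                       # assumption: use the first crossing

    sigma0 = max(0.5, n_half / 3.0)     # Equation (15)
    return a0, b0, mu0, sigma0


if __name__ == "__main__":
    # Hypothetical noise-free 11-point profile with the edge at t = 0.3.
    samples = [125.0 + 75.0 * math.erf((ti - 0.3) / (math.sqrt(2.0) * 1.2))
               for ti in range(-5, 6)]
    print(init_erf_params(samples))
```

An optimizer that iterates on $\log \sigma$ would start from `math.log(sigma0)`, which keeps the blur scale positive without an explicit constraint.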

The built-in erf function in MATLAB R2024b (MathWorks, Natick, MA, USA) was used to compute the model gray values, and numerical optimization was applied to minimize the sum of squared residuals between the sampled gray values and the model. Since the numerical accuracy of the error function is far higher than that required for subpixel localization in this study, its implementation error is not discussed separately. The least-squares objective is

$$E(\theta) = \sum_{i=1}^{2N+1} \left[I(t_{i}; \theta) - g(t_{i})\right]^{2}, \tag{16}$$

where $g(t_{i})$ is the measured gray value at the $i$th sampling point and $\theta$ denotes the set of parameters to be optimized. The optimized $\mu$ is the subpixel offset along the local gradient direction. By projecting this offset onto the image coordinate system, the subpixel edge coordinate is obtained.
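The projection step can be illustrated as follows. This sketch assumes that the sampling coordinate is measured in pixels along the unit gradient vector and that image coordinates are (x, y) = (column, row); neither convention is stated explicitly in the text, and the function name `subpixel_coordinate` is hypothetical.

```python
import math


def subpixel_coordinate(x0, y0, gx, gy, mu):
    """Shift the Canny point (x0, y0) by mu pixels along the gradient (gx, gy)."""
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        # Direction undefined: keep the pixel-level position.
        return float(x0), float(y0)
    return x0 + mu * gx / norm, y0 + mu * gy / norm


if __name__ == "__main__":
    # A 0.5-pixel offset along the direction (3, 4)/5.
    print(subpixel_coordinate(10, 20, 3.0, 4.0, 0.5))
```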

To balance efficiency and accuracy, this paper constructs two ERF fitting forms. In the two-parameter ERF form, $A$ and $B$ are initialized from local gray-level extrema and kept fixed, and only $\mu$ and $\sigma$ are optimized. In the four-parameter ERF form, $A$, $B$, $\mu$, and $\sigma$ are all optimized. The subsequent experiments compare these two forms to analyze the trade-off between localization accuracy and computational efficiency.
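The difference between the two fitting forms can be made concrete with a small sketch. The paper does not name the numerical optimizer used in MATLAB, so the code below substitutes a deliberately simple technique: a coarse-to-fine grid search over $(\mu, \log \sigma)$ that minimizes Equation (16). In the four-parameter form it solves for $A$ and $B$ in closed form at each grid point, which is possible because the model is linear in these two parameters. The search bounds, grid size, and number of refinement rounds are assumptions, and `fit_erf` is a hypothetical name.

```python
import math


def erf_basis(t, mu, sigma):
    """erf((t - mu) / (sqrt(2) * sigma)) at every sampling coordinate."""
    return [math.erf((ti - mu) / (math.sqrt(2.0) * sigma)) for ti in t]


def solve_amplitude_center(e, g):
    """Closed-form linear least squares for A and B in g ~ A * e + B."""
    n = len(g)
    s_e, s_g = sum(e), sum(g)
    s_ee = sum(ei * ei for ei in e)
    s_eg = sum(ei * gi for ei, gi in zip(e, g))
    det = n * s_ee - s_e * s_e
    if abs(det) < 1e-12:               # degenerate basis: fall back to a flat model
        return 0.0, s_g / n
    a = (n * s_eg - s_e * s_g) / det
    return a, (s_g - a * s_e) / n


def fit_erf(g, a0, b0, four_param=True, rounds=18, grid=11):
    """Minimize Equation (16) by coarse-to-fine search over (mu, log sigma)."""
    n_half = (len(g) - 1) // 2
    t = list(range(-n_half, n_half + 1))
    c_mu, h_mu = 0.0, float(n_half)                    # assumed bounds: mu in [-N, N]
    lo, hi = math.log(0.2), math.log(2.0 * n_half)     # assumed bounds on sigma
    c_ls, h_ls = 0.5 * (lo + hi), 0.5 * (hi - lo)
    best = None
    for _ in range(rounds):
        for i in range(grid):
            mu = c_mu - h_mu + 2.0 * h_mu * i / (grid - 1)
            for j in range(grid):
                sigma = math.exp(c_ls - h_ls + 2.0 * h_ls * j / (grid - 1))
                e = erf_basis(t, mu, sigma)
                a, b = solve_amplitude_center(e, g) if four_param else (a0, b0)
                err = sum((a * ei + b - gi) ** 2 for ei, gi in zip(e, g))
                if best is None or err < best[0]:
                    best = (err, a, b, mu, sigma)
        # Re-center on the best point and shrink the search box.
        c_mu, c_ls = best[3], math.log(best[4])
        h_mu *= 0.4
        h_ls *= 0.4
    return {"sse": best[0], "A": best[1], "B": best[2], "mu": best[3], "sigma": best[4]}


if __name__ == "__main__":
    # Hypothetical noise-free profile: A = 75, B = 125, mu = 0.3, sigma = 1.2.
    g = [125.0 + 75.0 * math.erf((ti - 0.3) / (math.sqrt(2.0) * 1.2))
         for ti in range(-5, 6)]
    print(fit_erf(g, 75.0, 125.0, four_param=False))
    print(fit_erf(g, 75.0, 125.0, four_param=True))
```

Searching in $\log \sigma$ mirrors the positivity device described above. A gradient-based least-squares solver would normally be faster; the grid search is used here only because it is easy to follow and has no external dependencies.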

3.3. Normalized-Gradient-Entropy-Guided Adaptive Sampling-Window Selection

The pixel width of the edge gray-level transition region is positively correlated with image blur. A more blurred edge occupies a wider transition region from foreground to background, whereas a sharper edge has a more concentrated gray-level change and can be represented with fewer samples. Therefore, a fixed sampling window cannot simultaneously achieve high accuracy and high efficiency under different blur levels. If the sampling window is too small, the full transition of a blurred edge may not be covered, causing ERF fitting bias. If the sampling window is too large, redundant background and noise samples may be introduced for sharp edges, increasing computational cost and reducing fitting stability.

To adaptively select the sampling window, the local blur feature of the edge region must be quantified. In real images, however, the optical blur parameter is usually unavailable. Since edge blur changes the distribution pattern of local gradient magnitudes, this paper introduces normalized gray-gradient entropy as a quantitative descriptor of the local gradient distribution [21]. For simplicity, it is referred to as normalized gradient entropy hereafter. For a sharp edge, gradient magnitudes are concentrated near the edge center, and the normalized gradient distribution has a peaked shape, resulting in low entropy. For a blurred edge, the gray-level transition region becomes wider, and gradient magnitudes spread over a wider sampling range, resulting in higher entropy. Thus, normalized gradient entropy can serve as the basis for adaptive ERF fitting-window selection.

Specifically, at each pixel-level edge point obtained by the Canny operator, an initial one-dimensional gray-level sampling sequence is constructed along the local gradient direction. The initial half-window size is set to $N_{0} = 5$, i.e., $m = 2N_{0} + 1 = 11$ candidate samples are collected. This initial window is used only for normalized gradient entropy calculation and is not necessarily the final ERF fitting window. The use of 11 samples is not based on the assumption that an 11-point window is always optimal; rather, it is consistent with the 5/7/9/11 candidate fitting windows and covers the main edge transition regions under the measurement conditions considered in this paper while avoiding excessive disturbance from texture or non-ideal edges.
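Collecting the 11 candidate samples along the gradient direction requires gray values at non-integer positions. The text does not state which interpolation scheme is used, so the sketch below assumes bilinear interpolation, unit sample spacing, and a row-major image indexed as `img[row][col]`; the function names are hypothetical.

```python
import math


def bilinear(img, x, y):
    """Bilinear interpolation of img[row][col] at the point (x, y) = (col, row)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)       # clamp to the image (simple border handling)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1.0 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1.0 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1.0 - fy) * top + fy * bottom


def sample_profile(img, x0, y0, gx, gy, n_half=5):
    """Gray levels at t = -n_half, ..., n_half along the unit gradient direction."""
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        raise ValueError("zero gradient: sampling direction is undefined")
    ux, uy = gx / norm, gy / norm
    return [bilinear(img, x0 + t * ux, y0 + t * uy)
            for t in range(-n_half, n_half + 1)]


if __name__ == "__main__":
    # Hypothetical horizontal ramp image: gray level equals the column index.
    ramp = [[float(c) for c in range(20)] for _ in range(20)]
    print(sample_profile(ramp, 10, 10, 1.0, 0.0))
```

With `n_half=5` this yields the initial 11-point sequence; the same routine with a smaller `n_half` would give the 5-, 7-, or 9-point fitting windows.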

Let $G_{i}$ be the gradient magnitude of the $i$th sampling point. Its normalized probability is defined as

$$p_{i} = \frac{G_{i} + \varepsilon}{\sum_{j=1}^{m} (G_{j} + \varepsilon)}, \tag{17}$$

where $\varepsilon$ is a small constant used to avoid numerical instability. The normalized gray-gradient entropy is then defined as

$$H_{n} = -\frac{1}{\log m} \sum_{i=1}^{m} p_{i} \log p_{i}, \tag{18}$$

where $H_{n} \in [0, 1]$.
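Equations (17) and (18) translate directly into code. In the sketch below, the gradient magnitudes are obtained by finite differences of the one-dimensional profile, which is an assumption: the text does not say whether the gradient is taken from the sampled profile or from a two-dimensional gradient operator. The value of `eps` and the function names are also illustrative.

```python
import math


def gradient_magnitudes(g):
    """Finite-difference gradient magnitudes of a 1-D profile (assumed scheme)."""
    m = len(g)
    grads = []
    for i in range(m):
        if i == 0:
            d = g[1] - g[0]                   # one-sided at the left end
        elif i == m - 1:
            d = g[m - 1] - g[m - 2]           # one-sided at the right end
        else:
            d = 0.5 * (g[i + 1] - g[i - 1])   # central difference inside
        grads.append(abs(d))
    return grads


def normalized_gradient_entropy(grads, eps=1e-6):
    """Normalized gradient entropy H_n of Equations (17) and (18)."""
    m = len(grads)
    total = sum(gi + eps for gi in grads)
    p = [(gi + eps) / total for gi in grads]                    # Equation (17)
    return -sum(pi * math.log(pi) for pi in p) / math.log(m)    # Equation (18)


if __name__ == "__main__":
    # Hypothetical ERF profiles with increasing blur scale.
    for sigma_b in (0.5, 1.5, 2.5):
        profile = [125.0 + 75.0 * math.erf(t / (math.sqrt(2.0) * sigma_b))
                   for t in range(-5, 6)]
        h_n = normalized_gradient_entropy(gradient_magnitudes(profile))
        print(f"sigma_b={sigma_b:.1f}  H_n={h_n:.3f}")
```

A perfectly uniform gradient distribution gives a value of 1, and a single isolated gradient peak gives a value close to 0, which matches the sharp-versus-blurred interpretation given above.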

To verify whether normalized gradient entropy effectively represents edge blur, synthetic Gaussian-blurred ideal edge images with known ground-truth edge positions were constructed using an area-integral ideal-edge generation method [22]. The synthetic images had a size of $400 \times 300$ pixels, and the gray levels on the two sides of the edge were set to $I_{\min} = 50$ and $I_{\max} = 200$. The blur parameter was set as $\sigma_{b} \in \{0.5, 1.0, 1.5, 2.0, 2.5\}$ to simulate transition widths ranging from sharp to blurred. The noise conditions were set as $\mathrm{SNR} \in \{\infty, 38\ \mathrm{dB}, 30\ \mathrm{dB}, 25\ \mathrm{dB}\}$, where $\mathrm{SNR} = \infty$ denotes noise-free images, 38 dB corresponds to the measured image signal-to-noise ratio in the real monocular ranging scene, and 30 dB and 25 dB simulate stronger noise. Edge-normal directions were set to 30°, 45°, and 60° to evaluate adaptability to different edge orientations. Each parameter combination was repeated 10 times with random noise, yielding 600 synthetic edge samples.

For each synthetic edge sample, a fixed initial 11-point window was placed near the ground-truth edge along the local gradient direction to compute normalized gradient entropy. Figure 3 shows the relationship between normalized gradient entropy and edge blur under different noise conditions.

As shown in Figure 3a, normalized gradient entropy increases with the blur parameter. When $\sigma_{b} = 0.5$, the edge transition region is narrow, and gradient magnitudes are mainly concentrated near the edge center; therefore, $H_{n}$ is relatively low. As $\sigma_{b}$ increases, the gray-level transition widens, and gradient magnitudes are distributed over a broader sampling region, causing $H_{n}$ to increase. When $\sigma_{b}$ reaches 2.0 or 2.5, $H_{n}$ approaches a relatively high level, indicating a more dispersed gradient distribution and a wider transition region.

Figure 3b further shows the relationship between $H_{n}$ and the blur parameter under different SNR conditions. Under noise-free, 38 dB, 30 dB, and 25 dB conditions, $H_{n}$ increases monotonically with the blur parameter, demonstrating that it can maintain its descriptive capability for edge blur under different noise levels. For a fixed blur parameter, lower SNR values generally result in larger $H_{n}$, especially for sharp edges, because noise perturbs the local gradient distribution and spreads gradient energy from the edge center to surrounding samples. Nevertheless, the overall relationship between $H_{n}$ and $\sigma_{b}$ remains consistent across SNR conditions.

The above results indicate a stable positive correlation between normalized gradient entropy and edge blur. Therefore, normalized gradient entropy can be used as a quantitative measure of the local edge transition width. For low-entropy sharp edges, a smaller ERF fitting window can reduce redundant samples and computational overhead. For high-entropy blurred edges, a larger fitting window can cover the complete gray-level transition and reduce fitting bias caused by undersampling. Thus, it is reasonable and feasible to use normalized gradient entropy to guide adaptive ERF fitting-window selection.

After validating the relationship between normalized gradient entropy and blur, this paper establishes a mapping between $H_{n}$ and the ERF fitting-window size. For each synthetic edge image, the Canny operator is used to obtain pixel-level edge points, and an initial 11-point window is used to calculate $H_{n}$. At the same edge point, ERF fitting is then performed using 5, 7, 9, and 11 samples, and localization error and computational time are recorded.

For an image containing $K$ candidate Canny edge points, let $e_{k,N}$ denote the subpixel localization error of the $k$th edge point with half-window size $N$. The RMSE and MAE for a given window are defined as

$$\mathrm{RMSE}_{N} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} e_{k,N}^{2}}, \tag{19}$$

$$\mathrm{MAE}_{N} = \frac{1}{K} \sum_{k=1}^{K} \left| e_{k,N} \right|. \tag{20}$$

To balance localization accuracy and computational efficiency, this paper does not simply select the window with the smallest RMSE. Instead, the fastest window is selected under the condition that its localization error is close to the best error. Let

$$\mathrm{RMSE}_{\min} = \min_{N \in \mathcal{N}} \mathrm{RMSE}_{N}, \qquad \mathcal{N} = \{2, 3, 4, 5\}, \tag{21}$$

where $\mathcal{N}$ is the set of candidate half-window sizes and the corresponding numbers of sampling points are $2N + 1 \in \{5, 7, 9, 11\}$. A window is considered approximately optimal in localization accuracy if

$$\mathrm{RMSE}_{N} \leq \alpha\,\mathrm{RMSE}_{\min}. \tag{22}$$

In this study, $\alpha$ was set to 1.02, allowing the localization error to increase by at most 2% relative to the minimum. Among the candidate windows satisfying Equation (22), the window with the minimum average computation time is selected:

$$N^{*} = \arg\min_{N \in \mathcal{N},\ \mathrm{RMSE}_{N} \leq \alpha\,\mathrm{RMSE}_{\min}} T_{N}, \tag{23}$$

where $T_{N}$ denotes the average post-Canny computation time per edge point for half-window size $N$.
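The selection rule in Equations (21) to (23) amounts to a filter followed by a minimum. The sketch below shows this with a hypothetical function name and made-up RMSE and timing values that are not results from the paper.

```python
def select_half_window(rmse, time, alpha=1.02):
    """Pick the fastest half-window whose RMSE is within alpha of the best.

    rmse and time map each candidate half-window size N to RMSE_N and T_N.
    """
    rmse_min = min(rmse.values())                                     # Equation (21)
    admissible = [n for n in rmse if rmse[n] <= alpha * rmse_min]     # Equation (22)
    return min(admissible, key=lambda n: time[n])                     # Equation (23)


if __name__ == "__main__":
    # Illustrative numbers only (pixels and arbitrary time units).
    rmse = {2: 0.300, 3: 0.201, 4: 0.200, 5: 0.202}
    time = {2: 1.0, 3: 1.4, 4: 1.8, 5: 2.2}
    n_star = select_half_window(rmse, time)
    print(n_star, 2 * n_star + 1)   # half-window size and number of samples
```

In this example the 9-point window has the lowest RMSE, but the 7-point window is within 2% of it and is faster, so the rule prefers the 7-point window.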

After obtaining the normalized gradient entropy and the corresponding optimal sampling-point number for each calibration image, a grid search is used to determine the thresholds $\tau_{1}$, $\tau_{2}$, and $\tau_{3}$ so that the sampling-point number predicted from $H_{n}$ is as consistent as possible with the calibrated optimal sampling-point number. The final mapping is

$$W(H_{n}) = \begin{cases} 5, & H_{n} < \tau_{1}, \\ 7, & \tau_{1} \leq H_{n} < \tau_{2}, \\ 9, & \tau_{2} \leq H_{n} < \tau_{3}, \\ 11, & H_{n} \geq \tau_{3}. \end{cases} \tag{24}$$

According to the calibration experiment under the imaging and noise conditions considered in this study, $\tau_{1} = 0.630$, $\tau_{2} = 0.795$, and $\tau_{3} = 0.885$. These thresholds can be recalibrated when the imaging system, target type, or illumination condition changes substantially. The final mapping relationship is listed in Table 1.
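The mapping of Equation (24) and a simple threshold grid search can be sketched as follows. The default thresholds are the calibrated values reported above. The grid step, the agreement criterion (fraction of samples whose predicted window equals the calibrated one), and the calibration pairs in the example are assumptions, since the text does not give these details.

```python
from itertools import combinations


def window_from_entropy(h_n, taus=(0.630, 0.795, 0.885)):
    """Number of fitting samples W(H_n) according to Equation (24)."""
    tau1, tau2, tau3 = taus
    if h_n < tau1:
        return 5
    if h_n < tau2:
        return 7
    if h_n < tau3:
        return 9
    return 11


def calibrate_thresholds(samples, step=0.01):
    """Grid search for (tau1, tau2, tau3) maximizing agreement with samples.

    samples is a list of (H_n, calibrated optimal number of samples) pairs.
    Returns the best threshold triple and its agreement rate.
    """
    n = int(round(1.0 / step))
    grid = [i * step for i in range(1, n)]
    best_taus, best_hits = None, -1
    for taus in combinations(grid, 3):        # guarantees tau1 < tau2 < tau3
        hits = sum(1 for h, w in samples if window_from_entropy(h, taus) == w)
        if hits > best_hits:
            best_taus, best_hits = taus, hits
    return best_taus, best_hits / len(samples)


if __name__ == "__main__":
    print([window_from_entropy(h) for h in (0.50, 0.70, 0.85, 0.95)])
    # Hypothetical calibration pairs, not data from the paper.
    pairs = [(0.52, 5), (0.61, 5), (0.68, 7), (0.77, 7),
             (0.83, 9), (0.86, 9), (0.91, 11), (0.96, 11)]
    print(calibrate_thresholds(pairs, step=0.05))
```

Repeating `calibrate_thresholds` on data from a different camera or target would be one way to carry out the recalibration mentioned above.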

In the detection stage, Canny edge points are first obtained, and an initial 11-point window is used to calculate $H_{n}$ along the local gradient direction. The final ERF fitting-window size is then determined using Table 1, and the gray-level samples in that window are fitted to obtain the subpixel edge position.

3.4. Theoretical Justification for Entropy-Guided Window Selection

The use of normalized gradient entropy for adaptive window selection can be theoretically interpreted from the gradient distribution of the ERF-based blurred-edge model. Starting from the baseline ERF model in Equation (1), its gray-level gradient can be obtained by taking the derivative with respect to the spatial coordinate:

$$g(x) = \frac{d f(x)}{d x} = \frac{k}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - l)^{2}}{2\sigma^{2}}\right], \tag{25}$$

where $g(x)$ denotes the gray-level gradient, $k$ is the edge contrast, $l$ is the subpixel edge location, and $\sigma$ is the edge-blurring parameter. Equation (25) shows that the gradient profile of an ERF-modeled blurred edge has a Gaussian-shaped distribution. Therefore, the parameter $\sigma$ controls the spatial dispersion of the edge gradient: a smaller $\sigma$ corresponds to a sharper edge with concentrated gradient energy, whereas a larger $\sigma$ corresponds to a more blurred edge with gradient energy distributed over a wider spatial range.
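For readers who want the intermediate step, Equation (25) follows from Equations (1) and (2) by the chain rule, with the substitution variable $u$ introduced here only for the derivation:

```latex
\begin{aligned}
u &= \frac{x - l}{\sqrt{2}\,\sigma}, \qquad \frac{du}{dx} = \frac{1}{\sqrt{2}\,\sigma}, \qquad
\frac{d}{du}\operatorname{erf}(u) = \frac{2}{\sqrt{\pi}}\,e^{-u^{2}}, \\[4pt]
g(x) &= \frac{k}{2}\cdot\frac{2}{\sqrt{\pi}}\,e^{-u^{2}}\cdot\frac{1}{\sqrt{2}\,\sigma}
      = \frac{k}{\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{(x - l)^{2}}{2\sigma^{2}}\right], \\[4pt]
\int_{-\infty}^{+\infty} g(x)\,dx &= k .
\end{aligned}
```

The last line shows that the gradient profile is a Gaussian density scaled by the edge contrast $k$, so after normalization its shape depends only on $l$ and $\sigma$.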

The normalized gradient entropy defined in the previous subsection measures the dispersion of the normalized local gradient distribution. When the edge is sharp, most of the gradient energy is concentrated near the edge center, and the normalized gradient distribution is highly peaked, resulting in a lower entropy value. As the blur scale increases, the gradient energy spreads over a wider spatial range, and the entropy increases accordingly. This trend is consistent with the differential entropy of a Gaussian distribution:

$$h_{g} = \frac{1}{2} \log\left(2\pi e \sigma^{2}\right), \tag{26}$$

which increases monotonically with $\sigma$. Therefore, under the Gaussian-blurred step-edge assumption, the normalized gradient entropy can be regarded as an interpretable indicator of the local edge-transition width.

The fitting-window size should also increase with the edge-transition width. Here, $L$ denotes the continuous width of the fitting window along the edge-normal direction. In the discrete implementation, this continuous width corresponds to the candidate windows with 5, 7, 9, and 11 sampling points. If a fitting window with width $L$ is expected to cover a fixed proportion $\eta$ of the gradient energy, the required width can be expressed as

$$L \geq 2\sqrt{2}\,\sigma \operatorname{erf}^{-1}(\eta), \tag{27}$$

which shows that the appropriate fitting-window width is positively related to the blur scale $\sigma$. Therefore, both the normalized gradient entropy and the appropriate fitting-window size increase as the edge transition becomes more spatially dispersed.
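Equation (27) can be obtained by integrating the Gaussian gradient profile of Equation (25) over a window of width $L$ centered on the edge location. The centering and the value $\eta = 0.95$ in the worked example are assumptions made for illustration; the text does not state a specific coverage proportion.

```latex
\begin{aligned}
\frac{1}{k}\int_{l - L/2}^{\,l + L/2} g(x)\,dx
  &= \operatorname{erf}\!\left(\frac{L}{2\sqrt{2}\,\sigma}\right) \geq \eta
  \;\Longrightarrow\;
  L \geq 2\sqrt{2}\,\sigma\,\operatorname{erf}^{-1}(\eta), \\[4pt]
\eta = 0.95:\quad \operatorname{erf}^{-1}(0.95) &\approx 1.386
  \;\Longrightarrow\; L \gtrsim 3.92\,\sigma .
\end{aligned}
```

With this illustrative coverage, a blur scale of 1 pixel calls for a width of about 3.9 pixels and a blur scale of 2.5 pixels for about 9.8 pixels. Assuming unit sample spacing, these are of the same order as the spans of the 5-point and 11-point windows (4 and 10 pixel intervals, respectively).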

In summary, the above analysis shows that both the normalized gradient entropy and the required fitting-window width are positively related to the blur scale:

$$\sigma \uparrow \;\Rightarrow\; H_{n} \uparrow, \qquad \sigma \uparrow \;\Rightarrow\; L \uparrow. \tag{28}$$

Consequently, the normalized gradient entropy and the fitting-window width follow the same increasing trend with respect to the edge-transition dispersion:

$$H_{n} \uparrow \;\Rightarrow\; L \uparrow. \tag{29}$$

This relationship does not imply that a single entropy value universally determines an optimal window size under all imaging conditions. Instead, it indicates that normalized gradient entropy provides a physically interpretable monotonic indicator for selecting a larger or smaller fitting window according to the local edge-transition width.

In the proposed method, this continuous relationship is discretized into four candidate windows, namely 5, 7, 9, and 11 sampling points. The entropy thresholds used to switch among these windows are calibrated parameters under the current imaging configuration rather than universal constants. For substantially different cameras, lenses, illumination conditions, or target materials, the same calibration procedure can be repeated to update the thresholds.

4. Results and Discussion

4.1. Synthetic Edge Image Construction and Experimental Parameter Settings

To objectively evaluate the accuracy and robustness of different subpixel edge localization methods, synthetic edge images with known continuous ground-truth edge positions were first constructed. Compared with real images, synthetic images provide precise continuous edge positions, which facilitates the calculation of localization errors between estimated subpixel edge points and the ground-truth edge. Therefore, controlled edge images with different degradation conditions were generated for the subsequent ablation and comparison experiments.

Section 3 described the area-integral ideal-edge generation method and the threshold calibration process based on the synthetic data. The synthetic samples used for threshold calibration were independent of those used for performance evaluation in this section, so as to avoid data overlap between window-threshold calibration and method testing. To ensure consistency, the synthetic edge images in this section use the same basic generation model. Specifically, the bright-side area ratio within each pixel region is calculated from the ground-truth edge line to generate an area-integral ideal edge. Gaussian blur is then applied to the ideal edge, and different scenarios are produced by adding background gray-level slopes, local texture, or asymmetric blur. Finally, Gaussian noise is added.

The image size was

400 \times 300

pixels, and the gray-levels on the two sides of the edge were

I_{min} = 50

and

I_{max} = 200

. Edge-normal directions were set to

30^{\circ}

,

45^{\circ}

, and

60^{\circ}

to evaluate the adaptability of different methods to edge orientation. Noise conditions were set to

SNR = \infty

, 38 dB, 30 dB, and 25 dB. The maximum blur parameter in the main experiment was set to

σ_{b} = 2.5

to cover sharp, slightly blurred, and moderately strongly blurred edges. Four synthetic edge scenarios were constructed: Ideal, Slope, Texture, and Asymmetric. The Ideal scenario evaluates baseline localization performance under standard Gaussian-blurred step edges. The Slope scenario adds a background gray-level slope to simulate uneven illumination or slowly varying background intensity. The Texture scenario adds local sinusoidal texture disturbance to simulate additional gradient interference caused by target-surface or background texture. The Asymmetric scenario assigns different blur scales to the two sides of the edge to simulate asymmetric gray-level transitions. The parameter settings are listed in Table 2.

For each image, the Canny operator was first used to obtain pixel-level edge candidates. All subsequent subpixel localization methods refined the edge positions within the same candidate-edge neighborhoods. The perpendicular distance from an estimated subpixel edge point to the ground-truth edge line was used as the localization error. Localization accuracy was evaluated by RMSE and MAE, and computational efficiency was evaluated by the average post-Canny computation time per edge point, which excludes the common Canny edge detection step and covers only the subpixel refinement. To ensure a fair comparison, all methods were implemented in the same software environment, tested on the same computer, and run with the same Canny edge candidates and identical optimization settings.
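The localization error described above is a point-to-line distance. As a minimal sketch, the ground-truth edge can be represented by one point on the line and the edge-normal angle (30°, 45°, or 60° in this section); this parameterization and the function name are illustrative assumptions, not the representation used by the authors.

```python
import math


def perpendicular_error(point, line_point, normal_deg):
    """Unsigned distance from `point` to the line through `line_point`
    whose unit normal makes an angle of `normal_deg` with the x-axis."""
    nx = math.cos(math.radians(normal_deg))
    ny = math.sin(math.radians(normal_deg))
    dx = point[0] - line_point[0]
    dy = point[1] - line_point[1]
    # Project the offset onto the unit normal; the magnitude is the distance.
    return abs(dx * nx + dy * ny)
```

A point displaced by 0.2 pixel along the normal yields an error of 0.2 pixel, whereas a point displaced along the edge itself yields zero error, so the metric is insensitive to where along the edge a candidate falls.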

4.2. Ablation Study of the Normalized-Gradient-Entropy-Guided Dynamic Window

To verify the effectiveness of the normalized-gradient-entropy-guided adaptive sampling-window strategy, fixed-window ablation experiments were conducted. The comparison was performed only within the ERF fitting framework. Except for the sampling-window selection strategy, the Canny initialization, gradient-direction estimation, gray-level sampling, parameter initialization, and nonlinear fitting process were kept identical. This design isolates the effect of adaptive window selection from other algorithmic factors.

Fixed ERF fitting windows with 5, 7, 9, and 11 sampling points were compared with the proposed normalized-gradient-entropy-guided dynamic window strategy. For the dynamic-window method, normalized gradient entropy was first calculated within an initial 11-point window at each Canny edge point. The final ERF fitting window was then determined according to the calibrated thresholds in Section 3. To analyze the influence of ERF parameter optimization, both two-parameter and four-parameter ERF fitting forms were tested. In the two-parameter ERF form, the gray-level amplitude and offset were initialized from local extrema and then fixed; only the subpixel offset and blur-scale parameter were optimized. In the four-parameter ERF form, A, B,

μ

, and

σ

were optimized simultaneously. The following ten methods were compared: ERF2P-fixed5, ERF2P-fixed7, ERF2P-fixed9, ERF2P-fixed11, ProposedDynamicERF_2P, ERF4P-fixed5, ERF4P-fixed7, ERF4P-fixed9, ERF4P-fixed11, and ProposedDynamicERF_4P.
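The window-selection step described above can be sketched as follows. The entropy definition is given in Section 3 of the paper and is not reproduced here, so the form used below (Shannon entropy of the normalized gradient magnitudes divided by the logarithm of the sample count) is an assumption, as are the function names, the zero-entropy convention for a flat profile, and the strict-inequality handling at the threshold boundaries. The default thresholds are the calibrated values reported in Section 4.2.2.

```python
import math


def normalized_gradient_entropy(gradients):
    """Assumed form of H_n: Shannon entropy of |g_i| / sum(|g_j|), scaled to [0, 1]."""
    magnitudes = [abs(g) for g in gradients]
    total = sum(magnitudes)
    n = len(magnitudes)
    if n < 2 or total == 0.0:
        return 0.0  # hypothetical convention for a flat or degenerate profile
    entropy = -sum(
        (m / total) * math.log(m / total) for m in magnitudes if m > 0.0
    )
    return entropy / math.log(n)


def select_window(h_n, thresholds=(0.630, 0.795, 0.885), windows=(5, 7, 9, 11)):
    """Map an entropy value to a fitting window; higher entropy gives a larger window.

    Boundary handling (strict '<') is an assumption.
    """
    for tau, n_points in zip(thresholds, windows):
        if h_n < tau:
            return n_points
    return windows[-1]
```

In use, the gradient magnitudes sampled in the initial 11-point window along the gradient direction would be passed to `normalized_gradient_entropy`, and the returned value to `select_window`. A gradient concentrated in one sample gives an entropy of 0 and the 5-point window, while a uniformly spread gradient gives an entropy of 1 and the 11-point window.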

4.2.1. Dynamic Window Selection Results

Table 3 lists the selection ratios of sampling points for the dynamic window strategy in the four synthetic scenarios. The statistics are computed over Canny candidate edge points participating in subpixel localization.

Table 3 and Figure 4 show that the dynamic window selection ratios differ considerably among scenarios. In the Ideal scenario, all four window sizes are selected with non-negligible proportions, and the 11-point window accounts for 41.185%, indicating that the sampling window changes with the local transition width under different blur levels. In the Slope and Texture scenarios, the 9-point and 11-point windows have relatively high proportions. Specifically, the combined proportions of the 9-point and 11-point windows are 67.855% and 68.210% in the Slope and Texture scenarios, respectively. This suggests that background gray-level slopes and local texture disturbances disperse the gradient distribution around edges and that larger windows help cover the complete transition region. In the Asymmetric scenario, the 7-point and 9-point windows dominate, accounting for 80.586% in total, while the 11-point window accounts for only 11.065%. This indicates that an excessively large window is not always beneficial for asymmetric transitions; instead, a medium-sized window can better balance effective transition coverage and redundant-sample suppression.

These results demonstrate that the normalized-gradient-entropy-guided dynamic window does not simply select a fixed sampling-point number. Rather, it adjusts the ERF fitting window according to the local edge condition.

4.2.2. Sensitivity Analysis of Entropy Thresholds

To further evaluate the stability of the entropy thresholds used for switching among the 5-, 7-, 9-, and 11-point fitting windows, an additive threshold sensitivity analysis was conducted. Since the normalized gradient entropy is bounded within

[0, 1]

, additive perturbations were used instead of proportional scaling to avoid invalid threshold values near the upper bound. Let

τ_{1} = 0.630

,

τ_{2} = 0.795

, and

τ_{3} = 0.885

denote the calibrated entropy thresholds. The perturbed thresholds were defined as

τ_{i}^{'} = τ_{i} + Δ, i = 1, 2, 3,

(30)

where

Δ \in \{-0.04, -0.03, -0.02, -0.01, 0, 0.01, 0.02, 0.03, 0.04\}

. For each perturbed threshold setting, the proposed dynamic-window ERF method was re-evaluated using the same synthetic test samples. The MAE, RMSE, standard deviation, relative RMSE change, and selected-window distribution were calculated. The relative RMSE change was computed with respect to the calibrated thresholds at

Δ = 0

:

Δ_{RMSE} = \frac{RMSE(Δ) - RMSE(0)}{RMSE(0)} \times 100\%.

(31)
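The perturbation in Equation (30) and the relative change in Equation (31) can be sketched as a small sweep. The callable `evaluate_rmse` is hypothetical: it stands in for re-running the dynamic-window ERF method on the synthetic test samples with a given threshold triple and returning the resulting RMSE. The function names are likewise illustrative.

```python
CALIBRATED = (0.630, 0.795, 0.885)
OFFSETS = (-0.04, -0.03, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.04)


def perturb_thresholds(thresholds, delta):
    """Equation (30): add the same offset to every threshold."""
    shifted = tuple(t + delta for t in thresholds)
    if not all(0.0 <= t <= 1.0 for t in shifted):
        raise ValueError("perturbed threshold left the [0, 1] entropy range")
    return shifted


def relative_rmse_change(rmse_delta, rmse_ref):
    """Equation (31): percentage change relative to the calibrated thresholds."""
    return (rmse_delta - rmse_ref) / rmse_ref * 100.0


def sensitivity_sweep(evaluate_rmse, thresholds=CALIBRATED, offsets=OFFSETS):
    """Return {offset: relative RMSE change in percent}.

    `evaluate_rmse` is a hypothetical callable mapping a threshold
    triple to the RMSE measured on a fixed test set.
    """
    rmse_ref = evaluate_rmse(thresholds)
    return {
        delta: relative_rmse_change(
            evaluate_rmse(perturb_thresholds(thresholds, delta)), rmse_ref
        )
        for delta in offsets
    }
```

Because the offset is additive and identical for all three thresholds, their ordering is preserved, and the range check only guards against leaving the bounded entropy interval.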

The results in Table 4 and Figure 5 show that the calibrated thresholds achieve the lowest RMSE of 0.13501 pixel. When the threshold offset varies from

- 0.04

to

0.04

, the maximum relative RMSE change is 4.81%. Within the range of

Δ = \pm 0.03

, the maximum relative RMSE change is only 3.83%. These results indicate that the proposed entropy-guided window selection strategy is not overly sensitive to moderate additive perturbations of the entropy thresholds.

An asymmetric trend can also be observed. Positive threshold offsets lead to a faster increase in RMSE than negative offsets. This is because increasing the thresholds makes the method less likely to select larger fitting windows. As a result, some blurred edges that originally require 9- or 11-point windows may be assigned to smaller windows, leading to insufficient coverage of the gray-level transition and increased fitting bias. In contrast, decreasing the thresholds tends to select slightly larger windows. Although larger windows may introduce additional background or noise, they still preserve the complete edge-transition region, and therefore cause only limited degradation within the tested perturbation range.

It should be noted that the entropy thresholds are calibrated parameters rather than universal constants. Their transferability depends on whether the local gradient-entropy distribution remains similar under different imaging conditions. Changes in camera resolution, lens point-spread function, defocus, or motion blur may alter the edge-transition width in pixel units and thus shift the distribution of

H_{n}

. In general, stronger optical blur or defocus tends to produce more spatially dispersed gradients, resulting in higher

H_{n}

values and a higher probability of selecting larger fitting windows. Illumination and image contrast can also affect the reliability of the entropy measurement. Under sufficient illumination and high contrast, the normalized gradient distribution is mainly determined by the edge transition itself, so the calibrated thresholds are expected to remain relatively stable. In contrast, low illumination, low contrast, high sensor gain, or strong image noise may introduce additional gradient fluctuations, which can increase the measured entropy and reduce the reliability of the original thresholds. Object type and surface material may have similar effects: matte high-contrast targets usually produce stable edge profiles, whereas reflective, textured, or low-contrast surfaces may change the local gradient distribution and require threshold recalibration. Therefore, when the camera, lens, illumination condition, image contrast, or target material changes substantially, recalibration using the same procedure is recommended.

4.2.3. Comparison Between Fixed and Dynamic Windows

To further verify whether the dynamic window strategy improves localization accuracy, it was compared with fixed sampling windows. The overall results are shown in Table 5.

For the two-parameter ERF methods, ProposedDynamicERF_2P achieved a mean RMSE of 0.16436 pixel, which is approximately 5.23% lower than that of the best fixed-window method. Its mean MAE was 0.14114 pixel, also lower than those of all fixed two-parameter ERF methods. This indicates that, in the ERF fitting framework with fixed A and B, the dynamic window effectively reduces localization errors caused by fixed-window mismatch. For the four-parameter ERF methods, ERF4P-fixed11 achieved the best fixed-window RMSE of 0.15604 pixel, whereas ProposedDynamicERF_4P achieved a mean RMSE of 0.14646 pixel, corresponding to an improvement of about 6.14%. Its mean MAE was also the lowest among the four-parameter methods. Notably, the average computation time of the dynamic methods was lower than that of the largest (11-point) fixed window. This shows that the accuracy improvement is not achieved by always selecting the largest window, but by choosing an appropriate sampling range according to local edge conditions.

4.3. Accuracy and Efficiency Comparison with Other Subpixel Edge Localization Methods

To further evaluate the comprehensive performance of the proposed method, it was compared with several representative subpixel edge localization methods. The comparison methods include the pixel-level Canny baseline, the Canny–Devernay method based on interpolated non-maximum suppression [23], moment-based methods, the partial-area-effect model, curve-fitting methods, and a recent stable-region-based subpixel localization baseline.

Different original methods often include their own preprocessing operations, pixel-level edge extraction strategies, threshold settings, and edge-selection rules. Directly comparing full systems may confound the effects of coarse localization, edge-point quantity, preprocessing strength, and subpixel correction models. To ensure fairness, a unified experimental framework was adopted. First, the same Canny detector was applied to each synthetic image to obtain pixel-level candidate edge points. Then, different subpixel models were used to refine the same candidate points. Therefore, this experiment compares the localization capability of different subpixel correction models under unified Canny candidates, rather than the complete edge detection systems proposed in the original literature.

The evaluated methods were as follows. CannyPixel directly uses integer-pixel edge points obtained by the Canny detector [1]. Canny–Devernay refines the position by quadratic interpolation of the gradient-response peak along the gradient direction [23]. GLM1984 follows the gray-level moment idea of Tabatabai and Mitchell and estimates edge position from the first three moments of a local gray-level sequence [11]. Canny–Zernike2020 represents a Zernike-moment-based subpixel edge localization method [14]. HagaraAEF2011 approximates the edge gray-level profile with an error function and estimates the subpixel position by parameter fitting [13]. PAE2013 uses the partial area effect to model gray-level formation when an edge passes through a pixel region [15]. GaussianFit2014 fits the gradient profile using a Gaussian function [20]. ArctanFit2016 uses an arctangent edge model [19], and SigmoidLogisticFit2023 uses a logistic function to describe gray-level transition [24].

To further strengthen the benchmark, SER-CIS was added as a recent stable-region-based, region-adaptive subpixel localization baseline [25]. SER-CIS introduces converted intensity summation and stable edge regions for robust subpixel edge localization. In this study, a SER-CIS-style benchmark implementation was constructed according to the CIS and stable-region parameter-estimation principles described in the original paper. The proposed methods include ProposedDynamicERF_2P and ProposedDynamicERF_4P, corresponding to two-parameter and four-parameter ERF fitting under the normalized-gradient-entropy-guided dynamic window.

Deep-learning-based edge detectors, such as recent transformer-based edge detectors [26,27], were not directly included in the quantitative comparison because most of them are designed to predict pixel-level edge probability maps or semantic/structural object boundaries, whereas this study focuses on metric-level subpixel edge localization and its propagation to monocular ranging error. In addition, deep models usually require task-specific training data, and their output resolution, thresholding strategy, and post-processing procedure may introduce additional factors beyond the subpixel correction model itself. Therefore, this study focuses on classical and recent model-based subpixel localization methods under a unified Canny-candidate framework, while deep-learning-assisted subpixel localization will be considered in future work.

Table 6 reports the overall performance on all synthetic edge images. In addition to MAE and RMSE, the standard deviation (SD), median error, interquartile range (IQR), and maximum error were reported to provide a more complete statistical evaluation of localization accuracy, variation, and reliability. The computation time represents the average post-Canny time required for subpixel correction of one edge point. The computation times in Table 5 and Table 6 were recorded in different experimental batches and are therefore used only as relative indicators within each table, rather than for direct cross-table comparison.
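The statistics named above can be computed from a list of unsigned localization errors with the Python standard library. This is a sketch under stated assumptions: the paper does not specify whether SD is the sample or population form, whether it is taken over signed or unsigned errors, or which quantile convention defines the IQR, so the choices below (sample SD of the unsigned errors, inclusive quartiles) and the function name are illustrative only.

```python
import math
import statistics


def error_summary(abs_errors):
    """Summary statistics for a list of unsigned errors (same unit as the input)."""
    errors = list(abs_errors)
    # Quartiles with the "inclusive" convention; other conventions give
    # slightly different IQR values on small samples.
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "MAE": statistics.fmean(errors),
        "RMSE": math.sqrt(statistics.fmean([e * e for e in errors])),
        "SD": statistics.stdev(errors),  # sample SD (assumption)
        "median": statistics.median(errors),
        "IQR": q3 - q1,
        "max": max(errors),
    }
```

Reporting the median and IQR alongside MAE and RMSE separates the typical error from the influence of a few large outliers, which is why a method can have the lowest RMSE without having the smallest maximum error.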

As shown in Table 6, the pixel-level CannyPixel method produced an overall RMSE of 0.41897 pixel, which is much larger than those of the subpixel methods. This confirms that integer-pixel edge localization is insufficient for high-precision edge measurement. Canny–Devernay improved the localization accuracy through gradient-peak interpolation, but its RMSE remained 0.30547 pixel, indicating that local peak interpolation is vulnerable to noise, blur, and gray-level disturbance in complex edge conditions.

Among the moment-based and analytical methods, Canny–Zernike2020, GLM1984, and PAE2013 produced RMSE values of 0.20864, 0.22587, and 0.17720 pixel, respectively. PAE2013 outperformed the other two, indicating that the partial-area-effect edge-formation model has relatively good adaptability under the synthetic conditions used in this study. However, these methods still depend on local edge orientation, gray-level distribution, and model assumptions, and may degrade under texture interference and asymmetric transitions.

Fitting-based methods generally performed well. ArctanFit2016 and SigmoidLogisticFit2023 produced RMSE values of 0.17044 and 0.17141 pixel, respectively. HagaraAEF2011 achieved an RMSE of 0.17334 pixel, showing that ERF-based fitting is effective for blurred step-edge localization. SER-CIS, as a recent stable-region-based subpixel localization baseline, achieved an RMSE of 0.22030 pixel. Its maximum error was relatively small, which may be partly related to the rejection or suppression of unstable edge regions by the stable-region selection mechanism. However, its MAE, RMSE, SD, median error, and IQR were higher than those of the proposed four-parameter dynamic-window ERF method.

ProposedDynamicERF_4P achieved the lowest MAE and RMSE among all compared methods, namely 0.12368 pixel and 0.16312 pixel. It also achieved the lowest SD, median error, and IQR, indicating that its localization errors were more concentrated and less affected by large fluctuations. Compared with SER-CIS, ArctanFit2016, SigmoidLogisticFit2023, PAE2013, and HagaraAEF2011, the RMSE was reduced by approximately 25.96%, 4.29%, 4.84%, 7.95%, and 5.90%, respectively. Although ProposedDynamicERF_4P did not achieve the smallest maximum error, the distribution-based metrics in Table 6 and the box plots in Figure 6 show that it provides better overall accuracy and reliability. ProposedDynamicERF_2P achieved an RMSE of 0.18067 pixel and an average time of 0.49560 ms, making it suitable as a lightweight solution for scenarios with stricter efficiency requirements.

To further examine whether the observed accuracy differences were statistically reliable, a Wilcoxon signed-rank test was performed on the paired sample-level MAE between ProposedDynamicERF_4P and each baseline method. Each synthetic image was treated as one paired sample, and the MAE of each method on the same synthetic image was used for paired comparison. The null hypothesis was that the median of the paired MAE differences between the two methods is zero. Differences with

p < 0.05

were considered statistically significant. This non-parametric test was selected because the localization-error distributions were not assumed to be strictly Gaussian and contained outliers, as shown in Figure 6.

Table 7 reports the statistical significance results. The proposed method showed statistically significant differences compared with all baseline methods at the

p < 0.05

level. The differences were especially evident for the pixel-level, interpolation-based, moment-based, Gaussian-fitting, and stable-region-based methods. For strong fitting-based baselines such as HagaraAEF2011 and SigmoidLogisticFit2023, the median paired improvements were relatively small, but the paired tests still indicated statistically significant differences. These results suggest that the overall advantage of ProposedDynamicERF_4P is reflected in the paired sample-level error distributions rather than being caused by accidental fluctuations in a small number of samples.

Table 8 further reports the RMSE results under the four synthetic scenarios, and Figure 7 visualizes the scene-wise comparison.

The per-scenario results show that different methods have distinct sensitivities to edge degradation. In the Ideal scenario, HagaraAEF2011 achieved the lowest RMSE of 0.13737 pixel because the edge profile was mainly determined by standard Gaussian blur and noise, and the gray-level transition was relatively regular. Therefore, fixed long-window AEF fitting could use the complete transition information effectively. SigmoidLogisticFit2023 and ProposedDynamicERF_4P also achieved low errors, with RMSE values of 0.13913 and 0.13923 pixel, respectively.

In the Slope scenario, HagaraAEF2011 remained highly accurate with an RMSE of 0.14329 pixel. The background gray-level slope is a low-frequency degradation and has limited influence on the overall shape of the standard blurred transition. Thus, fixed long-window fitting can still maintain good accuracy. ProposedDynamicERF_4P achieved an RMSE of 0.14504 pixel, which was slightly higher than those of HagaraAEF2011 and SigmoidLogisticFit2023 but remained comparable to other strong fitting-based methods.

In the Texture scenario, ProposedDynamicERF_4P achieved the lowest RMSE of 0.18656 pixel among the compared methods. Local texture introduces additional gradient responses around the edge, making gradient-peak interpolation, local moments, and fixed-window fitting more susceptible to interference. By characterizing the dispersion of the local gradient distribution and adjusting the window size accordingly, the proposed method covers the effective transition region while reducing the influence of texture disturbance on the fitted center. Compared with SigmoidLogisticFit2023, ArctanFit2016, and HagaraAEF2011, the RMSE reductions were approximately 0.95%, 3.22%, and 0.71%, respectively.

In the Asymmetric scenario, ProposedDynamicERF_4P again achieved the lowest RMSE of 0.18978 pixel. Compared with PAE2013, ArctanFit2016, SigmoidLogisticFit2023, HagaraAEF2011, and SER-CIS, the RMSE reductions were approximately 1.11%, 5.71%, 13.88%, 17.51%, and 43.98%, respectively. This result indicates that the proposed method has good adaptability to asymmetric gray-level transitions. Since the blur scales on the two sides of an asymmetric edge differ, a fixed long window may include unbalanced gray-level information and shift the fitted center. In contrast, the dynamic window actively adjusts the sampling range according to local gradient entropy, reducing model mismatch caused by an excessively large fixed window.

Overall, ProposedDynamicERF_4P achieved the best overall performance in terms of MAE, RMSE, SD, median error, and IQR over the complete synthetic test set. Although some fixed-window or curve-fitting methods performed slightly better under regular Ideal and Slope conditions, their errors increased under Texture or Asymmetric scenarios. The proposed method provides a more balanced performance across different edge conditions by adapting the ERF fitting window to local edge characteristics.

4.4. Real Monocular Distance Measurement Experiment and Result Analysis

To further verify the practical applicability of the proposed normalized-gradient-entropy-guided dynamic-window ERF subpixel edge localization method, a real monocular distance-measurement experiment was conducted. Unlike synthetic edge images, real images contain camera imaging noise, non-uniform illumination, lens distortion, edge blur, and background interference. Therefore, they can more directly reflect the stability and applicability of the algorithm in a practical visual measurement system.

Monocular visual ranging usually estimates the target distance according to the geometric relationship between the physical target size and its image size. This type of method has a simple structure and low cost and has been used in target distance-measurement applications [28]. Recent studies on monocular machine vision and subpixel visual measurement for mechanical-part measurement also show that low-cost visual measurement systems have promising engineering potential in dimensional measurement [8,29]. Hu et al. proposed a cubic-spline-interpolation-based subpixel edge detection method for O-ring dimensional measurement, indicating that subpixel edge localization can reduce the accuracy limitations of conventional integer-pixel methods in dimensional measurement [30]. Ye et al. constructed a mobile-vision-based measurement system for precision measurement of flat-screen gaps and combined image processing with coordinate transformation to complete planar target dimension measurement, further demonstrating the application value of visual edge extraction in industrial dimensional measurement [31].

Therefore, a black square target was first used as the standard measurement object to evaluate the influence of subpixel edge localization accuracy on monocular ranging. Additional real-image robustness experiments with different imaging conditions and target appearances were further conducted in Section 4.4.6 and Section 4.4.7.

4.4.1. Experimental Platform and Camera Calibration

The real distance-measurement platform mainly consisted of a monocular camera, a fixed-focus lens, a black target, a supporting platform, and a laser rangefinder. The monocular camera was used to capture target images at different distances, whereas the laser rangefinder provided the reference distance between the camera and the target for distance-error calculation. During image acquisition, the camera position was fixed, the target plane was kept approximately parallel to the camera imaging plane, and the images were captured under relatively stable indoor illumination to reduce the influence of environmental variations on edge extraction.

Before the distance-measurement experiment, camera calibration was performed to obtain the intrinsic parameters and distortion coefficients. Zhang’s calibration method was adopted because it estimates the camera intrinsic matrix and distortion coefficients from several images of a planar checkerboard target at different poses and is convenient and accurate in practice [32]. After calibration, the captured target images were undistorted to reduce the influence of lens distortion on edge position and image-size calculation.

The intrinsic matrix obtained by camera calibration can be expressed as

K = \begin{bmatrix} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{bmatrix},

(32)

where

f_{x}

and

f_{y}

denote the equivalent focal lengths in the horizontal and vertical directions, respectively, and

c_{x}

and

c_{y}

are the principal-point coordinates. The equivalent focal length obtained from calibration was used in the subsequent distance calculation. Because the real ranging experiment in this study focuses on comparing the relative performance of different edge localization methods under the same visual ranging model, all methods used the same camera intrinsic parameters, distortion-correction results, and distance-measurement model.

4.4.2. Monocular Distance Model and Edge-Size Extraction

A black target with a known physical size was used in the experiment. Let the physical side length of the target be L, and let its image width be l pixels. When the target plane is approximately parallel to the camera imaging plane, the pinhole camera model gives

Z = \frac{f_{x} L}{l},

(33)

where Z is the distance between the camera and the target,

f_{x}

is the equivalent focal length in the horizontal direction, L is the physical target size, and l is the target width in pixels. Equation (33) shows that the estimation error of the image size l directly affects the distance result. As the target distance increases, the target occupies fewer pixels in the image, and the same edge localization error leads to a more pronounced ranging deviation. Therefore, improving edge localization accuracy is important for medium- and long-distance monocular ranging.

During image-size extraction, the Canny operator was first used to obtain pixel-level target-edge candidates. Different edge localization methods were then applied to refine the candidate edge positions and obtain the subpixel positions of the left and right target boundaries. Let the average subpixel positions of the left and right edges be

x_{l}

and

x_{r}

, respectively. The horizontal image width of the target can be expressed as

l_{x} = | x_{r} - x_{l} | .

(34)

In this experiment, the extracted horizontal target width

l_{x}

was substituted for l in Equation (33) for distance calculation.
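Equations (33) and (34) can be sketched in a few lines, together with the first-order sensitivity that follows from differentiating Equation (33) with respect to the image width, |dZ/dl| = Z² / (f_x L). The function names are illustrative, and the focal length of 1500 pixels used in the usage note is a hypothetical value, not the calibrated value from the experiment.

```python
def target_width_px(x_left, x_right):
    """Equation (34): horizontal image width from the two subpixel boundaries."""
    return abs(x_right - x_left)


def monocular_distance(f_x, size, width_px):
    """Equation (33): Z = f_x * L / l, in the unit of `size`."""
    if width_px <= 0.0:
        raise ValueError("image width must be positive")
    return f_x * size / width_px


def distance_sensitivity(f_x, size, distance):
    """First-order |dZ/dl| = Z**2 / (f_x * L): distance error per pixel of width error."""
    return distance ** 2 / (f_x * size)
```

With the hypothetical focal length of 1500 pixels and the 13 cm target, `distance_sensitivity(1500.0, 13.0, 300.0)` is 36 times larger than `distance_sensitivity(1500.0, 13.0, 50.0)`, because the sensitivity grows with the square of the distance. This quantifies the statement above that the same edge localization error produces a more pronounced ranging deviation at longer distances.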

4.4.3. Experimental Settings and Evaluation Metrics

The real monocular distance-measurement experiment was conducted using the calibrated camera system and the prepared target images. The reference distance of each image was measured using a laser rangefinder. For each image, the target image size was extracted using different edge localization methods, and the monocular distance was then calculated using the same ranging model. To ensure a fair comparison, all methods used the same input images, the same distortion-correction procedure, the same target-size-to-distance model, and the same reference-distance annotations.

The standard real ranging experiment used a square target with a known physical side length of 13 cm. The target was placed at different distances within the range of 50–300 cm. The focal length obtained from camera calibration was used directly in the monocular ranging model. For each test image, the target image size was first extracted using the corresponding edge localization method, and the distance was then calculated by

{\hat{Z}}_{i} = \frac{f_{x} S}{l_{i}},

(35)

where

{\hat{Z}}_{i}

is the estimated distance of the i-th image,

f_{x}

is the calibrated focal length in the horizontal direction, S is the physical size of the measured target dimension, and

l_{i}

is the corresponding target image size in pixels. For square targets,

S = L = 13 cm

and

l_{i}

denotes the estimated side length. For the circular target,

S = D = 13 cm

and

l_{i}

denotes the fitted circle diameter.

To further evaluate the robustness of the proposed method under more practical imaging conditions, two additional real-image experiments were conducted. The first experiment evaluated the influence of different SNR-related imaging conditions. In this experiment, the exposure setting was kept fixed, while the illumination intensity and camera gain were adjusted to form four imaging-quality groups. Since the actual SNR value was not directly measured, these groups are referred to as SNR-related imaging-condition groups rather than absolute SNR levels. The second experiment evaluated the influence of target diversity by using targets with different shapes, gray-level contrasts, and surface materials. These additional experiments were used to examine whether the proposed dynamic-window ERF method can maintain stable edge-size extraction under variations in image quality and target appearance.

For all additional experiments, the same camera, lens, image resolution, distortion-correction procedure, edge-size extraction pipeline, and monocular ranging model were used. The exposure setting, focus state, and camera installation geometry were kept unchanged during each group of experiments. The acquisition distances covered 50–300 cm and were divided into five distance intervals: 50–100 cm, 100–150 cm, 150–200 cm, 200–250 cm, and 250–300 cm. For each experimental group, 10 images were collected in each distance interval, resulting in 50 images per group. Four representative methods were compared: the pixel-level Canny baseline, a fixed-window ERF fitting method, SER-CIS, and the proposed dynamic-window ERF method. The real-image acquisition settings for robustness evaluation are summarized in Table 9.

The distance sampling scheme used in the additional real-image experiments is summarized in Table 10. This interval-based acquisition was used to evaluate whether the ranging errors remain stable over near-, medium-, and long-distance conditions.

The SNR-related imaging-condition groups are listed in Table 11. G1 was acquired under strong illumination and low camera gain and therefore corresponds to the highest image quality among the tested groups. From G1 to G4, the illumination intensity was gradually reduced and the camera gain was increased to maintain target visibility. This setting introduces stronger noise amplification and lower edge-image quality, thereby providing a practical test of the robustness of different edge localization methods under degraded imaging conditions.

The target-diversity groups are listed in Table 12. These groups were designed to evaluate the influence of target appearance changes, including shape, gray-level contrast, and surface reflectance, on edge-size extraction and monocular ranging accuracy. The square targets had a physical side length of 13 cm, and the circular target had a physical diameter of 13 cm. For square targets, the image size was defined as the estimated side length. For the circular target, the image size was defined as the fitted circle diameter. To avoid bias caused by the different geometric definitions of square side length and circular diameter, the corresponding physical target size was used in the same focal-length-based ranging model: the square side length was used for square targets, and the circle diameter was used for the circular target.
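The way the same focal-length-based model is applied to both target shapes can be sketched as follows. This is a minimal illustration, assuming the relation Z = f_{x} S / l used in this section; the focal length, the function name, and the pixel sizes are hypothetical placeholders rather than values from the experiments.

```python
def estimate_distance_cm(f_x_px: float, size_cm: float, image_size_px: float) -> float:
    """Monocular ranging model Z = f_x * S / l (result in the unit of S)."""
    if image_size_px <= 0:
        raise ValueError("image size must be positive")
    return f_x_px * size_cm / image_size_px


F_X_PX = 1500.0         # hypothetical calibrated focal length (pixels)
TARGET_SIZE_CM = 13.0   # side length L or diameter D, as stated in the text

# Hypothetical subpixel size estimates for one square and one circular target.
side_length_px = 130.0       # estimated side length of a square target
circle_diameter_px = 97.5    # fitted diameter of the circular target

z_square = estimate_distance_cm(F_X_PX, TARGET_SIZE_CM, side_length_px)
z_circle = estimate_distance_cm(F_X_PX, TARGET_SIZE_CM, circle_diameter_px)
print(f"square: {z_square:.1f} cm, circle: {z_circle:.1f} cm")
```

Only the definition of the image size changes between shapes; the physical size and the model are shared, which is what keeps the square and circular results comparable.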

For each image, the reference distance Z_{i} was measured by the laser rangefinder, and the estimated distance {\hat{Z}}_{i} was calculated using the monocular ranging model. The signed ranging error and absolute ranging error were defined as

e_{i} = {\hat{Z}}_{i} - Z_{i},

(36)

a_{i} = | e_{i} | .

(37)

Based on the errors of all valid images, the mean absolute error (MAE), root mean square error (RMSE), standard deviation (SD), median error, interquartile range (IQR), maximum error, and mean relative error (MRE) were calculated for each method. The MAE, RMSE, MRE, and SD of the absolute error were defined as

MAE = \frac{1}{N} \sum_{i = 1}^{N} a_{i},

(38)

RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} e_{i}^{2}},

(39)

MRE = \frac{1}{N} \sum_{i = 1}^{N} \frac{a_{i}}{Z_{i}} \times 100\%,

(40)

{SD}_{a} = \sqrt{\frac{1}{N - 1} \sum_{i = 1}^{N} {(a_{i} - \bar{a})}^{2}},

(41)

where \bar{a} denotes the mean absolute error over all valid images. The SD and IQR were used to characterize the fluctuation and concentration of the absolute-error distribution, while the median error was reported to reduce the influence of extreme values.
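The error statistics defined in Equations (36)–(41) can be computed as in the sketch below. It is an illustration only: the function name and the sample data are hypothetical, and the quartile convention used for the IQR (the "inclusive" method of the Python standard library) is an assumption, since the quartile definition is not stated in this section.

```python
import math
import statistics


def ranging_error_stats(z_ref, z_est):
    """Return the error statistics of Eqs. (36)-(41) plus median, IQR, and maximum."""
    if len(z_ref) != len(z_est) or len(z_ref) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    e = [zh - z for z, zh in zip(z_ref, z_est)]      # signed error, Eq. (36)
    a = [abs(v) for v in e]                          # absolute error, Eq. (37)
    n = len(a)
    # Quartiles of the absolute error; the "inclusive" convention is an assumption.
    q1, _, q3 = statistics.quantiles(a, n=4, method="inclusive")
    return {
        "MAE": sum(a) / n,                                        # Eq. (38)
        "RMSE": math.sqrt(sum(v * v for v in e) / n),             # Eq. (39)
        "MRE_pct": 100.0 * sum(ai / zi for ai, zi in zip(a, z_ref)) / n,  # Eq. (40)
        "SD": statistics.stdev(a),                                # Eq. (41), N - 1 denominator
        "median": statistics.median(a),
        "IQR": q3 - q1,
        "max": max(a),
    }


# Hypothetical reference and estimated distances in cm.
stats = ranging_error_stats([100.0, 200.0, 300.0, 400.0], [101.0, 198.0, 303.0, 396.0])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```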

4.4.4. Qualitative Visual Analysis on Real Images

To provide a more intuitive evaluation of the proposed edge-localization procedure on real images, this subsection presents representative qualitative visual examples. In addition to the quantitative ranging errors reported in the following subsections, the visual results are used to show how edge fragments are extracted from real target images and how the proposed subpixel refinement improves the localization of the target boundary. Two representative target shapes, namely the black matte square target and the black matte circular target, are selected for visualization because they contain straight and curved boundaries, respectively.

Figure 8 shows the edge-localization process and method comparison on real images. For each target shape, six visual results are presented, including the original ROI, the cropped local edge region, the pixel-level Canny edge fragments, the fixed-window ERF subpixel edge points, the SER-CIS subpixel edge points, and the proposed dynamic-window ERF subpixel edge points.

The qualitative results in Figure 8 show that the proposed method can provide stable edge localization on real images with different boundary geometries. In the square-target example, the cropped edge region contains a visible gray-level transition band between the bright background and the dark target. The fixed-window ERF points are slightly shifted toward the dark target side, whereas the SER-CIS points are slightly closer to the bright background side. In comparison, the proposed dynamic-window ERF points are located near the center of the transition band and show better consistency with the apparent target boundary. This suggests that the entropy-guided dynamic window can reduce the localization bias caused by an unsuitable fixed sampling range. In addition, the square target mainly contains straight edge segments, and the refined edge points follow the local edge direction more consistently than the pixel-level edge fragments. For the circular target, the boundary contains continuous curvature, and the refined edge points still maintain a stable distribution along the curved edge. These observations indicate that the proposed dynamic-window ERF refinement is applicable to both straight and curved target boundaries.

The pixel-level edge fragments provide only a coarse description of the target boundary, and local discontinuities or fluctuations may directly affect the estimated target size. In contrast, the subpixel-refined edge points better describe the actual edge transition and provide a more reliable basis for subsequent line fitting, circle fitting, and edge-size extraction. This visual observation is consistent with the quantitative results in the following distance-measurement analysis, where subpixel edge-localization methods achieve lower ranging errors than the pixel-level CannyPixel method. The qualitative examples thus further support the applicability of the proposed method to real monocular distance measurement.

It should be noted that the visual differences among the subpixel methods are not prominent in Figure 8, because the localized edge points are drawn as small red markers and the differences among these methods are often below one pixel in clean edge regions. The qualitative examples are therefore used mainly to illustrate the edge-localization process, the continuity of the refined edge points, and the difference between pixel-level edge fragments and subpixel-refined edge points. The detailed performance differences among fixed-window ERF, SER-CIS, and the proposed method are evaluated using the quantitative error statistics in the following subsections.

4.4.5. Distance-Measurement Results and Analysis

Table 13 reports the overall error statistics of different edge localization methods in the real monocular distance-measurement experiment. In addition to MAE and RMSE, the standard deviation (SD), median error, interquartile range (IQR), and maximum error were also reported to provide a more comprehensive evaluation of ranging accuracy, error variation, and reliability. The pixel-level CannyPixel method achieved an MAE of 4.694 cm, an RMSE of 7.500 cm, and a mean relative error of 2.291%, which were considerably larger than those of the subpixel edge localization methods. This indicates that when integer-pixel edge positions are used to calculate the target image size, edge quantization errors are directly propagated to the monocular ranging results, especially when the target image size is small.

Canny–Devernay achieved subpixel correction through gradient-response interpolation and improved the ranging result compared with pixel-level Canny. Its MAE decreased to 3.420 cm, and its RMSE decreased to 5.678 cm. However, this method mainly depends on the local gradient peak position. When edge blur, non-uniform gray-level transitions, or local noise perturbations are present in real images, the gradient peak may shift, and the overall ranging error remains relatively large.

In contrast, HagaraAEF2011, SigmoidLogisticFit2023, SER-CIS, and ProposedDynamicERF_4P use gray-level transition or stable-region information for subpixel localization and can reduce integer-pixel quantization errors to some extent. HagaraAEF2011 achieved an MAE of 1.034 cm and an RMSE of 1.523 cm, while SigmoidLogisticFit2023 achieved an MAE of 1.056 cm and an RMSE of 1.545 cm. SER-CIS, as a recent stable-region-based subpixel localization baseline, achieved an MAE of 1.267 cm and an RMSE of 1.833 cm. ProposedDynamicERF_4P achieved the lowest MAE, RMSE, SD, median error, maximum error, and mean relative error, namely 0.976 cm, 1.475 cm, 1.119 cm, 0.394 cm, 4.291 cm, and 0.504%, respectively. These results indicate that the proposed method can extract the target edge size more stably in real monocular ranging tasks, thereby reducing the overall distance-measurement error.

It should be noted that real distance-measurement errors are not determined only by edge localization accuracy. They may also be affected by camera calibration error, imperfect parallelism between the target plane and the camera imaging plane, and the offset between the laser rangefinder reference point and the camera optical center. Therefore, the real experiment in this study is mainly used to compare the relative performance of different edge localization methods under the same acquisition conditions, the same distortion-correction results, and the same distance-measurement model.

Table 13 further shows that, compared with CannyPixel, Canny–Devernay, HagaraAEF2011, SigmoidLogisticFit2023, and SER-CIS, ProposedDynamicERF_4P reduced the MAE by approximately 79.21%, 71.48%, 5.63%, 7.65%, and 22.97%, respectively. In terms of RMSE, the corresponding reductions were approximately 80.33%, 74.02%, 3.14%, 4.51%, and 19.53%, respectively. These results demonstrate that the proposed method has a clear advantage over pixel-level localization and gradient-interpolation-based subpixel correction, while also providing a certain improvement over conventional gray-level fitting-based and stable-region-based subpixel methods.
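The relative reductions quoted above follow from the ratio (baseline − proposed) / baseline. The sketch below recomputes them from the rounded MAE and RMSE values stated in this subsection; because the inputs are rounded to three decimals, the last digit of some results may differ slightly from the reported percentages, which were presumably computed from unrounded errors. The function name is illustrative.

```python
def relative_reduction_pct(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# Rounded values quoted in the text (cm).
mae_cm = {
    "CannyPixel": 4.694,
    "Canny-Devernay": 3.420,
    "HagaraAEF2011": 1.034,
    "SigmoidLogisticFit2023": 1.056,
    "SER-CIS": 1.267,
}
rmse_cm = {
    "CannyPixel": 7.500,
    "Canny-Devernay": 5.678,
    "HagaraAEF2011": 1.523,
    "SigmoidLogisticFit2023": 1.545,
    "SER-CIS": 1.833,
}
PROPOSED_MAE_CM = 0.976
PROPOSED_RMSE_CM = 1.475

for method in mae_cm:
    d_mae = relative_reduction_pct(mae_cm[method], PROPOSED_MAE_CM)
    d_rmse = relative_reduction_pct(rmse_cm[method], PROPOSED_RMSE_CM)
    print(f"{method:24s} MAE -{d_mae:.2f}%  RMSE -{d_rmse:.2f}%")
```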

Figure 9 further illustrates the distribution of absolute ranging errors. The proposed method shows the lowest median error and a relatively compact error distribution. Although SigmoidLogisticFit2023 obtained a slightly smaller IQR, the proposed method achieved the best overall balance among MAE, RMSE, SD, median error, maximum error, and mean relative error. This indicates that the normalized-gradient-entropy-guided dynamic-window ERF method provides improved overall accuracy and reliability in real monocular ranging.

To further analyze the variation of distance-measurement error over different ranges, Table 14 lists the mean absolute errors with standard deviation in different distance intervals. The interval-based statistics are used mainly to observe the error trend with distance, whereas the overall indicators in Table 13 are used to evaluate the comprehensive ranging performance over all samples. Because image quality, edge blur, local illumination, and target image size may differ among distance ranges in real experiments, local fluctuations may occur for different fitting models in individual intervals. Therefore, this study does not use the best result in a single interval as the only evaluation criterion; instead, the overall indicators and interval-based trends are considered together. The valid test images were divided into five distance intervals according to their reference distances.
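The interval-based statistics can be obtained by grouping the absolute errors by reference distance, as in the following sketch. The interval edges follow the 50–300 cm range described for the acquisitions; the boundary convention (upper edge inclusive), the labels, and the sample data are assumptions made for illustration.

```python
import bisect
import statistics

UPPER_EDGES_CM = [100.0, 150.0, 200.0, 250.0, 300.0]
LABELS = ["<=100", "100-150", "150-200", "200-250", "250-300"]


def interval_mae(z_ref, abs_err):
    """Group absolute errors by reference distance; return {label: (mean, sd, count)}."""
    groups = {label: [] for label in LABELS}
    for z, a in zip(z_ref, abs_err):
        # bisect_left puts a value equal to an upper edge into the lower interval.
        k = bisect.bisect_left(UPPER_EDGES_CM, z)
        if k == len(UPPER_EDGES_CM):
            raise ValueError(f"reference distance {z} cm exceeds the last interval")
        groups[LABELS[k]].append(a)
    result = {}
    for label, values in groups.items():
        if not values:
            continue  # skip empty intervals
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        result[label] = (statistics.fmean(values), sd, len(values))
    return result


# Hypothetical reference distances (cm) and absolute errors (cm).
table = interval_mae([80.0, 100.0, 120.0, 260.0], [0.2, 0.4, 0.5, 1.5])
for label, (mean, sd, count) in table.items():
    print(f"{label:>8s}: {mean:.3f} +/- {sd:.3f} cm (n = {count})")
```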

As shown in Table 14, the mean absolute errors of all methods generally increased as the reference distance increased. This phenomenon is consistent with the error-propagation characteristic of the monocular distance model. According to Z = f_{x} L / l, as the target distance increases, the target image size l decreases, and the same magnitude of edge localization error produces a larger distance error. To first order, | \Delta Z | \approx Z^{2} | \Delta l | / (f_{x} L), so the sensitivity to a given size error grows with the square of the distance. In the distance ranges above 200 cm, the errors of CannyPixel and Canny–Devernay increased substantially, indicating that pixel-level localization and local-gradient-interpolation-based subpixel correction are less stable when the target occupies fewer pixels at longer distances. It should be emphasized that the distance intervals in Table 14 and Table 15 are used only to summarize the error trend over different ranges and do not indicate that the images were acquired at fixed distance intervals.
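The error-propagation argument can be made concrete by differentiating Z = f_{x} L / l with respect to l, which gives |dZ/dl| = Z^{2} / (f_{x} L). The sketch below evaluates this sensitivity and checks it against the exact model; the focal length is a hypothetical value, and the function names are illustrative.

```python
def distance_sensitivity_cm_per_px(z_cm: float, f_x_px: float, size_cm: float) -> float:
    """First-order sensitivity |dZ/dl| = Z^2 / (f_x * L) of the ranging model."""
    return z_cm ** 2 / (f_x_px * size_cm)


def exact_distance_error_cm(z_cm: float, f_x_px: float, size_cm: float, dl_px: float) -> float:
    """Exact range error when the image size is underestimated by dl_px."""
    l_px = f_x_px * size_cm / z_cm            # true image size at distance Z
    return f_x_px * size_cm / (l_px - dl_px) - z_cm


F_X_PX = 1500.0   # hypothetical calibrated focal length (pixels)
L_CM = 13.0       # target side length from the text
DL_PX = 0.1       # hypothetical size error of 0.1 pixel

for z in (100.0, 200.0, 300.0):
    approx = DL_PX * distance_sensitivity_cm_per_px(z, F_X_PX, L_CM)
    exact = exact_distance_error_cm(z, F_X_PX, L_CM, DL_PX)
    print(f"Z = {z:3.0f} cm: first-order {approx:.3f} cm, exact {exact:.3f} cm")
```

Under these assumed parameters, the same 0.1-pixel size error costs about nine times more range error at 300 cm than at 100 cm, which matches the faster error growth of the pixel-level methods at long distances.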

By contrast, the errors of HagaraAEF2011, SigmoidLogisticFit2023, SER-CIS, and ProposedDynamicERF_4P increased more slowly, suggesting that subpixel methods based on gray-level transition or stable-region information can more effectively reduce edge-position quantization error. In the range up to 200 cm, ProposedDynamicERF_4P achieved the lowest mean absolute error in all three distance intervals, indicating that dynamic-window ERF fitting can extract the target edge size more accurately under near- and medium-distance conditions. In the 200–300 cm range, the proposed method produced errors close to those of HagaraAEF2011 and SigmoidLogisticFit2023 and remained lower than SER-CIS. Although it was not the best in every individual long-distance interval, it remained substantially better than CannyPixel and Canny–Devernay. This result indicates that the advantage of the proposed method is mainly reflected in overall error control and stable performance over most distance intervals, rather than in a local optimum in a single interval.

Figure 10 further shows the trend of ranging RMSE with standard deviation over different reference-distance intervals. The error curves of CannyPixel and Canny–Devernay increase rapidly with distance, especially beyond 200 cm. In comparison, the gray-level-model-based subpixel fitting methods and SER-CIS show flatter error curves. ProposedDynamicERF_4P maintains the lowest error in the near- and medium-distance ranges and remains close to the other fitting-based methods beyond 200 cm. This suggests that the normalized-gradient-entropy-guided dynamic window can adjust the sampling range according to local edge-transition characteristics in real images, improving the adaptability of ERF fitting under different distance conditions.

In addition to absolute error, relative error reflects the proportion of the ranging error with respect to the reference distance and is useful for comparing error stability under different distance conditions. Table 15 reports the mean relative errors in different distance intervals.

As shown in Table 15, the proposed method achieved the lowest mean relative error in the range up to 200 cm, indicating that it reduced not only the absolute error but also the error proportion relative to the reference distance in the near- and medium-distance ranges. In the 200–300 cm range, the mean relative error of ProposedDynamicERF_4P was close to those of HagaraAEF2011 and SigmoidLogisticFit2023 and was substantially lower than those of CannyPixel and Canny–Devernay. It was also lower than SER-CIS in all distance intervals. This indicates that gray-level-model-based fitting methods generally have better error stability when the target image size decreases at longer distances, while the proposed method still achieved the lowest mean relative error in the overall statistics.

Combining Table 13, Table 14 and Table 15, ProposedDynamicERF_4P achieved the lowest MAE, RMSE, SD, median error, maximum error, and mean relative error over all real ranging samples, indicating the lowest comprehensive distance-measurement error in an overall statistical sense. The interval-based results further show that the proposed method has clear advantages in the near- and medium-distance ranges and remains close to other fitting-based subpixel methods in the long-distance range, while clearly outperforming pixel-level Canny and Canny–Devernay. These results demonstrate that the subpixel localization advantage obtained in synthetic edge experiments can be effectively translated into reduced distance-measurement error in real monocular ranging.

In summary, the real monocular distance-measurement experiment further verifies the practical application value of the proposed method. Compared with pixel-level edge localization, gradient-interpolation-based correction, conventional gray-level fitting methods, and the stable-region-based SER-CIS baseline, the proposed normalized-gradient-entropy-guided dynamic-window ERF fitting method can adaptively adjust the sampling range according to local edge-transition characteristics in real images, thereby improving the stability and accuracy of target edge-size extraction. The results indicate that the proposed method is not only accurate and robust in synthetic edge localization tasks but also has potential for application in low-cost monocular visual ranging systems.

4.4.6. Robustness Analysis Under Different SNR-Related Imaging Conditions

To evaluate the robustness of different methods under varying image quality, additional real-image experiments were conducted under four SNR-related imaging conditions. The imaging conditions were adjusted by changing illumination intensity and camera gain, while the camera position, focus, exposure setting, target type, target distance, and monocular ranging model were kept unchanged. This setting allows the influence of image quality and noise level on edge-size extraction to be analyzed independently.

Table 16 reports the ranging error statistics under different SNR-related imaging conditions. The MAE, RMSE, SD, median error, IQR, and maximum error were calculated for each method in each group. The SD and IQR were used to characterize the error fluctuation and distribution concentration under each imaging condition. In the additional real-image experiments, Fixed-window ERF denotes the fixed 11-point four-parameter ERF fitting method.

Figure 11 further illustrates the RMSE variation of different methods under the four SNR-related imaging conditions.

As shown in Table 16 and Figure 11, the ranging errors of all methods are affected by the change in SNR-related imaging conditions. The pixel-level Canny method shows the largest RMSE values in all groups, with RMSE values of 6.058 cm, 6.389 cm, 7.256 cm, and 8.477 cm from G1 to G4, respectively. Its SD and maximum errors are also relatively large, indicating that integer-pixel edge localization is unstable when gray-level fluctuation and edge degradation are present. Since the monocular ranging model directly depends on the estimated target image size, pixel-level edge quantization and edge-detection fluctuations can be amplified into large ranging errors.

Compared with CannyPixel, the subpixel methods substantially reduce the ranging error under all SNR-related imaging conditions. In G1, Fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P achieve RMSE values of 1.655 cm, 1.760 cm, and 1.354 cm, respectively. ProposedDynamicERF_4P also obtains the lowest MAE, RMSE, SD, median error, IQR, and maximum error in this group, indicating that the proposed dynamic-window ERF fitting provides more accurate and concentrated edge-size estimation under high-quality imaging conditions.

As the imaging condition changes from G1 to G4, the errors of all methods generally increase. The RMSE of ProposedDynamicERF_4P increases from 1.354 cm in G1 to 2.166 cm in G4, while the RMSE of Fixed-window ERF increases from 1.655 cm to 2.854 cm, and that of SER-CIS increases from 1.760 cm to 2.205 cm. This trend shows that image-quality degradation affects all subpixel edge localization methods, but the proposed method maintains the lowest RMSE in all four groups. The results suggest that the normalized-gradient-entropy-guided window selection can adapt to changes in the local edge-transition profile and reduce the influence of gray-level fluctuation on the fitted edge position.
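The degradation from G1 to G4 can be summarized as absolute and relative RMSE growth, using only the values quoted above. The sketch below is a simple arithmetic check; the dictionary layout and function name are illustrative.

```python
# RMSE in cm for (G1, G4), as quoted in the text.
rmse_g1_g4_cm = {
    "CannyPixel": (6.058, 8.477),
    "Fixed-window ERF": (1.655, 2.854),
    "SER-CIS": (1.760, 2.205),
    "ProposedDynamicERF_4P": (1.354, 2.166),
}


def rmse_growth(g1: float, g4: float):
    """Return (absolute growth in cm, relative growth in percent) from G1 to G4."""
    return g4 - g1, 100.0 * (g4 - g1) / g1


for method, (g1, g4) in rmse_g1_g4_cm.items():
    delta_cm, delta_pct = rmse_growth(g1, g4)
    print(f"{method:22s} +{delta_cm:.3f} cm (+{delta_pct:.1f}%)")
```

Read this way, the numbers show two separate properties: the proposed method keeps the lowest absolute RMSE in both groups, while SER-CIS shows the smallest growth from G1 to G4, which is consistent with the small gap between the two methods in G4.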

In G2 and G3, ProposedDynamicERF_4P achieves RMSE values of 1.514 cm and 1.689 cm, respectively, which are lower than those of CannyPixel, Fixed-window ERF, and SER-CIS. This indicates that the proposed method remains robust under moderate SNR-related image degradation. In contrast, Fixed-window ERF uses a fixed sampling window, and its performance becomes more sensitive when the local edge-transition width and gray-level distribution change. SER-CIS uses stable edge regions to suppress local interference, and its error growth is relatively moderate; however, its RMSE values remain slightly higher than those of the proposed method in all groups.

In the most degraded condition G4, all methods show increased errors. The RMSE values of CannyPixel, Fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P are 8.477 cm, 2.854 cm, 2.205 cm, and 2.166 cm, respectively. Although the difference between SER-CIS and ProposedDynamicERF_4P becomes small in this group, the proposed method still achieves the lowest RMSE and remains substantially better than CannyPixel and Fixed-window ERF. This result indicates that the dynamic-window ERF strategy remains competitive even under low-quality imaging conditions, while the stable-region-based SER-CIS method also shows good robustness in the severely degraded case.

Overall, the SNR-related experiment shows that subpixel edge localization is necessary for stable monocular ranging under changing imaging quality. ProposedDynamicERF_4P achieves the lowest RMSE in G1–G4 and maintains a relatively compact error distribution, as reflected by its lower SD and IQR values. These results indicate that the proposed dynamic-window ERF strategy improves the robustness of edge-size extraction under practical SNR-related imaging variations.

4.4.7. Target Diversity Analysis Under Different Shapes and Materials

To further evaluate the applicability of the proposed method to different target appearances, additional real-image experiments were conducted using targets with different shapes, gray-level contrasts, and surface materials. In this experiment, the camera, focus, exposure setting, illumination/gain condition, distance-measurement model, and evaluation metrics were kept consistent. Only the target appearance was changed. This experiment was designed to examine whether the proposed method can maintain stable edge localization and target-size extraction when the edge contrast, surface reflectance, or target geometry changes. The four target groups are defined as follows: T1 denotes the black matte square target, T2 denotes the black matte circular target, T3 denotes the gray matte square target, and T4 denotes the glossy black square target.

Table 17 reports the ranging error statistics for different target types. The compared methods include CannyPixel, fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P. The target-diversity experiment focuses on representative baselines rather than all methods, because its purpose is to evaluate the robustness of the proposed dynamic-window strategy under target appearance changes.

Figure 12 further illustrates the RMSE variation of different methods under the four target types.

As shown in Table 17 and Figure 12, target appearance has a clear influence on ranging accuracy. The pixel-level CannyPixel method shows the largest error in all target groups. Its RMSE values are 6.063 cm, 6.545 cm, 6.850 cm, and 9.656 cm for T1–T4, respectively. The maximum error also increases from 14.234 cm for the black matte square target to 34.745 cm for the glossy black square target. This indicates that integer-pixel edge extraction is highly sensitive to weak edge contrast, local edge discontinuity, and specular reflection. Since the monocular ranging model is inversely related to the extracted target image size, even a small error in pixel-level edge localization can be amplified into a larger distance error, especially when the target becomes smaller at longer distances.

Compared with CannyPixel, the three subpixel methods substantially reduce the ranging error under all target types. For the black matte square target T1, the RMSE values of Fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P are 1.738 cm, 1.746 cm, and 1.496 cm, respectively. The proposed method also obtains the lowest MAE, median error, IQR, and maximum error in this group. This result is consistent with the characteristics of T1: the black matte square target provides a high-contrast and weak-reflection edge, and its straight boundaries produce relatively stable one-dimensional gray-level transitions along the edge-normal direction. Under this condition, the normalized-gradient-entropy-guided dynamic window can select a suitable fitting range and reduce the influence of unnecessary background or target-side samples on the ERF fitting result.

For the black matte circular target T2, SER-CIS and ProposedDynamicERF_4P show close performance. SER-CIS achieves an RMSE of 1.576 cm, while ProposedDynamicERF_4P achieves an RMSE of 1.674 cm. Although the proposed method is slightly higher in RMSE in this group, the difference is small. This behavior is reasonable because the circular target uses edge points distributed around the full circumference for diameter estimation. The circle-fitting process can average local edge localization errors over many radial directions, which benefits both SER-CIS and the proposed method. In contrast to square targets, where the final size is determined by four fitted side lines, the circular target has stronger global averaging in the geometric fitting stage. Therefore, the advantage of the dynamic ERF window is less pronounced for T2 than for T1.
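The averaging effect of circle fitting can be illustrated with a small simulation. The sketch below uses the algebraic (Kåsa) least-squares circle fit as an assumed stand-in, since the circle-fitting algorithm is not specified in this section; the circle parameters, the noise level, and the number of edge points are hypothetical.

```python
import math
import random


def fit_circle_kasa(points):
    """Algebraic (Kasa) least-squares circle fit; returns (cx, cy, radius)."""
    n = len(points)
    if n < 3:
        raise ValueError("at least 3 points are required")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sxy = syy = sxz = syz = sz = 0.0
    for x, y in points:
        u, v = x - mx, y - my          # centered coordinates
        z = u * u + v * v
        sxx += u * u
        sxy += u * v
        syy += v * v
        sxz += u * z
        syz += v * z
        sz += z
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("points are (nearly) collinear")
    # Normal equations of the centered problem: 2*(uc*sxx + vc*sxy) = sxz, etc.
    uc = (0.5 * sxz * syy - 0.5 * syz * sxy) / det
    vc = (0.5 * syz * sxx - 0.5 * sxz * sxy) / det
    radius = math.sqrt(sz / n + uc * uc + vc * vc)
    return mx + uc, my + vc, radius


random.seed(0)
TRUE_CENTER = (320.0, 240.0)   # hypothetical circle center (pixels)
TRUE_RADIUS = 50.0             # hypothetical radius (pixels)
NOISE_SIGMA = 0.2              # hypothetical per-point radial localization noise (pixels)
N_POINTS = 360

edge_points = []
for k in range(N_POINTS):
    theta = 2.0 * math.pi * k / N_POINTS
    r = TRUE_RADIUS + random.gauss(0.0, NOISE_SIGMA)
    edge_points.append((TRUE_CENTER[0] + r * math.cos(theta),
                        TRUE_CENTER[1] + r * math.sin(theta)))

_, _, radius_px = fit_circle_kasa(edge_points)
diameter_px = 2.0 * radius_px
print(f"fitted diameter error: {abs(diameter_px - 2.0 * TRUE_RADIUS):.4f} px "
      f"(per-point noise sigma: {NOISE_SIGMA} px)")
```

With many points spread over the full circumference, the fitted diameter error is typically far below the per-point noise, so differences between subpixel methods are partly averaged out in the geometric fitting stage.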

For the gray matte square target T3, all methods show increased errors compared with T1. The RMSE of CannyPixel increases to 6.850 cm, and the RMSE values of Fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P are 2.135 cm, 1.983 cm, and 1.888 cm, respectively. The degradation is mainly caused by the lower gray-level contrast between the target and the background. A weaker edge contrast reduces the gradient magnitude around the boundary and makes the edge-transition region more vulnerable to image noise and local gray-level fluctuation. Nevertheless, ProposedDynamicERF_4P still achieves the lowest RMSE in T3. This suggests that the entropy-guided window selection can partly compensate for the instability caused by weak edge transitions by adapting the ERF fitting window to the local gradient distribution.

The most challenging case is the glossy black square target T4. In this group, the RMSE values of CannyPixel, Fixed-window ERF, SER-CIS, and ProposedDynamicERF_4P are 9.656 cm, 2.935 cm, 2.294 cm, and 2.355 cm, respectively. The error increase is mainly caused by specular reflection and non-uniform surface brightness. These effects may introduce additional local gradients inside or near the target boundary and make the edge gray-level transition deviate from an ideal monotonic ERF profile. As a result, both pixel-level edge detection and intensity-based subpixel localization become more difficult. Compared with the fixed-window ERF method, SER-CIS and ProposedDynamicERF_4P are more stable in this group. SER-CIS obtains a slightly lower RMSE than the proposed method, while the proposed method remains very close and achieves a comparable error distribution. This indicates that the proposed method is not always the lowest for every individual appearance condition, but it maintains competitive robustness under the most difficult reflective case.

Overall, the target-diversity experiment confirms that subpixel edge localization is necessary for stable monocular ranging under changes in target geometry, gray-level contrast, and surface material. The proposed method achieves the lowest RMSE for T1 and T3 and comparable performance to SER-CIS for T2 and T4. Across all target types, ProposedDynamicERF_4P provides better overall accuracy than CannyPixel and fixed-window ERF, demonstrating that normalized-gradient-entropy-guided window selection improves the adaptability of ERF fitting under non-uniform edge-transition conditions. Meanwhile, the results also show that strong specular reflection remains a challenging factor, because it can alter the local gray-level profile and reduce the reliability of edge-size extraction even for subpixel methods.
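The per-target ranking can be checked directly from the RMSE values quoted in this subsection. The sketch below finds the best method for each target type and the gap between the proposed method and that best value; the Fixed-window ERF value for T2 is not quoted in the text and is therefore left out rather than guessed.

```python
# RMSE in cm per target type, as quoted in the text.
rmse_cm = {
    "T1": {"CannyPixel": 6.063, "Fixed-window ERF": 1.738,
           "SER-CIS": 1.746, "ProposedDynamicERF_4P": 1.496},
    "T2": {"CannyPixel": 6.545,  # Fixed-window ERF value not quoted in the text
           "SER-CIS": 1.576, "ProposedDynamicERF_4P": 1.674},
    "T3": {"CannyPixel": 6.850, "Fixed-window ERF": 2.135,
           "SER-CIS": 1.983, "ProposedDynamicERF_4P": 1.888},
    "T4": {"CannyPixel": 9.656, "Fixed-window ERF": 2.935,
           "SER-CIS": 2.294, "ProposedDynamicERF_4P": 2.355},
}

# Method with the lowest RMSE for each target type.
best_method = {target: min(values, key=values.get) for target, values in rmse_cm.items()}
# Gap between the proposed method and the best method (0 when it is the best).
gap_cm = {target: values["ProposedDynamicERF_4P"] - values[best_method[target]]
          for target, values in rmse_cm.items()}

for target in rmse_cm:
    print(f"{target}: best = {best_method[target]}, proposed gap = {gap_cm[target]:.3f} cm")
```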

5. Conclusions and Future Work

5.1. Conclusions

This study proposed a normalized-gradient-entropy-guided dynamic-window ERF subpixel edge localization method to address insufficient edge localization accuracy in monocular distance measurement. The method uses an ERF model to describe blurred edge gray-level transitions and employs normalized gradient entropy to characterize the local gradient distribution. The ERF fitting window is adaptively selected from 5, 7, 9, and 11 sampling points, thereby improving localization stability under different edge conditions.
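The window-selection idea summarized above can be sketched as follows. This is an illustration under explicit assumptions and not the calibrated rule of this study: the entropy is taken here as the Shannon entropy of the normalized absolute gradients divided by its maximum, the threshold values are hypothetical, and the mapping from higher entropy (a more spread-out gradient) to a wider window is assumed.

```python
import math


def normalized_gradient_entropy(profile):
    """Shannon entropy of normalized |gradient| along a 1-D profile, scaled to [0, 1]."""
    grads = [abs(b - a) for a, b in zip(profile, profile[1:])]
    total = sum(grads)
    if len(grads) < 2 or total == 0.0:
        return 1.0  # flat profile: treated as maximally uncertain (assumption)
    entropy = sum(-(g / total) * math.log(g / total) for g in grads if g > 0.0)
    return entropy / math.log(len(grads))


def select_window(entropy, thresholds=(0.55, 0.70, 0.85)):
    """Map entropy to a 5-, 7-, 9-, or 11-point window (hypothetical thresholds)."""
    for limit, window in zip(thresholds, (5, 7, 9)):
        if entropy < limit:
            return window
    return 11


sharp_edge = [0.0] * 6 + [1.0] * 6             # ideal step: gradient in one sample
blurred_edge = [float(k) for k in range(12)]   # linear ramp: gradient spread evenly

for name, profile in (("sharp", sharp_edge), ("blurred", blurred_edge)):
    h = normalized_gradient_entropy(profile)
    print(f"{name}: entropy = {h:.3f}, window = {select_window(h)} points")
```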

Synthetic edge experiments showed that ProposedDynamicERF_4P achieved the lowest overall RMSE and MAE among the compared localization methods under mixed multi-scenario conditions, namely 0.163 pixel and 0.124 pixel, respectively. It outperformed CannyPixel, Canny–Devernay, HagaraAEF2011, SigmoidLogisticFit2023, and other comparison methods in terms of overall localization accuracy. The fixed-window ablation experiments further demonstrated that the dynamic-window strategy can alleviate the limited adaptability of fixed sampling windows under different blur levels and complex edge conditions.

In the standard black-square real ranging experiment, the proposed method achieved a mean absolute error of 0.976 cm, an RMSE of 1.475 cm, and a mean relative error of 0.504% over all test samples, all of which were the lowest among the compared methods. Additional real-image experiments under SNR-related imaging variations and target-diversity conditions further showed that the proposed method maintained competitive and generally stable performance under changes in image quality, target shape, contrast, and surface material. These results indicate that the subpixel localization advantage obtained by the proposed method on synthetic edge images can be translated into reduced ranging error in practical measurements.

From a practical application perspective, the proposed method is most suitable for calibrated monocular vision systems, especially fixed-focus or well-calibrated industrial cameras used with artificial targets of known physical size. It is recommended for short- to medium-distance monocular ranging or visual dimensional measurement tasks in which the target occupies a sufficient number of pixels, the boundary can be clearly extracted, and the local gray-level transition remains approximately monotonic. Extremely small targets, very long-distance measurements, very low contrast, severe noise, strong blur, or highly reflective surfaces may reduce the benefit of subpixel refinement because the estimated target size becomes more sensitive to edge-point fluctuations. Therefore, the method is particularly suitable for applications requiring subpixel-level edge-size extraction and centimeter-level or sub-centimeter-level ranging improvement while maintaining a low-cost and interpretable measurement system. When the camera, lens, illumination condition, image contrast, or target material changes substantially, recalibration of the entropy thresholds is recommended before deployment.

5.2. Future Work

Although the proposed method achieved favorable results in both synthetic edge localization experiments and real monocular distance-measurement experiments, several issues remain for further study. First, the dynamic-window thresholds in this study were mainly calibrated using synthetic edge images. Although the experimental results show that these thresholds are effective in the real ranging scenario considered here, edge gray-level transition characteristics may vary under different cameras, lenses, illumination conditions, and target materials. Future work will investigate adaptive threshold calibration strategies so that the window selection rule can be automatically updated according to actual imaging conditions, thereby improving the generalization ability of the method across different visual systems.

Second, although additional real-image experiments were conducted using targets with different shapes, contrasts, and surface materials, the tested targets were still regular artificial targets with relatively simple geometric boundaries. In practical industrial measurement and robotic vision scenarios, target edges may include irregular contours, partial occlusions, strong background texture, non-uniform illumination, specular reflection, and weak or non-monotonic gray-level transitions. Future work will extend the proposed method to irregular targets, curved edges, partially occluded boundaries, and complex backgrounds to evaluate its applicability in more general visual measurement tasks. It should also be noted that the ERF model assumes that the local edge transition can be approximated by a Gaussian-blurred step edge. Under non-Gaussian blur, asymmetric motion blur, strong reflection, or non-monotonic gray-level transitions, the measured edge profile may deviate from the ideal ERF shape. In such cases, the proposed dynamic-window strategy can reduce the influence of inappropriate sampling ranges, but it cannot completely eliminate the model mismatch caused by an unsuitable edge-transition model. Future extensions may incorporate skewed ERF models, asymmetric point-spread-function models, motion-kernel-based fitting, or robust two-dimensional fitting to better handle non-Gaussian edge profiles.

Third, the proposed method requires nonlinear ERF fitting. Although the dynamic-window strategy can reduce redundant sampling to some extent, the computational cost of four-parameter ERF fitting is still higher than that of some interpolation-based and moment-based methods. For applications with strict real-time requirements, future research can further optimize algorithm efficiency through improved parameter initialization, iterative optimization strategies, parallel computation, and lightweight modeling. For example, the two-parameter ERF form can be adopted in scenarios with lower accuracy requirements or more stable edge conditions to balance localization accuracy and computational efficiency.
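To illustrate where the cost of four-parameter fitting comes from, the following is a minimal, dependency-free sketch. It assumes the edge model $I(x) = a + b\,\mathrm{erf}\big((x - x_0)/(\sqrt{2}\sigma)\big)$; this parametrization, the function names, the grid steps, and the bounds on $\sigma$ are illustrative assumptions, not the formulation of Section 3.2. The solver is also a substitute: for each candidate $(x_0, \sigma)$ the linear parameters $(a, b)$ are obtained in closed form, and $(x_0, \sigma)$ is then refined by a simple compass search rather than by the authors' nonlinear optimizer.

```python
import math

SQRT2 = math.sqrt(2.0)

def erf_model(x, a, b, x0, sigma):
    """Assumed 4-parameter edge model: offset a, half-contrast b, position x0, blur sigma."""
    return a + b * math.erf((x - x0) / (SQRT2 * sigma))

def _project(xs, ys, x0, sigma):
    """For fixed (x0, sigma), solve linear least squares for (a, b); return (sse, a, b)."""
    us = [math.erf((x - x0) / (SQRT2 * sigma)) for x in xs]
    n = len(xs)
    mean_u, mean_y = sum(us) / n, sum(ys) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    if suu < 1e-12:  # degenerate: the basis is flat for this (x0, sigma)
        return float("inf"), mean_y, 0.0
    b = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys)) / suu
    a = mean_y - b * mean_u
    sse = sum((a + b * u - y) ** 2 for u, y in zip(us, ys))
    return sse, a, b

def fit_erf_4p(xs, ys, sigma_min=0.3, sigma_max=4.0):
    """Fit (a, b, x0, sigma) to samples ys at sorted positions xs (hypothetical helper)."""
    best = (float("inf"), 0.0, 0.0, xs[0], sigma_min)
    # Coarse grid over the two nonlinear parameters (step sizes are assumptions).
    n_x0 = int(round((xs[-1] - xs[0]) / 0.25))
    n_sigma = int(round((sigma_max - sigma_min) / 0.1))
    for i in range(n_x0 + 1):
        for j in range(n_sigma + 1):
            x0, sigma = xs[0] + 0.25 * i, sigma_min + 0.1 * j
            sse, a, b = _project(xs, ys, x0, sigma)
            if sse < best[0]:
                best = (sse, a, b, x0, sigma)
    # Compass (pattern) search: try axis moves, halve the step when none improves.
    step = 0.1
    for _ in range(5000):
        if step < 1e-6:
            break
        _, _, _, x0, sigma = best
        moved = False
        for dx, ds in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_sigma = min(max(sigma + ds, sigma_min), sigma_max)
            sse, a, b = _project(xs, ys, x0 + dx, cand_sigma)
            if sse < best[0]:
                best = (sse, a, b, x0 + dx, cand_sigma)
                moved = True
        if not moved:
            step *= 0.5
    _, a, b, x0, sigma = best
    return a, b, x0, sigma

# Usage on a noise-free synthetic 9-point profile (values chosen for illustration).
xs = list(range(-4, 5))
ys = [erf_model(x, 100.0, 50.0, 0.37, 1.23) for x in xs]
a, b, x0, sigma = fit_erf_4p(xs, ys)
print(f"x0={x0:.3f}, sigma={sigma:.3f}")
```

Every candidate $(x_0, \sigma)$ requires one pass over the window, so the cost scales with both the window length and the number of nonlinear iterations. This is consistent with the higher fitting times reported for the four-parameter variants in Table 5.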

Finally, the real ranging experiment in this study was mainly performed under stable indoor illumination and approximately fronto-parallel target placement. In practical applications, target pose variation, illumination changes, motion blur, and camera installation errors may affect edge-size extraction and ranging accuracy. Future work will consider target pose compensation, multi-frame fusion, camera extrinsic constraints, or depth priors to further improve the stability and robustness of the method in complex real-world scenarios.

Overall, the normalized-gradient-entropy-guided dynamic-window ERF subpixel edge localization method proposed in this study shows good accuracy and application potential for monocular distance measurement. Future research will focus on adaptive parameter calibration, extension to complex scenes, and real-time implementation.

Author Contributions

Conceptualization, Y.L. and H.Z.; methodology, Y.L.; software, Y.L.; validation, Y.L. and Y.P.; formal analysis, Y.L.; investigation, Y.L. and Y.P.; resources, H.Z.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, Y.P. and H.Z.; visualization, Y.L.; supervision, H.Z.; project administration, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank the School of Intelligent Systems Engineering, Sun Yat-sen University, for providing experimental support.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ERF	Error function
SNR	Signal-to-noise ratio
RMSE	Root mean square error
MAE	Mean absolute error
PAE	Partial area effect

References

Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
Sun, R.; Lei, T.; Chen, X.; Wang, J.; Meng, Y.; Nandi, A.K. Survey of image edge detection. Front. Signal Process. 2022, 2, 826967.
Smith, S.M.; Brady, J.M. SUSAN—A new approach to low level image processing. Int. J. Comput. Vis. 1997, 23, 45–78.
Dollár, P.; Zitnick, C.L. Structured forests for fast edge detection. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 1841–1848.
Lou, Q.; Lv, J.H.; Wen, L.H.; Xiao, J.Y.; Zhang, G.X.; Hou, X. High-precision camera calibration method based on subpixel edge detection. Acta Opt. Sin. 2022, 42, 90–98. (In Chinese)
Li, D.; Ye, Z.; Tang, J.; Wang, X. Visual measurement of valve opening area with improved subpixel edge location. Measurement 2022, 198, 111410.
Poyraz, A.G.; Kaçmaz, M.; Gürkan, H.; Dirik, A.E. Sub-pixel counting based diameter measurement algorithm for industrial machine vision. Measurement 2024, 225, 114063.
Mei, Z.; Shen, T.; Zhang, H.; Wei, C. A subpixel visual measurement method for shaft manufacturing process. Measurement 2025, 256, 118349.
Li, Y.; Hu, Y.; Wang, Q. Precision workpiece edge defect and dimensional detection based on sub-pixel anomaly prediction. Results Eng. 2025, 28, 108074.
Zeng, M.J.; Wang, C.X.; Lai, J.J.; Chen, Y.H.; Chen, Z.W.; Yan, B.G.; Ren, H.L. Review of subpixel edge detection algorithms. Opt. Precis. Eng. 2024, 32, 3513–3524. (In Chinese)
Tabatabai, A.J.; Mitchell, O.R. Edge location to subpixel values in digital imagery. IEEE Trans. Pattern Anal. Mach. Intell. 1984, PAMI-6, 188–201.
Nalwa, V.S.; Binford, T.O. On detecting edges. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 699–714.
Hagara, M.; Kulla, P. Edge detection with sub-pixel accuracy based on approximation of edge with erf function. Radioengineering 2011, 20, 516–524.
Huang, C.; Jin, W.; Xu, Q.; Liu, Z.; Xu, Z. Sub-pixel edge detection algorithm based on Canny–Zernike moment method. J. Circuits Syst. Comput. 2020, 29, 2050238.
Trujillo-Pino, A.; Krissian, K.; Alemán-Flores, M.; Santana-Cedrés, D. Accurate subpixel edge location based on partial area effect. Image Vis. Comput. 2013, 31, 72–90.
Shimizu, M.; Okutomi, M. Sub-pixel estimation error cancellation on area-based matching. Int. J. Comput. Vis. 2005, 63, 207–224.
Ye, J.; Fu, G.; Poudel, U.P. High-accuracy edge detection with blurred edge model. Image Vis. Comput. 2005, 23, 453–467.
Cantatore, A.; Cigada, A.; Sala, R.; Zappa, E. Hyperbolic tangent algorithm for periodic effect cancellation in sub-pixel resolution edge displacement measurement. Measurement 2009, 42, 1226–1232.
Sun, Q.; Hou, Y.; Tan, Q. A subpixel edge detection method based on an arctangent edge model. Optik 2016, 127, 5702–5710.
Fabijańska, A. Gaussian-based approach to subpixel detection of blurred and unsharp edges. Ann. Comput. Sci. Inf. Syst. 2014, 2, 641–650.
Yao, J.C.; Shen, J. Objective image quality assessment based on image content contrast perception. Acta Phys. Sin. 2020, 69, 148702. (In Chinese)
He, Z.H.; Wang, B.G.; Liao, Y.B. Research on ideal edge generation methods. Opt. Precis. Eng. 2002, 10, 89–93. (In Chinese)
Devernay, F. A Non-Maxima Suppression Method for Edge Detection with Sub-Pixel Accuracy; INRIA Research Report RR-2724; INRIA: Sophia Antipolis, France, 1995.
Zhang, B.; Cheng, S.; Shuang, Z.; Cai, Y. Sub-pixel edge detection based on Logistic function fitting. In Proceedings of the 2023 3rd International Conference on Electronic Information Engineering and Computer Science (EIECS), Changchun, China, 22–24 September 2023.
Yang, Y.; Liang, G.; Wang, X.; Wang, K.; Wang, C.; Wu, X. Subpixel edge localization based on converted intensity summation under stable edge region. arXiv 2025, arXiv:2502.16502.
Pu, M.; Huang, Y.; Liu, Y.; Guan, Q.; Ling, H. EDTER: Edge detection with transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 1402–1412.
Jie, J.; Guo, Y.; Wu, G.; Wu, J.; Hua, B. EdgeNAT: Transformer for efficient edge detection. arXiv 2024, arXiv:2408.10527.
Han, Y.X.; Zhang, Z.S.; Dai, M. Monocular vision measurement method for target ranging. Opt. Precis. Eng. 2011, 19, 1110–1117. (In Chinese)
Nogueira, V.V.E.; Barca, L.F.; Pimenta, T.C. A cost-effective method for automatically measuring mechanical parts using monocular machine vision. Sensors 2023, 23, 5994.
Hu, H.; Zheng, X.; Yin, J.; Wang, Y. Research on O-ring dimension measurement algorithm based on cubic spline interpolation. Appl. Sci. 2021, 11, 3716.
Ye, X.; Wang, F.; Yang, Q.; Hu, X.; Meng, J.; Song, L. Research on the precision measurement method of flat screen gap based on mobile vision. Appl. Sci. 2023, 13, 6909.
Liu, Y.; Li, T.F. Research on the improvement of Zhang’s camera calibration method. Opt. Tech. 2014, 40, 565–570. (In Chinese)

Figure 1. Ideal and measured edge gray-level distributions. (a) Ideal step-edge model. (b) Measured gray-level transition around a real checkerboard edge.

Figure 2. Example of gray-level transition fitting and the corresponding subpixel edge position.

Figure 3. Validation of normalized gradient entropy for characterizing edge blur. (a) Distribution under different blur parameters. (b) Relationship between $H_{n}$ and blur under different SNR conditions.

Figure 4. Dynamic sampling-window selection ratios guided by normalized gradient entropy under different scenarios.

Figure 5. Localization RMSE under additive perturbations of the entropy thresholds.

Figure 6. Box plots of absolute localization errors for different subpixel edge localization methods.

Figure 7. RMSE comparison of different subpixel edge localization methods under different synthetic edge scenarios.

Figure 8. Qualitative visual comparison of edge localization on real target images. The first two rows show the black matte square target, and the last two rows show the black matte circular target. For each target, the displayed results include the original ROI, the cropped local edge region, the pixel-level Canny edge fragments, the fixed-window ERF subpixel edge points, the SER-CIS subpixel edge points, and the proposed dynamic-window ERF subpixel edge points.

Figure 9. Box plots of absolute ranging errors for different methods in the real monocular distance-measurement experiment.

Figure 10. Ranging RMSE with standard deviation at different distance intervals.

Figure 11. Ranging RMSE with standard deviation under different SNR-related imaging conditions.

Figure 12. Ranging RMSE with standard deviation for different target types.

Table 1. Mapping between normalized gradient entropy and ERF fitting-window size.

Entropy Range	Edge Characteristic	Sampling Points
$H_{n} < 0.630$	Concentrated gradient; relatively sharp edge	5
$0.630 \leq H_{n} < 0.795$	Slightly blurred edge	7
$0.795 \leq H_{n} < 0.885$	Moderately blurred edge	9
$H_{n} \geq 0.885$	Dispersed gradient; relatively blurred edge	11
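The mapping in Table 1 can be sketched as a small lookup. The thresholds and window sizes are taken from the table; the entropy function is an assumption, using one common normalization (Shannon entropy of the normalized gradient magnitudes divided by $\ln N$), and the definition in Section 3.3 may differ. The function names and the handling of a flat profile are hypothetical.

```python
import math

# Thresholds and window sizes from Table 1.
TAU = (0.630, 0.795, 0.885)
WINDOWS = (5, 7, 9, 11)

def normalized_gradient_entropy(gradients):
    """Assumed definition: p_i = |g_i| / sum|g|, H_n = -sum(p_i ln p_i) / ln N."""
    mags = [abs(g) for g in gradients]
    total = sum(mags)
    n = len(mags)
    if total == 0 or n < 2:
        return 1.0  # assumption: treat a flat profile as maximally dispersed
    h = -sum((m / total) * math.log(m / total) for m in mags if m > 0)
    return h / math.log(n)

def select_window(h_n):
    """Return the number of sampling points for a given H_n, following Table 1."""
    for tau, points in zip(TAU, WINDOWS):
        if h_n < tau:
            return points
    return WINDOWS[-1]

# A concentrated gradient gives low entropy and a short window;
# a spread-out gradient gives high entropy and a long window.
sharp = [0.0, 1.0, 40.0, 1.0, 0.0]
blurred = [6.0, 9.0, 10.0, 9.0, 6.0]
print(select_window(normalized_gradient_entropy(sharp)))
print(select_window(normalized_gradient_entropy(blurred)))
```

The half-open intervals in Table 1 mean that a value exactly on a threshold selects the longer window, which the strict `<` comparison reproduces.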

Table 2. Synthetic edge image scenarios and parameter settings.

Scenario	Blur Parameter Setting	Additional Degradation	Number
Ideal	$σ_{b} = 0.5, 1.0, 1.5, 2.0, 2.5$	None	180
Slope	$σ_{b} = 0.5, 1.5, 2.5$	Background gray-level slope	108
Texture	$σ_{b} = 0.5, 1.5, 2.5$	Local texture disturbance	108
Asymmetric	$(0.6, 1.2), (0.9, 1.8), (1.25, 2.5)$	Different blur scales on two sides	108

Table 3. Dynamic window selection ratios under different scenarios.

Scenario	Edges	Mean Points	5-pt/%	7-pt/%	9-pt/%	11-pt/%
Ideal	10,800	8.8935	14.046	18.417	26.352	41.185
Slope	6480	8.6840	20.926	11.219	30.586	37.269
Texture	6480	8.7525	18.457	13.333	30.340	37.870
Asymmetric	6480	8.3744	8.3488	25.648	54.938	11.065
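As a consistency check on Table 3, the "Mean Points" column can be reproduced as the ratio-weighted average of the window sizes. The helper name below is hypothetical; the numbers are copied from the table.

```python
def mean_window_points(ratios_percent):
    """Weighted mean window size from {window_size: selection ratio in %}."""
    total = sum(ratios_percent.values())
    return sum(size * pct for size, pct in ratios_percent.items()) / total

# Selection ratios (in %) copied from Table 3.
ideal = {5: 14.046, 7: 18.417, 9: 26.352, 11: 41.185}
slope = {5: 20.926, 7: 11.219, 9: 30.586, 11: 37.269}
asymmetric = {5: 8.3488, 7: 25.648, 9: 54.938, 11: 11.065}

for name, ratios in (("Ideal", ideal), ("Slope", slope), ("Asymmetric", asymmetric)):
    print(name, round(mean_window_points(ratios), 4))
```

The computed means agree with the tabulated values of 8.8935, 8.6840, and 8.3744 to the reported precision.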

Table 4. Sensitivity analysis of entropy thresholds under additive perturbations.

$Δ$	$τ_{1}^{'}$	$τ_{2}^{'}$	$τ_{3}^{'}$	RMSE/pixel	$Δ_{RMSE}$/%
−0.04	0.590	0.755	0.845	0.13578	0.57
−0.03	0.600	0.765	0.855	0.13559	0.43
−0.02	0.610	0.775	0.865	0.13531	0.23
−0.01	0.620	0.785	0.875	0.13528	0.20
0	0.630	0.795	0.885	0.13501	0.00
0.01	0.640	0.805	0.895	0.13652	1.12
0.02	0.650	0.815	0.905	0.13887	2.86
0.03	0.660	0.825	0.915	0.14017	3.83
0.04	0.670	0.835	0.925	0.14150	4.81
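The columns of Table 4 follow from two simple operations: shifting all three thresholds by the same $Δ$, and expressing the resulting RMSE relative to the unperturbed baseline. A minimal sketch, with hypothetical helper names and values copied from the table:

```python
BASE_TAU = (0.630, 0.795, 0.885)   # unperturbed thresholds (Table 1)
BASE_RMSE = 0.13501                # RMSE at delta = 0 (Table 4)

def perturbed_thresholds(delta):
    """Shift every threshold by the same additive offset."""
    return tuple(round(t + delta, 3) for t in BASE_TAU)

def rmse_change_percent(rmse):
    """Relative RMSE change with respect to the unperturbed baseline, in %."""
    return 100.0 * (rmse - BASE_RMSE) / BASE_RMSE

print(perturbed_thresholds(-0.04), round(rmse_change_percent(0.13578), 2))
print(perturbed_thresholds(0.04), round(rmse_change_percent(0.14150), 2))
```

The table shows an asymmetric response: lowering the thresholds by 0.04 raises the RMSE by 0.57%, whereas raising them by 0.04 raises it by 4.81%.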

Table 5. Overall performance comparison between fixed-window and dynamic-window ERF fitting.

Method	Mean RMSE/pixel	Mean MAE/pixel	Mean Post-Canny Time/ms
ERF2P-fixed5	0.23975	0.19693	0.23491
ERF2P-fixed7	0.18944	0.17834	0.24523
ERF2P-fixed9	0.17491	0.16953	0.26643
ERF2P-fixed11	0.17343	0.15081	0.29790
ProposedDynamicERF_2P	0.16436	0.14114	0.24659
ERF4P-fixed5	0.28192	0.21682	0.63502
ERF4P-fixed7	0.21905	0.16458	0.64782
ERF4P-fixed9	0.16146	0.15328	0.67700
ERF4P-fixed11	0.15604	0.13414	0.68958
ProposedDynamicERF_4P	0.14646	0.12358	0.66690

Table 6. Statistical comparison of localization errors for different subpixel edge localization methods.

Method	MAE/pixel	RMSE/pixel	SD/pixel	Median/pixel	IQR/pixel	Max/pixel	Time/ms
CannyPixel	0.32858	0.41897	0.25995	0.30642	0.48076	1.41420	0.00450
Canny–Devernay	0.19505	0.30547	0.23509	0.12874	0.18170	2.13770	0.19557
Canny–Zernike2020	0.16494	0.20864	0.12778	0.13101	0.18125	1.13960	0.18648
GLM1984	0.17533	0.22587	0.14239	0.12977	0.19969	1.30640	0.17145
HagaraAEF2011	0.13415	0.17334	0.10978	0.10617	0.13074	1.21370	1.35190
PAE2013	0.13596	0.17720	0.11365	0.10870	0.12574	1.20500	0.17523
GaussianFit	0.19841	0.27450	0.18970	0.13713	0.25127	1.43460	0.63207
ArctanFit2016	0.12935	0.17044	0.11099	0.10077	0.12306	1.24230	0.96565
SigmoidLogisticFit2023	0.13162	0.17141	0.10982	0.10393	0.12841	1.21400	0.98061
SER-CIS	0.17341	0.22030	0.13587	0.13307	0.20508	0.97289	1.38510
ProposedDynamicERF_2P	0.14114	0.18067	0.11280	0.11266	0.13888	1.26520	0.49560
ProposedDynamicERF_4P	0.12368	0.16312	0.10635	0.09684	0.10670	1.21370	1.31280
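The statistics in Table 6 can be computed from a list of absolute localization errors as sketched below. The exact estimator choices are assumptions: the sample standard deviation and the inclusive quartile method are used here, and the paper may use other conventions. The tabulated values are at least consistent with SD being taken over the absolute errors, since $\mathrm{RMSE}^2 \approx \mathrm{MAE}^2 + \mathrm{SD}^2$ holds for the rows of Table 6 (for example, $0.12368^2 + 0.10635^2 \approx 0.16312^2$).

```python
import math
import statistics

def error_summary(abs_errors):
    """Summary statistics of absolute errors (estimator conventions are assumptions)."""
    errors = list(abs_errors)
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "MAE": statistics.fmean(errors),
        "RMSE": math.sqrt(statistics.fmean([e * e for e in errors])),
        "SD": statistics.stdev(errors),      # sample standard deviation
        "Median": statistics.median(errors),
        "IQR": q3 - q1,
        "Max": max(errors),
    }

# Illustrative values only, not data from the paper.
summary = error_summary([0.1, 0.2, 0.3, 0.4])
for key, value in summary.items():
    print(f"{key}: {value:.5f}")
```

Reporting the median and IQR alongside MAE and RMSE helps separate typical behavior from the influence of a few large outliers, which the Max column shows are present for every method.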

Table 7. Wilcoxon signed-rank test results for paired sample-level MAE between ProposedDynamicERF_4P and baseline methods.

Baseline Method	Pairs	Baseline Median/pixel	Median Improvement/pixel	p-Value
CannyPixel	504	0.32998	0.21332	$2.82 \times 10^{- 84}$
Canny–Devernay	504	0.12515	0.02821	$9.00 \times 10^{- 45}$
Canny–Zernike2020	504	0.15748	0.03559	$7.52 \times 10^{- 52}$
GLM1984	504	0.14688	0.02476	$3.51 \times 10^{- 63}$
HagaraAEF2011	504	0.10439	0.00000	$3.91 \times 10^{- 7}$
PAE2013	504	0.11918	0.01063	$2.04 \times 10^{- 41}$
GaussianFit	504	0.20187	0.06449	$5.26 \times 10^{- 69}$
ArctanFit2016	504	0.10599	0.00290	$9.80 \times 10^{- 10}$
SigmoidLogisticFit2023	504	0.10418	0.00031	$4.96 \times 10^{- 5}$
SER-CIS	504	0.14563	0.03118	$9.49 \times 10^{- 50}$
ProposedDynamicERF_2P	504	0.11588	0.00774	$1.40 \times 10^{- 32}$

Table 8. RMSE comparison of different subpixel edge localization methods under four scenarios.

Method	Ideal/pixel	Slope/pixel	Texture/pixel	Asymmetric/pixel
CannyPixel	0.41463	0.41472	0.42200	0.42728
Canny–Devernay	0.29180	0.30350	0.31226	0.32238
Canny–Zernike2020	0.18373	0.19160	0.21586	0.25214
GLM1984	0.15558	0.16200	0.21970	0.35103
HagaraAEF2011	0.13737	0.14329	0.18790	0.23007
PAE2013	0.15641	0.16743	0.20223	0.19192
GaussianFit	0.27846	0.27340	0.29372	0.24776
ArctanFit2016	0.14467	0.15171	0.19278	0.20127
SigmoidLogisticFit2023	0.13913	0.14428	0.18835	0.22038
SER-CIS	0.15197	0.15795	0.21976	0.33875
ProposedDynamicERF_2P	0.13958	0.14616	0.19260	0.24781
ProposedDynamicERF_4P	0.13923	0.14504	0.18656	0.18978

Table 9. Real-image acquisition settings for robustness evaluation.

Experiment	Variable	Groups	Distance Range	Images per Group	Fixed Conditions
SNR-related robustness	Illumination intensity and camera gain	G1–G4	50–300 cm, five intervals	50	Camera, focus, exposure, target type, and distance model
Target diversity	Target shape, contrast, and material	T1–T4	50–300 cm, five intervals	50	Camera, focus, exposure, illumination/gain, and distance model

Table 10. Distance sampling scheme for each additional experimental group.

Distance Interval/cm	Number of Images	Purpose
50–100	10	Near-distance evaluation
100–150	10	Near-to-medium-distance evaluation
150–200	10	Medium-distance evaluation
200–250	10	Medium-to-long-distance evaluation
250–300	10	Long-distance evaluation

Table 11. Definitions of the SNR-related imaging-condition groups.

Group	Illumination Condition	Gain	Exposure	Expected Image Quality
G1	Strong illumination	20	Fixed	Highest image quality
G2	Normal illumination	40	Fixed	Medium-high image quality
G3	Weak illumination	60	Fixed	Medium image quality
G4	Dark illumination	80	Fixed	Low image quality

Table 12. Definitions of the target-diversity groups.

Group	Target Shape	Physical Size	Target Material/Surface
T1	Square	Side length = 13 cm	Black matte surface
T2	Circle	Diameter = 13 cm	Black matte surface
T3	Square	Side length = 13 cm	Gray matte surface
T4	Square	Side length = 13 cm	Glossy black surface

Table 13. Statistical comparison of ranging errors in the real monocular distance-measurement experiment.

Method	MAE/cm	RMSE/cm	SD/cm	Median/cm	IQR/cm	Max/cm	Mean Relative Error/%
CannyPixel	4.694	7.500	5.916	2.225	6.866	20.911	2.291
Canny–Devernay	3.420	5.678	4.583	1.219	4.948	20.186	1.654
HagaraAEF2011	1.034	1.523	1.131	0.405	1.402	4.615	0.545
SigmoidLogisticFit2023	1.056	1.545	1.140	0.521	1.295	4.822	0.566
SER-CIS	1.267	1.833	1.340	0.747	2.193	5.078	0.669
ProposedDynamicERF_4P	0.976	1.475	1.119	0.394	1.434	4.291	0.504

Table 14. Mean absolute distance-measurement error with standard deviation at different distance intervals. Values are reported as MAE ± SD.

Reference Distance/cm	CannyPixel/cm	Canny–Devernay/cm	HagaraAEF2011/cm	SigmoidLogisticFit2023/cm	SER-CIS/cm	ProposedDynamicERF_4P/cm
≤100	0.219 ± 0.099	0.139 ± 0.079	0.152 ± 0.118	0.170 ± 0.121	0.156 ± 0.067	0.132 ± 0.118
100–150	1.398 ± 0.660	0.794 ± 0.508	0.321 ± 0.230	0.395 ± 0.252	0.476 ± 0.332	0.249 ± 0.209
150–200	5.101 ± 2.091	4.375 ± 1.531	1.485 ± 0.358	1.458 ± 0.342	1.868 ± 0.641	1.302 ± 0.270
200–250	12.201 ± 6.609	7.324 ± 3.531	2.050 ± 0.494	1.917 ± 0.398	2.616 ± 1.398	1.964 ± 0.305
250–300	15.518 ± 3.313	11.969 ± 5.191	3.078 ± 1.135	3.160 ± 1.305	3.476 ± 1.048	3.190 ± 0.829

Table 15. Mean relative error under different distance intervals.

Reference Distance/cm	CannyPixel/%	Canny–Devernay/%	HagaraAEF2011/%	SigmoidLogisticFit2023/%	SER-CIS/%	ProposedDynamicERF_4P/%
≤100	0.293	0.203	0.205	0.225	0.214	0.182
100–150	1.154	0.645	0.266	0.327	0.386	0.216
150–200	2.812	2.426	0.838	0.822	1.040	0.735
200–250	5.440	3.265	0.916	0.858	1.156	0.882
250–300	5.916	4.604	1.177	1.209	1.331	1.216

Table 16. Ranging error statistics under different SNR-related imaging conditions.

SNR-Related Group	Method	MAE/cm	RMSE/cm	SD/cm	Median/cm	IQR/cm	Max/cm
G1	CannyPixel	4.567	6.058	4.084	3.606	4.221	18.171
G1	Fixed-window ERF	1.232	1.655	1.134	1.121	1.371	4.736
G1	SER-CIS	1.227	1.760	1.295	0.889	1.244	5.824
G1	ProposedDynamicERF_4P	0.964	1.354	0.975	0.733	1.016	4.217
G2	CannyPixel	4.804	6.389	4.321	3.786	4.432	19.268
G2	Fixed-window ERF	1.413	1.884	1.280	1.294	1.583	5.283
G2	SER-CIS	1.331	1.887	1.373	0.973	1.361	6.137
G2	ProposedDynamicERF_4P	1.098	1.514	1.070	0.847	1.173	4.548
G3	CannyPixel	5.346	7.256	5.034	4.147	4.854	22.782
G3	Fixed-window ERF	1.608	2.153	1.468	1.469	1.797	6.094
G3	SER-CIS	1.481	2.059	1.467	1.100	1.538	6.472
G3	ProposedDynamicERF_4P	1.215	1.689	1.204	0.931	1.290	5.163
G4	CannyPixel	6.262	8.477	5.862	4.868	5.698	26.483
G4	Fixed-window ERF	2.175	2.854	1.896	2.024	2.476	7.581
G4	SER-CIS	1.587	2.205	1.571	1.178	1.648	6.926
G4	ProposedDynamicERF_4P	1.544	2.166	1.559	1.174	1.626	6.743

Table 17. Ranging error statistics for different target types.

Target Group	Method	MAE/cm	RMSE/cm	SD/cm	Median/cm	IQR/cm	Max/cm
T1	CannyPixel	4.757	6.063	3.845	3.700	4.350	14.234
T1	Fixed-window ERF	1.273	1.738	1.210	1.212	1.424	4.823
T1	SER-CIS	1.179	1.746	1.317	0.956	1.345	5.234
T1	ProposedDynamicERF_4P	1.018	1.496	1.121	0.815	1.125	4.456
T2	CannyPixel	4.830	6.545	4.515	3.842	4.376	21.110
T2	Fixed-window ERF	1.438	1.961	1.363	1.425	1.558	5.823
T2	SER-CIS	1.055	1.576	1.197	0.931	1.263	5.134
T2	ProposedDynamicERF_4P	1.130	1.674	1.262	1.047	1.370	5.348
T3	CannyPixel	4.860	6.850	4.936	3.945	4.438	22.936
T3	Fixed-window ERF	1.590	2.135	1.457	1.526	1.731	6.345
T3	SER-CIS	1.396	1.983	1.440	1.152	1.385	5.788
T3	ProposedDynamicERF_4P	1.343	1.888	1.357	1.150	1.250	5.678
T4	CannyPixel	6.416	9.656	7.379	4.685	4.937	34.745
T4	Fixed-window ERF	2.202	2.935	1.985	2.131	1.792	9.456
T4	SER-CIS	1.551	2.294	1.728	1.436	1.869	7.567
T4	ProposedDynamicERF_4P	1.594	2.355	1.772	1.470	1.797	7.899

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

