3. Statistical Analysis of Measurement Data
To assess whether the measurements of individual gauge blocks obtained at different magnifications differ significantly, a statistical analysis was performed. The first step consisted of evaluating whether the datasets corresponding to each magnification follow a normal distribution. The normality assessment was performed using the Lilliefors test, which is a modification of the Kolmogorov–Smirnov test suitable when the population mean and variance are unknown [1]. The null hypothesis assumes that the data follow a normal distribution; if the resulting p-value exceeds 0.05, the null hypothesis is not rejected.
Subsequently, the homogeneity of variances among groups (magnifications) was examined using Levene’s test [56]. This test evaluates whether the variances are statistically equal across the compared groups. If the p-value is greater than 0.05, the null hypothesis of equal variances is not rejected and the assumption is retained.
Depending on the outcomes of both preliminary tests, one of the following was selected for further analysis:
One-way ANOVA, if all datasets were normally distributed and variances were homogeneous, or
Kruskal–Wallis test, if at least one dataset violated the assumption of normality or variance homogeneity.
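The selection rule above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the MATLAB script used in the study; the function name `choose_test` and the example p-values are hypothetical and are not taken from Table 4.

```python
def choose_test(lilliefors_p, levene_p, alpha=0.05):
    """Pick the group-comparison test from the preliminary p-values.

    lilliefors_p: one normality p-value per magnification group.
    levene_p:     single p-value of the variance-homogeneity test.
    """
    all_normal = all(p > alpha for p in lilliefors_p)
    equal_variances = levene_p > alpha
    if all_normal and equal_variances:
        return "one-way ANOVA"
    # Any violated assumption falls back to the rank-based test.
    return "Kruskal-Wallis"


# Hypothetical p-values for five magnifications (illustrative only).
print(choose_test([0.21, 0.34, 0.08, 0.50, 0.12], levene_p=0.40))
print(choose_test([0.21, 0.001, 0.08, 0.50, 0.12], levene_p=0.40))
```

A single group failing the normality check, or a failed Levene test, is enough to select the non-parametric branch.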
The results of the normality and homogeneity tests are summarized in Table 4. For each gauge block, the p-values of the Lilliefors test are presented for all magnifications, together with the p-value of Levene’s test and the final decision on the appropriate statistical test. For p-values below the reporting resolution of the statistical software, results are indicated using the threshold notation p < 0.001.
The p-values in Table 4 belong to the Lilliefors normality test and the Levene test for homogeneity of variances. Although each group contains the same number of repeated measurements (n = 50), the p-values differ because the test statistics are sensitive to the underlying distributional characteristics of each magnification setting. For the Lilliefors test, variation in p-values arises from differences in distributional shape, particularly skewness, kurtosis, and the discrete nature of pixel-based measurements, which often deviate slightly from Gaussian behavior in optical microscopy. For the Levene test, p-values reflect differences in within-group dispersion; magnification-dependent variance changes are expected due to focus stability, edge-detection sensitivity, and optical resolution effects, leading naturally to a non-uniform variance structure across magnifications. Consequently, even with identical sample sizes, the distributional and dispersion characteristics produce varying p-values for the Lilliefors and Levene tests. These effects are well documented in metrological data, where measurement distributions may deviate from strict normality and display heteroscedasticity across measurement conditions [52,53,54,55,56,57,58,59,60].
Although data transformation techniques such as logarithmic or Box–Cox transformations were considered, they did not consistently restore normality or reduce heteroscedasticity among the magnification groups. Similar observations have been reported in optical and dimensional metrology studies, where transformation often fails to address distributional deviations caused by small-scale edge-detection variability and magnification-dependent contrast behavior [61,62]. Furthermore, applying a transformed scale would reduce the interpretability of the results in a practical measurement context. For these reasons, and given the robustness of rank-based non-parametric methods, the Kruskal–Wallis test was selected as the most appropriate approach.
Since the preliminary analysis indicated that none of the datasets satisfied the assumptions of normality and homogeneity of variances, all subsequent evaluations were performed using the non-parametric Kruskal–Wallis test. This test is a rank-based equivalent of one-way ANOVA and is suitable for comparing more than two independent groups without assuming normal data distribution [63].
The Kruskal–Wallis test was applied separately to the measurements of each gauge block. The null hypothesis states that all magnification groups originate from the same population, i.e., their median values are statistically equal. If the resulting p-value is lower than 0.05, the null hypothesis is rejected, indicating that at least one magnification level differs significantly from the others. Representative results of the Kruskal–Wallis analysis are summarized in Table 5.
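To make the test statistic concrete, the following is a minimal pure-Python sketch of the Kruskal–Wallis H statistic with the usual tie correction. It is not the MATLAB routine used in the study. The helper names and the sample readings are hypothetical, and the closed-form chi-square tail is an assumption that holds only for an even number of degrees of freedom (five magnification groups give df = 4).

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df), with H corrected for ties."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction, len(groups) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df % 2:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Hypothetical readings in mm for five magnifications (illustrative only).
groups = [
    [0.9957, 0.9961, 0.9950, 0.9955],
    [0.9993, 0.9990, 0.9996, 0.9989],
    [0.9972, 0.9975, 0.9969, 0.9971],
    [0.9999, 1.0001, 0.9998, 0.9997],
    [0.9959, 0.9962, 0.9956, 0.9960],
]
h_stat, df = kruskal_wallis(groups)
print(h_stat, df, chi2_sf_even_df(h_stat, df))
```

The p-value here relies on the chi-square approximation of H, which is the standard large-sample approach; for very small groups an exact permutation distribution would be more appropriate.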
For each gauge block, the MATLAB R2024b (The MathWorks, Inc., Natick, MA, USA) script generated:
the mean and median of the measured values at each magnification,
a boxplot illustrating the data distribution,
and the overall p-value of the Kruskal–Wallis test.
When statistically significant differences were found (p < 0.05), a post hoc comparison using MATLAB’s multcompare() function was automatically performed to identify which magnifications differed from each other.
For all measured gauge blocks except the 1.3 mm block (p = 0.1221), the p-values were below 0.05, indicating that the results obtained at different magnifications differ significantly. This suggests that the optical magnification has a measurable influence on the detected values, likely due to calibration precision, pixel resolution, and microscope optics.
In addition to numerical results, graphical outputs illustrate the data distribution across magnifications using boxplots. These visualizations highlight the variation trends and confirm the statistical findings.
The boxplot is a graphical representation of data distribution based on five key statistical descriptors: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The central box represents the interquartile range (IQR), which contains the middle 50% of all measured values. The horizontal line inside the box indicates the median (Q2), showing the central tendency of the dataset. The “whiskers” extend to the most extreme data points that are not considered outliers, typically up to 1.5 × IQR from the box boundaries.
Red crosses represent outliers, i.e., data points lying outside the 1.5 × IQR. These points may result from random measurement variability or systematic effects. The height and position of each box show both the dispersion and the central value of the data for a given magnification.
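The boxplot descriptors described above can be reproduced with a short Python sketch. It assumes the common 1.5 × IQR whisker rule and uses `statistics.quantiles` with the inclusive method. Quartile interpolation conventions differ between tools, so the numbers may deviate slightly from those drawn by MATLAB. The function name and the sample values are hypothetical.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Return quartiles, IQR, whisker ends, and outliers for one group."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme points that are still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical data with one obvious outlier (illustrative only).
print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Points returned in `outliers` correspond to the red crosses in the figures; everything else lies between the whisker ends.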
Comparing boxes across magnifications provides insight into how the measurement results shift or spread with changing optical magnification. Differences in median positions indicate systematic shifts, while differences in box heights (IQR) or whisker lengths suggest changes in measurement repeatability or precision.
Figure 5 illustrates the distribution of 50 measurements of the 1.0 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded p < 0.0001, indicating statistically significant differences between the magnifications.
The median measured values for each magnification are: 1× = 0.9957 mm, 2× = 0.9993 mm, 3× = 0.9972 mm, 4× = 0.9999 mm, and 5× = 0.9959 mm (see the table in the figure). The data show that the measured value depends slightly on the applied magnification. All medians lie below the nominal length of 1.0 mm: the 1×, 3×, and 5× magnifications underestimate it the most (by 0.0028–0.0043 mm), whereas 2× and 4× yield medians within 0.0007 mm of the nominal value, with 4× being the closest.
The spread of the boxes also differs among magnifications, suggesting that measurement repeatability varies with magnification. The smallest spread (narrow box) occurs at 4× magnification, indicating the most consistent results, while larger spreads at 1× and 3× magnifications imply higher variability, possibly due to lower effective resolution or calibration uncertainty.
Outliers (marked with red crosses) were observed for magnifications 2×, 3×, and 4×, but their occurrence is infrequent and does not significantly affect the median values.
Overall, the boxplot confirms that magnification has a statistically significant effect on the measured length. The optimal precision and consistency for the 1.0 mm gauge block were achieved at 4× magnification.
Figure 6 presents the distribution of 50 measurements of the 1.1 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded a p-value below the 0.05 significance level, confirming statistically significant differences between the measurement groups corresponding to different magnifications.
The median measured values for each magnification are: 1× = 1.1023 mm, 2× = 1.1019 mm, 3× = 1.0989 mm, 4× = 1.0992 mm, and 5× = 1.1001 mm (see the table in the figure). The results show that the measured values slightly depend on the applied magnification, with the smallest differences observed between 3×, 4×, and 5×. The 1× magnification produces the highest median value, indicating a minor overestimation of the nominal length, while higher magnifications yield values closer to the nominal size.
The variation in box height across the magnifications suggests that the repeatability of measurements improves with increasing magnification. The narrowest boxes are observed at 4× and 5× magnification, reflecting better measurement stability and consistency. In contrast, a slightly larger spread at 1× magnification indicates higher measurement variability, likely due to reduced resolution and alignment precision.
Outliers (marked by red crosses) appear across all magnifications, most notably at 1× and 5×, but their influence on the median values remains negligible.
Overall, the boxplot confirms a statistically significant influence of magnification on the measured length of the 1.1 mm gauge block. The most stable and precise measurement results were achieved at 4× and 5× magnification, where the distribution is narrowest and closest to the nominal value.
Figure 7 shows the distribution of 50 measurements of the 1.2 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded a p-value below the 0.05 significance level, confirming statistically significant differences among the measurement groups.
The median measured values for each magnification are: 1× = 1.1987 mm, 2× = 1.2015 mm, 3× = 1.2006 mm, 4× = 1.1999 mm, and 5× = 1.1987 mm (see the table in the figure). The results reveal that the measured values fluctuate slightly with magnification. The measurements at 2× and 3× magnification exhibit slightly higher medians than the nominal value, while those at 1× and 5× are closer to or marginally below it.
The box heights (interquartile ranges) vary noticeably between magnifications, indicating differences in repeatability. The smallest spread is observed at 3× and 4× magnification, suggesting the best consistency and measurement stability at these settings. Conversely, the 1× and 5× magnifications show a wider spread, implying greater variability likely associated with lower resolution or focus precision.
Outliers (marked with red crosses) appear at all magnifications except 1×, most frequently at 2× and 4×, but they do not significantly affect the median values.
Overall, the boxplot demonstrates that magnification has a statistically significant effect on the measured length of the 1.2 mm gauge block. The most precise and stable results were obtained at 3× and 4× magnification, where both the variability and deviation from the nominal length are minimal.
Figure 8 shows the distribution of 50 measurements of the 1.3 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded p = 0.1221, indicating no statistically significant differences among the measurement groups.
The median measured values for each magnification are: 1× = 1.3011 mm, 2× = 1.2999 mm, 3× = 1.3003 mm, 4× = 1.3007 mm, and 5× = 1.3002 mm (see the table in the figure). The results show that all median values are very close to the nominal length of 1.3 mm, with deviations within ±0.002 mm.
The interquartile ranges are comparable across magnifications, suggesting similar repeatability and measurement stability. Slightly larger spread is observed at 1× and 2× magnifications, while 3×–5× exhibit more compact boxes, indicating marginally better consistency at higher magnifications.
Outliers (marked with red crosses) appear at all magnifications, most frequently at 1× and 2×, but they do not substantially affect the overall distribution or median values.
Overall, the boxplot demonstrates that magnification does not have a statistically significant effect on the measured length of the 1.3 mm gauge block. The measurements remain highly consistent across all magnifications, confirming stable measurement performance and accuracy for this scale.
Figure 9 presents the distribution of 50 measurements of the 1.4 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded p < 0.0001, confirming statistically significant differences among the measurement groups.
The median measured values for each magnification are: 1× = 1.3975 mm, 2× = 1.4008 mm, 3× = 1.4000 mm, 4× = 1.3999 mm, and 5× = 1.3997 mm (see the table in the figure). The medians at 2×–5× differ only slightly: 2× lies marginally above the nominal length, 3× matches it, and 4× and 5× lie within 0.0003 mm below it. The 1× median shows the largest deviation, 0.0025 mm below the nominal length.
The interquartile ranges are smallest at 3× and 4× magnification, indicating the best repeatability and measurement stability in these settings. The 1× group shows the widest spread, suggesting lower precision, likely due to reduced resolution.
Outliers (red crosses) are present across all magnifications, most frequently at 1× and 2×, but their effect on central tendency is minor.
Overall, the boxplot demonstrates that magnification significantly influences the measured length of the 1.4 mm gauge block. The most stable and accurate measurements were obtained at 3× and 4× magnification, where both variability and deviation from the nominal value are minimal.
Figure 10 presents the distribution of 50 measurements of the 1.5 mm gauge block obtained at magnifications of 1×, 2×, 3×, 4×, and 5×. The Kruskal–Wallis test yielded p = 0.0208, indicating statistically significant differences among the measurement groups.
The median measured values for each magnification are: 1× = 1.4992 mm, 2× = 1.4993 mm, 3× = 1.4997 mm, 4× = 1.5007 mm, and 5× = 1.5003 mm (see the table in the figure). The results show a gradual increase in measured values with magnification, with the 4× and 5× groups slightly exceeding the nominal length of 1.5 mm.
The spread of measurements (interquartile range) is relatively consistent across magnifications, though 1× and 2× exhibit slightly higher variability compared to the higher magnifications. The smallest spread, indicating the most repeatable results, is observed at 4× magnification.
Only a few outliers are present, primarily at 1×, but they do not meaningfully affect the overall distribution.
A clear trend can be observed across all magnification levels. The narrowest interquartile ranges were consistently obtained at the 4× magnification, indicating the highest repeatability of the measurements. In contrast, the highest magnification setting exhibited slightly wider distributions, which is consistent with increased sensitivity to local optical artifacts, focus stability, and contrast variability at high zoom levels [44]. These results support the conclusion that moderate magnification levels often provide the most favorable balance between image resolution and metrological stability in digital optical microscopy.
Overall, the boxplot demonstrates a weak but statistically significant influence of magnification on the measured length of the 1.5 mm gauge block. The most consistent results were achieved at 4× magnification, where variability is minimal, although the medians at 3× and 5× (1.4997 mm and 1.5003 mm) lie closest to the nominal dimension.
After performing the Kruskal–Wallis test, all measurement sets with statistically significant results (p < 0.05) were subjected to post hoc multiple comparison analysis. This procedure identifies which specific magnification pairs differ significantly.
Each post hoc CSV file contains six columns corresponding to:
Group 1 index,
Group 2 index,
Lower bound of confidence interval,
Estimated difference between the two groups (for Kruskal–Wallis statistics, multcompare() reports a difference in mean ranks),
Upper bound of confidence interval,
p-value for the pairwise comparison.
Only the p-values (column 6) were used for further visualization. These values indicate the statistical significance of differences between pairs of magnifications (1×–5×). For easier interpretation, the p-values were arranged into a symmetric matrix, where the cell (i, j) represents the significance level between magnifications i× and j×.
To improve the visual contrast and highlight strong significance levels, the matrix values were transformed using the logarithmic scale −log10(p). A higher color intensity thus corresponds to a stronger statistical difference (i.e., smaller p-value).
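The construction of the significance matrix can be illustrated with a short Python sketch. The p-values below are those reported in this section for the 1.5 mm gauge block; the function name, the dictionary layout, the NaN diagonal, and the small floor that protects against log10(0) are assumptions made for this example and are not taken from the original script.

```python
import math


def significance_matrix(pairwise_p, n_groups, floor=1e-300):
    """Build symmetric matrices of p and -log10(p).

    pairwise_p maps 1-based (i, j) magnification pairs to p-values.
    """
    nan = float("nan")
    p_mat = [[nan] * n_groups for _ in range(n_groups)]
    log_mat = [[nan] * n_groups for _ in range(n_groups)]
    for (i, j), p in pairwise_p.items():
        value = -math.log10(max(p, floor))  # floor guards against log10(0)
        for a, b in ((i - 1, j - 1), (j - 1, i - 1)):
            p_mat[a][b] = p
            log_mat[a][b] = value
    return p_mat, log_mat


# Pairwise p-values quoted in the text for the 1.5 mm gauge block.
p_15 = {
    (1, 2): 0.9064, (1, 3): 0.8531, (1, 4): 0.0109, (1, 5): 0.4695,
    (2, 3): 0.9999, (2, 4): 0.1277, (2, 5): 0.9395,
    (3, 4): 0.1704, (3, 5): 0.9688, (4, 5): 0.5059,
}
p_mat, log_mat = significance_matrix(p_15, n_groups=5)
for row in log_mat:
    print(["{:.2f}".format(v) for v in row])
```

On this scale the conventional threshold p = 0.05 corresponds to about 1.30, so cells above that value mark significant pairs.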
The resulting heatmaps provide a compact overview of which magnification pairs differ significantly. The color scale reflects the −log10(p) values, while each cell is annotated with the corresponding p-value for direct readability. Along the axes, magnifications (1× to 5×) are shown both horizontally and vertically, forming a symmetric comparison grid.
These heatmaps serve as a visual summary of the post hoc analysis, enabling immediate identification of specific magnification pairs with statistically significant differences in measured length values.
The significance values displayed in the post hoc maps vary across magnification pairs because the underlying pairwise comparisons differ in both effect size and within-group variability. Although each magnification group contains the same number of repeated measurements (n = 50), the magnitude of the median difference between magnifications is not constant, nor is the associated measurement dispersion. Pairwise comparisons with larger median differences and lower within-group variability yield smaller p-values, whereas comparisons with small effect sizes or higher dispersion result in weaker statistical significance. In addition, pixel-based optical measurements exhibit discrete sampling effects and occasional deviations from ideal distributional assumptions, which further influence test sensitivity. The significance maps visualize p-values on a −log10(p) scale; consequently, relatively small numerical differences in p-values may appear amplified in the graphical representation. These properties are inherent to rank-based post hoc testing and do not indicate inconsistency in the measurement process but rather reflect genuine differences in statistical separability between magnification levels [63,64].
Figure 11 shows the post hoc significance map for the 1.0 mm gauge block. The matrix visualizes the results of pairwise Kruskal–Wallis comparisons between measurements taken at different magnifications (1×–5×). Each cell corresponds to a comparison between two magnifications, and the color scale represents the negative logarithm of the p-value, −log10(p). Darker shades indicate statistically non-significant differences, while brighter regions correspond to highly significant differences (p < 0.05).
From the heatmap, it is evident that several pairs exhibit strong statistical differences. In particular:
The pairs 1×–3×, 1×–4×, 2×–5×, and 3×–4× show p < 0.0001, indicating highly significant deviations between these magnifications.
The pair 2×–3× (p = 0.0019) also shows a significant difference.
Conversely, the comparisons 1×–2×, 2×–4×, 3×–5×, and 1×–5× have p > 0.05, suggesting that measurements at these magnifications are statistically similar.
These results imply that the measurement accuracy and repeatability of the microscope depend on the selected magnification. The largest discrepancies appear mainly between the medium (3×, 4×) and low magnifications (1×, 2×), which may reflect optical calibration effects or systematic scaling deviations in the image processing at different zoom levels.
Overall, the 1.0 mm gauge block results confirm that magnification has a statistically significant influence on the measured values for most pairwise comparisons. The visualization through the post hoc heatmap effectively highlights which magnification levels contribute most to the observed variability, supporting subsequent correction or calibration steps in the measurement procedure.
Figure 12 shows the post hoc significance map for the 1.1 mm gauge block. From the heatmap, it is evident that the most pronounced differences occur between the combinations 2×–3× (p = 0.0002) and 2×–4× (p = 0.0079), indicating substantial deviations when transitioning between lower and mid-level magnifications. Noticeable differences are also observed between 1×–3× (p = 0.0477) and 3×–5× (p = 0.0202), which suggest that the measurement values change significantly at these magnification levels. In contrast, comparisons such as 1×–2× (p = 0.5591), 1×–4× (p = 0.3619), 1×–5× (p = 0.9984), 2×–5× (p = 0.7438), 3×–4× (p = 0.8875), and 4×–5× (p = 0.2148) do not exhibit statistically significant differences, indicating consistent measurements across these magnifications. These findings suggest that for the 1.1 mm gauge block, the largest discrepancies occur mainly between low and mid magnifications, while measurements remain stable at the lowest and highest magnification levels.
Figure 13 shows the post hoc significance map for the 1.2 mm gauge block. The results reveal that the most prominent differences occur between 1× and the higher magnifications, particularly 1×–2×, 1×–3×, and 1×–4×, all with p-values below 0.0001. These results indicate that the measurements obtained at the lowest magnification level differ substantially from those at medium and higher magnifications. A notable significant difference is also present between 2× and 5× (p = 0.0189), suggesting a deviation between these magnification levels.
In contrast, several comparisons do not display statistically significant differences, such as 2×–3× (p = 0.7723), 2×–4× (p = 0.8482), 3×–4× (p = 0.9999), 3×–5× (p = 0.3203), and 4×–5× (p = 0.2459), indicating consistent measurement behavior across these magnification settings. The comparison between 1× and 5× (p = 0.0888) is close to the significance threshold but remains statistically non-significant.
Overall, the heatmap demonstrates that for the 1.2 mm gauge block, the largest discrepancies occur when comparing the lowest magnification (1×) with higher magnifications, while measurements among intermediate and higher magnifications remain relatively stable. This confirms that magnification continues to influence the measurement outcome, particularly at the transition from 1× to higher levels, and highlights the importance of calibration or correction procedures when using low magnification.
Figure 14 presents the post hoc significance map for the 1.4 mm gauge block. The most notable differences are observed when comparing the 1× magnification with higher magnifications. Specifically, the pairs 1×–2× and 1×–4× show p-values below 0.0001, and the pair 1×–3× also reaches statistical significance with p = 0.0080. A significant difference is further visible between 1× and 5× (p = 0.0166), indicating that measurements taken at the lowest magnification consistently deviate from those at medium and high magnification settings.
In contrast, comparisons among higher magnifications show no statistically significant differences. Examples include 2×–3× (p = 0.1206), 2×–4× (p = 0.9753), 2×–5× (p = 0.0700), 3×–4× (p = 0.3841), 3×–5× (p = 0.9995), and 4×–5× (p = 0.2644). These values indicate stable and consistent measurement behavior from 2× upwards. The results therefore suggest that the primary source of variability is associated with the transition from 1× to higher magnifications, while the system performs uniformly at magnifications of 2× and above.
Overall, the 1.4 mm gauge block results confirm that magnification significantly influences the measurements, particularly when comparing the lowest magnification to the rest. The heatmap clearly highlights where the strongest deviations occur, supporting the need for calibration or compensation, especially at low magnification levels.
Figure 15 shows the post hoc significance map for the 1.5 mm gauge block, where pairwise Kruskal–Wallis comparisons between the magnification levels (1×–5×) are visualized. Unlike previous gauge block sizes, the 1.5 mm heatmap reveals fewer statistically significant differences between magnifications.
The only pair that exhibits a significant difference is 1×–4× (p = 0.0109), suggesting that measurements at the lowest magnification deviate notably from those at 4×. All other combinations yield p-values greater than 0.05, including comparisons such as 1×–2× (p = 0.9064), 1×–3× (p = 0.8531), 1×–5× (p = 0.4695), 2×–3× (p = 0.9999), 2×–4× (p = 0.1277), 2×–5× (p = 0.9395), 3×–4× (p = 0.1704), 3×–5× (p = 0.9688), and 4×–5× (p = 0.5059). These values indicate that most magnification levels produce statistically comparable measurements for the 1.5 mm gauge block.
Overall, the results suggest that at this gauge block thickness, the influence of magnification on the measurement accuracy is less pronounced compared to thinner blocks. Except for the deviation observed between 1× and 4×, the measurement system maintains consistent performance across magnifications. This indicates improved stability and reduced sensitivity to optical or scaling effects at this block size.
To determine whether the coverage factor k = 2 (corresponding to approximately 95% coverage) could be applied for the evaluation of expanded uncertainty, a numerical analysis was performed based on the combination of normality testing and a bootstrap estimation of the mean. It was assumed that the measurement data could approximately follow a normal distribution, and this assumption was verified experimentally. For each gauge block and magnification level, the same evaluation procedure was executed.
The normality of the measured data was evaluated using several standard statistical tests. The script applies the Lilliefors, Anderson–Darling, and Jarque–Bera tests, which assess whether the empirical distribution of the data is consistent with a Gaussian model. In parallel, a bootstrap resampling of the mean value is performed with 5000 replicates. From the resulting bootstrap distribution, the 95% percentile confidence interval of the mean is derived, and its half-width represents the empirically estimated expanded uncertainty.
The number of bootstrap replicates was set to 5000 to ensure stable estimation of the uncertainty metrics. Previous methodological studies have shown that several thousand bootstrap resamples are sufficient for the convergence of confidence intervals and variance-related estimates, while further increases in the number of replicates yield negligible improvements in accuracy [59,61,65,66]. Preliminary tests confirmed that increasing the number of bootstrap replicates beyond 5000 did not result in meaningful changes in the estimated uncertainty values. Therefore, 5000 replicates were considered an appropriate compromise between numerical stability and computational efficiency.
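The bootstrap step can be sketched as follows. This is a minimal standard-library Python illustration of a percentile bootstrap of the mean, not the MATLAB script used in the study. The function name, the fixed seed, the nearest-rank percentile rule, and the simulated sample are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap of the mean: return (lower, upper, half_width)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of every replicate.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    alpha = 1 - level
    lo_idx = int(round((alpha / 2) * (n_boot - 1)))
    hi_idx = int(round((1 - alpha / 2) * (n_boot - 1)))
    lower, upper = means[lo_idx], means[hi_idx]
    return lower, upper, (upper - lower) / 2


# Simulated sample of 50 readings in mm (hypothetical, not measured data).
sim = random.Random(1)
sample = [1.0 + sim.gauss(0, 0.002) for _ in range(50)]
lower, upper, half_width = bootstrap_mean_ci(sample)
print(lower, upper, half_width)
```

For approximately normal data, the half-width should be close to 1.96 · s/√n, which offers a quick plausibility check against the analytical type A estimate.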
To further validate the assumption of normality, the standardized bootstrap means are compared with the theoretical standard normal distribution using the Kolmogorov–Smirnov test. If the p-value of this test exceeds the chosen significance level (α = 0.05), the bootstrap distribution can be considered statistically compatible with normality. In such cases, the conventional coverage factor k = 2 is accepted as appropriate. If the bootstrap distribution deviates from normality, alternative coverage factors or empirically derived confidence intervals may be considered instead.
The script automatically stores all calculated quantities and test results into an output spreadsheet and generates graphical reports (histograms, Q–Q plots, and bootstrap histograms) for each evaluated dataset.
Finally, the applicability of the assumption of a coverage factor k = 2, corresponding to approximately 95% coverage, was evaluated for all measurement conditions. The results of the statistical tests and bootstrap analysis are summarized in Table 6. The table presents, for each gauge block size and magnification, the p-values obtained from the normality tests (Lilliefors, Anderson–Darling, Jarque–Bera), together with the p-value from the Kolmogorov–Smirnov test applied to the standardized bootstrap means. The last column of the table indicates whether the k = 2 coverage factor can be used (TRUE) or not (FALSE) based on the obtained results. In this way, Table 6 clearly demonstrates under which measurement conditions the assumption of normality, and thus the use of k = 2, is statistically justified.
Table 6 summarizes the results of normality testing and bootstrap-based uncertainty analysis for measurements of gauge blocks at various nominal lengths and optical magnifications. The columns p_Lillie, p_AD, and p_JB correspond to the p-values obtained from the Lilliefors, Anderson–Darling, and Jarque–Bera normality tests applied to the raw measurement data, while p_boot_KS represents the Kolmogorov–Smirnov test p-value for the standardized bootstrap distribution of the mean.
The last column (USE k = 2?) indicates whether the assumption of normality was considered valid for estimating the expanded uncertainty using the conventional coverage factor k = 2 (corresponding to 95% confidence).
Results show that:
For most gauge blocks (1.0 mm, 1.2 mm, 1.3 mm, 1.4 mm, and 1.5 mm), the normality assumption holds at nearly all magnifications (p > 0.05 in most tests), and thus k = 2 is applicable in those cases.
For the 1.1 mm gauge block, all magnifications yielded very low p-values (p < 0.01), indicating significant deviation from normality; therefore, k = 2 was not used.
The bootstrap Kolmogorov–Smirnov test (p_boot_KS) provided an additional consistency check—whenever p_boot_KS > 0.05, the bootstrap distribution of means was compatible with a normal distribution.
Only two cases (1.0 mm—4× and 1.2 mm—3×) showed non-normal bootstrap behavior (p_boot_KS < 0.05), resulting in k = 2 being rejected.
Overall, the bootstrap method confirmed that for the majority of measurements, the distribution of the mean is sufficiently close to normal, supporting the use of k = 2 for expanded uncertainty evaluation.
Figure 16 presents the statistical analysis of the measurement data for the 1.0 mm gauge block at 1× magnification. The left panel shows the probability density histogram of the raw data with an overlaid normal probability density function (red line). The visual fit indicates that the data are approximately normally distributed.
The Q–Q plot in the middle panel further confirms the near-linear alignment of sample quantiles with the theoretical normal quantiles, suggesting no significant deviation from normality.
The right panel displays the bootstrap distribution of the sample mean (5000 resamples), which is symmetric and bell-shaped. The Kolmogorov–Smirnov test for standardized bootstrap means yielded a p-value above 0.05, confirming that the bootstrap distribution is statistically consistent with the normal distribution.
Therefore, for this measurement condition, the assumption of normality is justified, and the standard coverage factor k = 2 can be used for expanded uncertainty estimation.
Similar graphs and statistical evaluations were produced for all gauge blocks at each magnification considered. In most cases the results show analogous data behavior (the exceptions are flagged in Table 6), which supports the validity of the assumption of normality and the use of the standard coverage factor k = 2 in estimating the expanded uncertainty.
Let $s$ denote the sample standard deviation of the repeated measurements and $n$ the number of valid repetitions. The components of uncertainty were treated as follows.
The type A (statistical) standard uncertainty is
$$u_A = \frac{s}{\sqrt{n}}.$$
Each type B contribution was assumed to follow a rectangular (uniform) distribution; therefore, the standard uncertainty for each maximum error $\mathrm{MPE}_i$ is
$$u_{B,i} = \frac{\mathrm{MPE}_i}{\sqrt{3}},$$
where the indices denote:
$i = \mathrm{m}$: maximum permissible error of the microscope,
$i = \mathrm{g}$: maximum permissible error of the gauge block, and
$i = \mathrm{r}$: maximum permissible error of the glass ruler used to calibrate the microscope.
The combined type B standard uncertainty is obtained by the root-sum-square rule:
$$u_B = \sqrt{u_{B,\mathrm{m}}^2 + u_{B,\mathrm{g}}^2 + u_{B,\mathrm{r}}^2}.$$
The combined standard uncertainty of the measurement is then
$$u_c = \sqrt{u_A^2 + u_B^2}.$$
Finally, the expanded uncertainty $U = k\,u_c$ was calculated according to Formula (4), where the coverage factor k = 2 was used to provide an approximate 95% level of confidence in the expanded uncertainty.
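The budget described above can be traced numerically with a short sketch. The function names are hypothetical and the input values are illustrative assumptions: only n = 50 and the microscope MPE of ±6 µm appear in the text, while the standard deviation and the gauge block and ruler MPE values are placeholders.

```python
import math


def type_a(s, n):
    """Type A standard uncertainty of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def type_b_rectangular(mpe):
    """Standard uncertainty of a rectangular distribution with half-width mpe."""
    return mpe / math.sqrt(3.0)


def uncertainty_budget(s, n, mpes, k=2.0):
    """Return (u_A, u_B, u_c, U) for the given inputs, all in the same unit."""
    u_a = type_a(s, n)
    u_b = math.sqrt(sum(type_b_rectangular(m) ** 2 for m in mpes.values()))
    u_c = math.hypot(u_a, u_b)  # root-sum-square of the type A and type B parts
    return u_a, u_b, u_c, k * u_c


# Illustrative inputs in micrometres. Only the microscope MPE (6 um at 1x) and
# n = 50 are taken from the text; the remaining values are assumed.
mpes_um = {"microscope": 6.0, "gauge_block": 0.3, "ruler": 2.0}
u_a, u_b, u_c, U = uncertainty_budget(s=0.5, n=50, mpes=mpes_um)
print(f"u_A = {u_a:.3f} um, u_B = {u_b:.3f} um, u_c = {u_c:.3f} um, U = {U:.3f} um")
```

Keeping every input in one unit (here micrometres) avoids mixing the millimetre readings with the micrometre MPE values.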
Figure 17 shows the relative contribution of each uncertainty component to the combined standard uncertainty $u_c$ for all applied magnifications of the optical microscope when measuring the 1.0 mm gauge block. The components include the type A statistical uncertainty ($u_A$) and type B contributions originating from the microscope ($u_{B,\mathrm{m}}$), the gauge block ($u_{B,\mathrm{g}}$), and the calibration ruler ($u_{B,\mathrm{r}}$).
Across all magnifications, the dominant contribution originates from the microscope uncertainty, accounting for approximately 76–88% of the total $u_c$. The ruler contributes between 10% and 21%, while the gauge block and type A components remain below 2% each.
As the magnification increases from 1× to 5×, a slight decrease in the total combined uncertainty can be observed (from about 3.72 µm down to 2.60 µm). This indicates that higher magnification improves the measurement resolution and repeatability, reducing the statistical (type A) component. The detailed view on the right highlights the distribution for 5× magnification, where the microscope and ruler remain the major contributors.
Overall, the uncertainty budget confirms that the largest contribution originates from the MPE (Maximum Permissible Error) of the microscope, which is dependent on the applied magnification, while the other components (ruler, gauge block, and type A) have only minor influence on the total uncertainty.
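The percentage contributions can be illustrated with a small sketch. It assumes that each share is the component's fraction of the combined variance, u_i²/u_c², which the text does not state explicitly. The microscope MPE values (±6 µm and ±4 µm) are those quoted by the manufacturer; the ruler MPE, gauge block MPE, and type A value are placeholders chosen for illustration.

```python
import math


def variance_shares(components):
    """Return (u_c, shares) where shares[name] = 100 * u_i**2 / u_c**2."""
    total = sum(u * u for u in components.values())
    shares = {name: 100.0 * u * u / total for name, u in components.items()}
    return math.sqrt(total), shares


def budget_for(microscope_mpe_um):
    """Standard uncertainties in micrometres for one microscope MPE setting."""
    return {
        "microscope": microscope_mpe_um / math.sqrt(3.0),
        "ruler": 2.0 / math.sqrt(3.0),        # assumed ruler MPE of 2 um
        "gauge_block": 0.3 / math.sqrt(3.0),  # assumed gauge block MPE of 0.3 um
        "type_A": 0.07,                       # assumed s / sqrt(n)
    }


u_c6, shares6 = variance_shares(budget_for(6.0))
u_c4, shares4 = variance_shares(budget_for(4.0))
for label, u_c, shares in (("MPE 6 um", u_c6, shares6), ("MPE 4 um", u_c4, shares4)):
    parts = ", ".join(f"{name} {pct:.1f}%" for name, pct in shares.items())
    print(f"{label}: u_c = {u_c:.2f} um ({parts})")
```

Under these assumed inputs, the microscope share is larger at ±6 µm than at ±4 µm, and the ruler share grows as the microscope MPE shrinks, because the ruler term stays fixed while the total variance decreases.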
Figure 18 shows the relative contribution of each uncertainty component to the combined standard uncertainty $u_c$ for all magnifications of the optical microscope when measuring the 1.1 mm gauge block. Across all magnifications, the microscope uncertainty remains the dominant contributor, ranging approximately from 83% down to 49%, depending on the magnification. At lower magnifications (1×–3×), it clearly exceeds 80%, while at 4× magnification, its contribution decreases to about 49%, due to a noticeable increase in the type A component. The ruler uncertainty contributes between roughly 8% and 21%, representing the second most significant component across all magnifications. The gauge block and type A components remain below approximately 3% for magnifications up to 3×.
A notable change occurs at 4× magnification, where the type A uncertainty rises significantly to around 37%, indicating increased measurement variability or lower repeatability at this magnification. Despite this rise, at 5× magnification, the type A component decreases again to below 1%, while the microscope and ruler uncertainties regain dominance (approximately 78% and 21%, respectively).
The combined standard uncertainty shows a decreasing trend with increasing magnification, dropping from approximately 3.83 µm at 1× to 2.60 µm at 5× magnification. As with the 1.0 mm gauge block, this trend confirms that higher magnification improves measurement resolution and stability.
Overall, the uncertainty budget for the 1.1 mm gauge block confirms that the microscope’s MPE is the largest contributor to the total uncertainty. However, at 4× magnification, an exceptionally high statistical (type A) component indicates reduced repeatability, which slightly increases the total uncertainty at this point before it drops again at 5×.
Figure 19 presents the percentage contribution of individual uncertainty components to the combined standard uncertainty $u_c$ for all magnifications of the optical microscope during the measurement of the 1.2 mm gauge block. Across all magnifications, the microscope uncertainty remains the dominant contributor, accounting for approximately 77–89% of the total combined uncertainty. The ruler contributes between roughly 10% and 21%, representing the second most significant source of uncertainty. The contributions from the gauge block and type A components remain minimal, both below approximately 2% for all magnifications.
As the magnification increases from 1× to 5×, the combined standard uncertainty decreases from approximately 3.73 µm to 2.63 µm. This trend indicates that higher magnification improves measurement resolution and reduces the overall uncertainty. At 4× magnification, the total uncertainty reaches its lowest value of 2.61 µm before slightly increasing again at 5×.
The detailed view for 5× magnification confirms that the largest contributions still originate from the microscope (≈77%) and ruler (≈21%), while the gauge block and type A components remain negligible (below 2%).
Overall, the uncertainty budget for the 1.2 mm gauge block confirms that the microscope’s MPE is the primary source of uncertainty, while the effects from the ruler, gauge block, and statistical repeatability are minor in comparison.
Figure 20 illustrates the relative percentage contribution of each uncertainty component to the combined standard uncertainty $u_c$ for various magnifications of the optical microscope when measuring the 1.3 mm gauge block. Across all magnifications, the microscope uncertainty remains the dominant contributor, accounting for approximately 78–89% of the total combined uncertainty. The ruler uncertainty contributes between 10% and 21%, while the contributions from the gauge block and type A components are negligible, remaining below 2% for all magnifications.
As the magnification increases from 1× to 5×, the combined standard uncertainty decreases from approximately 3.71 µm to 2.60 µm. The lowest total uncertainty is observed at 4× and 5× magnification (2.60 µm), confirming that higher magnification improves measurement repeatability and resolution. Despite the reduction in overall uncertainty, the structure of the uncertainty budget remains similar, with the microscope being the primary source of error.
The detailed view of the 5× magnification confirms that the microscope contributes roughly 78%, followed by the ruler (~21%), while the gauge block and type A components are below 1%.
Overall, the uncertainty analysis for the 1.3 mm gauge block once again confirms that the MPE of the microscope is the predominant factor in the total uncertainty, whereas other components—ruler, gauge block, and statistical repeatability—have only a minor influence.
Figure 21 illustrates the percentage share of individual uncertainty components contributing to the combined standard uncertainty $u_c$ when measuring the 1.4 mm gauge block at different microscope magnifications. For all magnifications, the microscope uncertainty is clearly the dominant contributor, accounting for approximately 77–88% of the total uncertainty. The ruler contributes around 10–21%, while the gauge block and type A (statistical) components each remain below about 2%.
As the magnification increases from 1× to 5×, the combined standard uncertainty gradually decreases—from about 3.71 µm to 2.61 µm. The lowest uncertainty values are observed at 4× and 5× magnification, indicating improved resolution and more stable repeatability at higher magnifications.
The detailed view of the 5× magnification confirms that the largest contributors are still the microscope (≈77%) and the ruler (≈21%), while the gauge block and type A uncertainties are negligible.
Overall, the uncertainty assessment for the 1.4 mm gauge block confirms that the primary source of uncertainty is the microscope’s MPE. In contrast, the contributions from the ruler, gauge block, and statistical repeatability have only a minor impact on the total uncertainty.
Figure 22 illustrates the percentage share of individual uncertainty components contributing to the combined standard uncertainty $u_c$ when measuring the 1.5 mm gauge block at different microscope magnifications. As in the previous case, the microscope uncertainty is the dominant contributor across all magnifications, accounting for approximately 77–88% of the total uncertainty. The ruler contributes between 10% and 21%, while the gauge block and type A (statistical) components each remain below about 2%.
With increasing magnification from 1× to 5×, the combined standard uncertainty gradually decreases—from about 3.73 µm to 2.61 µm. The lowest uncertainty values occur at 4× and 5× magnification, indicating improved measurement resolution and repeatability at higher magnifications.
The detailed view for the 5× magnification confirms that the microscope (≈77%) and the ruler (≈21%) remain the dominant contributors, while the gauge block and statistical components have a negligible influence.
Overall, the uncertainty analysis for the 1.5 mm gauge block confirms that the microscope’s maximum permissible error (MPE) is the primary source of measurement uncertainty, while the other components contribute only minimally to the total uncertainty.
Finally, graphs of the measured data were created for each gauge block. Each data point is supplemented with error bars representing the expanded measurement uncertainty (U). These plots enable a clear visual evaluation of how the measured values vary with the microscope magnification.
The nominal value in each graph corresponds to the certified value of the reference gauge block. Since the systematic deviation of the gauge block stated in its calibration certificate is very small (below 0.15% of the total measurement uncertainty), its contribution can be neglected. Therefore, the nominal value is considered equal to the etalon value (e.g., 1 mm, 1.1 mm, etc.).
In addition, the MPE of the microscope is indicated for each magnification. The ±MPE limits (shown as dashed lines) represent the tolerance interval within which all measurements should fall according to the manufacturer’s specification. This allows a direct comparison between the experimental results and the allowed instrument accuracy.
Figure 23 shows the dependence of the measured length of the 1.0 mm gauge block on the microscope magnification (from 1× to 5×). The black solid line represents the nominal value of 1.000 mm, while the blue points correspond to the mean measured values with their expanded uncertainties.
As the magnification increases, the measured values fluctuate slightly around the nominal dimension, with deviations smaller than ±0.01 mm. The expanded uncertainty U gradually decreases with increasing magnification—from approximately ±0.0074 mm at 1× down to ±0.0052 mm at 4× and 5×.
The overall mean measured result across all magnifications is 0.9989 ± 0.0061 mm, which corresponds well to the nominal value of 1.000 mm. The difference from the nominal value (−1.1 µm) is significantly smaller than the applicable MPE limits (±6 µm to ±4 µm), confirming that all measured values are well within the specified tolerance.
This indicates good consistency of the measurement system and confirms that the microscope meets the declared accuracy across the entire magnification range. The smallest deviation from the nominal value was observed at 4× magnification, where the measured result was 1.0001 ± 0.0052 mm.
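The comparison with the MPE limits reduces to a unit conversion and an absolute-value check, sketched below with the values quoted above for the 1.0 mm gauge block. The helper names are hypothetical, and the check covers only the deviation of the measured value from the nominal value; it does not include the expanded uncertainty U.

```python
def deviation_um(measured_mm, nominal_mm):
    """Deviation from the nominal value, converted from mm to micrometres."""
    return (measured_mm - nominal_mm) * 1000.0


def within_mpe(measured_mm, nominal_mm, mpe_um):
    """True if the deviation from nominal lies inside the +/- MPE band."""
    return abs(deviation_um(measured_mm, nominal_mm)) <= mpe_um


nominal = 1.000       # mm, nominal length of the gauge block
overall_mean = 0.9989  # mm, overall mean across magnifications (from the text)
result_4x = 1.0001     # mm, result at 4x magnification (from the text)

dev_mean = deviation_um(overall_mean, nominal)
dev_4x = deviation_um(result_4x, nominal)
# Check against the tighter of the two quoted limits (4 um).
print(f"overall mean: {dev_mean:+.1f} um, within 4 um: {within_mpe(overall_mean, nominal, 4.0)}")
print(f"4x result:    {dev_4x:+.1f} um, within 4 um: {within_mpe(result_4x, nominal, 4.0)}")
```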
Figure 24 shows the dependence of the measured length of the 1.1 mm gauge block on the microscope magnification (from 1× to 5×). The measured values vary slightly around the nominal dimension, with deviations smaller than ±0.005 mm. The expanded uncertainty decreases gradually with increasing magnification, from approximately ±0.0076 mm at 1× down to ±0.0052 mm at 5×.
The overall mean measured result across all magnifications is 1.0998 ± 0.0066 mm, which is in excellent agreement with the nominal value (difference −0.2 µm). All data points remain well within the ±MPE limits (±6 µm to ±4 µm), confirming the measurement system’s stability and accuracy.
The most accurate measurement was achieved at 5× magnification, where the measured result was 1.1004 ± 0.0052 mm.
Figure 25 shows the dependence of the measured length of the 1.2 mm gauge block on the microscope magnification (from 1× to 5×). As magnification increases, the measured values fluctuate slightly around the nominal dimension, with deviations smaller than ±0.005 mm. The expanded uncertainty gradually decreases from ±0.0074 mm at 1× to ±0.0052 mm at 4× and 5×.
The overall mean measured result across all magnifications is 1.1988 ± 0.0063 mm, which corresponds very well to the nominal value (difference −1.2 µm). All data points are well within the ±MPE limits (±6 µm to ±4 µm), confirming stable measurement performance across the entire magnification range. The smallest deviation was observed at 4× magnification, where the measured result was 1.1998 ± 0.0052 mm.
Figure 26 shows the dependence of the measured length of the 1.3 mm gauge block on the microscope magnification (from 1× to 5×). The measured values remain very close to the nominal dimension across all magnifications, with deviations smaller than ±0.002 mm. The expanded uncertainty decreases from ±0.0073 mm at 1× to ±0.0052 mm at 4× and 5× magnification.
The overall mean measured result is 1.3000 ± 0.0063 mm, which matches the nominal value almost exactly (difference < 0.1 µm). All data points fall well within the ±MPE limits (±6 µm to ±4 µm), confirming excellent stability and accuracy of the measurement system. The smallest uncertainty was again achieved at higher magnifications (4× and 5×).
Figure 27 presents the measured results for the 1.4 mm reference gauge block at different magnifications (1×–5×). The measured values show excellent repeatability, staying within a narrow range of 1.398–1.400 mm. The expanded measurement uncertainty decreases with increasing magnification—from ±0.0073 mm at 1× to ±0.0052 mm at 4× and 5×.
The mean measured value, 1.3995 ± 0.0063 mm, differs from the nominal value by less than 0.001 mm (0.07%). All results are well within the microscope’s ±MPE limits (±6 µm to ±4 µm), confirming consistent accuracy across all magnifications. The best performance (lowest uncertainty) is again observed at higher magnifications.
Figure 28 presents the measured results for the 1.5 mm reference gauge block at different magnifications (1×–5×). The measured values remain highly consistent, varying only within a narrow range from 1.4987 mm to 1.5002 mm. The expanded measurement uncertainty shows a clear improvement with increasing magnification—from ±0.0074 mm at 1× to ±0.0052 mm at 4× and 5×.
The mean measured value, 1.4996 ± 0.0063 mm, deviates from the nominal value by less than 0.0004 mm (0.03%), which indicates excellent agreement with the reference length. All measured results lie well within the microscope’s specified ±MPE limits (±6 µm to ±4 µm), confirming stable and accurate performance of the optical system. As with the previous case, the lowest uncertainty and best measurement stability are achieved at higher magnifications.
4. Discussion
The results obtained in this study demonstrate that the optical magnification of a digital microscope has a measurable effect on the accuracy, repeatability, and overall uncertainty of dimensional measurements based on pixel calibration. The statistical analyses, uncertainty budgets, and graphical evaluations provide a comprehensive view of how magnification influences the metrological performance of the microscope and the validity of the manufacturer’s declared Maximum Permissible Error (MPE).
The observed magnification-dependent behavior is inherently linked to the optical and sensor design of the investigated microscope. The system employs a non-telecentric optical configuration, for which changes in magnification are accompanied by variations in effective pixel scaling, field-dependent distortion, and depth-of-field characteristics. In addition, the digital image sensor and its pixel pitch influence the discretization of edge information and the stability of edge localization at different magnifications. Manufacturer-defined calibration parameters and MPE values further contribute to magnification-specific uncertainty behavior. Consequently, while the qualitative trends observed in this study are expected to be representative for similar digital optical microscopes, quantitative results may differ for systems employing telecentric optics, different sensor architectures, or alternative calibration strategies [44,54,67,68].
The Kruskal–Wallis analyses performed for all gauge blocks confirmed that magnification significantly affects the measured lengths for most nominal sizes, except for the 1.3 mm block, where no statistically significant difference was observed. This indicates that the calibration constant and image scaling factor vary slightly with magnification, producing small but measurable differences in the detected dimensions. These differences are most visible at low magnifications (1×–2×), where the optical resolution is lower, and the pixel size in the calibration model is larger. Consequently, the system becomes more sensitive to pixel interpolation and edge-detection uncertainty, resulting in higher variability and small systematic deviations.
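For readers who want to reproduce this type of comparison, the Kruskal–Wallis H statistic with tie correction and its chi-square p-value can be computed with the standard library alone. The sketch below is a generic textbook implementation with hypothetical function names, not the software used in the study, and the three small groups are toy data rather than measurements. SciPy's `scipy.stats.kruskal` offers the same test.

```python
import math
from itertools import chain


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:  # recurrence of the regularized incomplete gamma function
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples, using average ranks for ties."""
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    if tie_term == n_total ** 3 - n_total:
        raise ValueError("all values are identical; H is undefined")
    rank_sums = (sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * sum(rank_sums) - 3.0 * (n_total + 1)
    h /= 1.0 - tie_term / (n_total ** 3 - n_total)
    return h, chi2_sf(h, len(groups) - 1)


# Toy data: three fully separated groups (not measurement data).
h_stat, p_value = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

For the five magnifications compared in this study, `groups` would contain five lists of 50 readings each, giving four degrees of freedom.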
At medium and higher magnifications (3×–5×), the results become more stable, with narrower boxplot distributions and smaller interquartile ranges. The medians of the measured values approach the nominal lengths, confirming that higher magnification improves the localization of edges and enhances repeatability. This behavior is consistent with theoretical expectations: as magnification increases, the effective pixel size decreases, and the optical system captures finer image details, thereby reducing random errors in distance estimation. However, beyond a certain magnification, the benefits diminish because the field of view decreases, and any small defocus or vibration can have a proportionally larger effect.
The post hoc heatmaps further clarify the relationships between magnifications. For smaller gauge blocks (1.0–1.2 mm), the largest deviations occur between the lowest (1×) and intermediate (3×–4×) magnifications, suggesting that the optical scaling model of the microscope is slightly non-linear across this range. For larger blocks (1.4 mm and 1.5 mm), differences between magnifications become less pronounced, implying that the impact of magnification diminishes as the measured length increases. The 1.3 mm block serves as a transition point where magnification no longer produces statistically significant differences, confirming the overall stability of the system in the mid-range of lengths and zoom levels.
The observed deviation for the 1.3 mm gauge block may be attributed to the interaction between the nominal block length and the discrete sampling characteristics of the imaging sensor. In pixel-based optical measurements, certain object dimensions may align unfavorably with the effective pixel pitch, leading to periodic sampling artifacts or reduced edge-localization stability [54,69]. In addition, the relationship between the gauge block length and the microscope’s field of view varies with magnification; at specific combinations of length and magnification, the edges may be positioned closer to regions affected by residual optical distortion or non-uniform illumination, which can locally influence measurement repeatability [54,69,70,71,72]. These effects are length- and magnification-dependent and may therefore manifest as isolated deviations, such as the one observed for the 1.3 mm gauge block. Further targeted experiments would be required to isolate and quantify these contributions.
A crucial part of the uncertainty evaluation was to determine whether the assumption of normal distribution could be used for calculating the expanded uncertainty with the coverage factor k = 2. The combination of classical normality tests (Lilliefors, Anderson–Darling, Jarque–Bera) and bootstrap resampling provided a robust assessment of the data distribution.
The results summarized in Table 6 show that in the majority of cases, the bootstrap distributions of the mean were statistically compatible with normality (p > 0.05), supporting the use of k = 2 for expanded uncertainty estimation. Only a few combinations, most notably the 1.1 mm gauge block and two isolated cases (1.0 mm at 4× and 1.2 mm at 3×), showed significant deviations from normality. These exceptions likely originate from non-random sources, such as small calibration drifts or quantization effects in the digital image processing algorithm. The fact that non-normal behavior was not systematic across magnifications confirms that the microscope performs consistently under stable environmental conditions.
The bootstrap approach also verified that the empirical confidence intervals of the mean correspond closely to those predicted by the standard normal coverage factor. This finding is essential for traceability because it validates that the conventional GUM-based uncertainty evaluation remains applicable to digital optical microscopes when sufficient repeated measurements are taken. In practice, this means that for well-controlled image-based dimensional measurements, the assumption of Gaussian uncertainty propagation remains valid despite the discrete nature of pixel-based data.
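The comparison between empirical and normal-theory intervals can be sketched as follows. This is an illustration with synthetic readings and hypothetical function names, and it covers only the statistical (type A) part of the uncertainty; the type B contributions are not included. A percentile bootstrap interval at the 95% level is compared with the interval mean ± k·s/√n for k = 2.

```python
import math
import random
from statistics import fmean, stdev


def percentile_bootstrap_ci(data, level=0.95, n_resamples=5000, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = math.floor((1.0 - level) / 2.0 * (n_resamples - 1))
    hi_idx = math.ceil((1.0 + level) / 2.0 * (n_resamples - 1))
    return means[lo_idx], means[hi_idx]


def normal_interval(data, k=2.0):
    """Interval mean +/- k * s / sqrt(n) based on the normal coverage factor k."""
    m = fmean(data)
    half = k * stdev(data) / math.sqrt(len(data))
    return m - half, m + half


# Synthetic stand-in for 50 repeated readings in mm (not the study's data).
rng = random.Random(2)
readings = [rng.gauss(1.2, 0.0005) for _ in range(50)]

boot_lo, boot_hi = percentile_bootstrap_ci(readings)
norm_lo, norm_hi = normal_interval(readings)
width_ratio = (boot_hi - boot_lo) / (norm_hi - norm_lo)
print(f"bootstrap: [{boot_lo:.5f}, {boot_hi:.5f}] mm")
print(f"normal:    [{norm_lo:.5f}, {norm_hi:.5f}] mm, width ratio = {width_ratio:.2f}")
```

A width ratio close to 1 indicates that the two intervals agree; a ratio slightly below 1 is expected because k = 2 corresponds to a coverage slightly above 95%.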
The uncertainty budgets (Figures 17–22) reveal that the dominant contribution to the combined standard uncertainty originates from the microscope’s declared MPE, which accounts for approximately 75–90% of the total. The calibration ruler contributes around 10–20%, while the gauge block and statistical type A components together remain below 2%. This composition is typical for optical measuring systems, where the instrumental error of the microscope outweighs the uncertainty of reference standards and statistical variation.
Across all gauge blocks, the combined standard uncertainty decreases systematically with increasing magnification—from approximately 3.7 µm at 1× to 2.6 µm at 4× and 5×. This trend confirms that higher magnification improves spatial resolution and reduces random variability. The lowest total uncertainties and the most favorable uncertainty ratios were consistently obtained at 4× magnification, which therefore represents the optimal compromise between resolution, field of view, and measurement stability.
For the 1.1 mm gauge block, an exception was observed at 4× magnification, where the type A component temporarily increased to ≈37% of the total uncertainty, indicating a short-term instability or focus sensitivity at this zoom level. Nevertheless, even in this case, the expanded uncertainty remained well below the microscope’s declared ±MPE limits.
When the expanded uncertainties U were compared directly with the MPE limits for all magnifications, all measured values and uncertainty intervals remained comfortably within the tolerance specified by the manufacturer (±6 µm at 1×–2× and ±4 µm at 3×–5×). This confirms that the microscope meets its declared accuracy across the entire magnification range and that the calibration procedure used in this study provides traceable and reliable results.
The observed dependence of uncertainty on magnification is consistent with findings reported by Zhang et al. [10], Li et al. [11], and Leach [17], who emphasized that optical scaling and distortion effects vary with zoom and can introduce non-linear calibration errors. The decreasing uncertainty trend with higher magnification also aligns with the results of Gao et al. [1] and Bellantone et al. [2], who demonstrated that improved spatial sampling density enhances precision in optical dimensional metrology.
The magnitude of the combined uncertainty determined in this work (2.6–3.8 µm) corresponds well to typical uncertainty ranges reported in comparable optical measurement studies [3,15,31,46]. The predominance of the instrument’s MPE as the major uncertainty source agrees with analyses by Brown et al. [24] and De Chiffre et al. [37], who concluded that internal optical calibration and alignment stability dominate the uncertainty budget of digital microscopes.
Furthermore, the successful application of the bootstrap method supports the approach used by Hooshmand et al. [3] and Batista et al. [8], who recommended non-parametric resampling as an effective alternative when data deviate from strict normality. The overall results therefore validate that the applied methodology, which combines non-parametric testing, bootstrap analysis, and classical GUM formulation, provides a reliable framework for evaluating magnification-related uncertainty in image-based measurement systems.
From a practical viewpoint, the findings highlight that selecting the appropriate magnification is crucial for achieving optimal measurement accuracy in digital microscopes. Although higher magnifications generally reduce total uncertainty, they also reduce the field of view and increase sensitivity to mechanical vibration, illumination, and focus drift. The results indicate that magnifications around 4× provide the best balance between measurement precision and operational robustness.
For industrial and laboratory users, it is therefore recommended that calibration and verification procedures be carried out at each magnification level, especially at the lowest settings where scale non-linearity is most pronounced. Implementing internal calibration checks or automated correction algorithms could further compensate for magnification-dependent deviations.
Future research should focus on extending the presented methodology to microscopes with constant MPE design, investigating temperature and focusing effects, and developing real-time uncertainty estimation integrated into image-processing software. A promising direction also lies in correlating the magnification-dependent uncertainty with optical aberration modeling and machine-learning-based calibration correction.
Future research may also investigate the influence of the measurement position within the microscope’s field of view at different magnifications. Optical distortion, illumination non-uniformity, and sensor-related effects are often field-dependent and may vary with magnification, potentially affecting edge localization and measurement uncertainty. Systematic evaluation of spatial repeatability across the field of view would provide further insight into the robustness and general applicability of optical dimensional measurements under varying magnification conditions [44,53,73].
Overall, the present study confirms that magnification plays a decisive role in determining the measurement accuracy and uncertainty of digital optical microscopes. By systematically analyzing statistical behavior, uncertainty components, and calibration relationships, the research provides a quantitative foundation for optimizing magnification settings and ensuring traceability in optical dimensional metrology.
The present study was carried out using a single commercial digital microscope (Insize ISM-DL520), which reflects the instrumentation available in our laboratory. Although this constitutes a limitation, the five selected magnification settings included two pairs with identical manufacturer-specified MPE values. These pairs showed very similar measurement-uncertainty behavior, indicating that systems with constant MPE may exhibit comparable trends if their optical design and calibration conditions are similar. Previous studies have demonstrated that magnification-dependent errors, pixel-scale calibration, and optical aberrations are dominant contributors to measurement uncertainty regardless of instrument model, provided that the MPE remains constant across magnification settings [41,42,43,44]. For these reasons, we expect the qualitative conclusions of this study to be transferable to digital microscopes with constant MPE, although broader verification across different optical architectures should be pursued in future work.
While the present study focuses on gauge blocks in the range of 1.0–1.5 mm, different behavior can be expected for substantially smaller or larger features. For sub-millimeter dimensions, optical diffraction limits, reduced edge contrast, and increased sensitivity to pixel-scale calibration tend to dominate the uncertainty, and specialized instrumentation such as SEM, AFM, or confocal microscopy may be required [44,74]. Conversely, for larger features, optical microscopes are often replaced by interferometric, structured-light, or tactile metrology systems, where magnification no longer plays a key role in the uncertainty budget [43]. Therefore, the proposed methodology is most applicable for dimensional features ranging from several hundred micrometers to a few millimeters, and its extension outside this range should be supported by additional verification.