1. Introduction
Advanced intelligent manufacturing places stringent demands on measurement technologies, including ultra-high precision, non-contact operation, high throughput, and real-time adaptability. Optical interferometry, with its unique advantages, has become a key technology for meeting these requirements [
1]. For instance, accurate radius of curvature (ROC) measurement plays a crucial role in the inspection of optical coated substrates, lens manufacturing and semiconductor wafer processing. In an optical interferometer system, concentric circular fringes are generated when coherent light is reflected from surfaces with slightly different curvatures. The shape (linear, elliptical, or circular) of the interference fringes produced by optical interferometers provides important information about the interference wavefront. It is widely used for direct visual inspection in optical workshop testing, as well as other applications involving important scientific and engineering measurements [
2]. The fringe spacing and distribution directly encode surface curvature information. Traditional analytical approaches rely on fringe counting, phase unwrapping, or Fourier-based frequency analysis [
3]. While these methods are theoretically robust, they often require manual calibration, careful noise suppression, and precise alignment. Moreover, fringe patterns are influenced by substrate material properties such as refractive index, reflectivity, and surface roughness, leading to variations that complicate conventional algorithms.
The traditional Newton’s rings method is a classic and widely utilized technique for evaluating physical parameters in optical technology [
4]. Its core utility lies in deriving critical values, such as ROC, from the characteristics of interference fringes. Conventional analytical approaches, including the least squares method (LSM), centering and profiling algorithms (CPAs) [
5], and fast Fourier transform (FFT) analysis, have been established to estimate these physical parameters. While these methods have achieved a degree of success in terms of precision, they are often hindered by significant limitations, such as prolonged processing times, insufficient robustness against noise, and a high sensitivity to initial parameter settings. In recent years, deep learning (DL) has become a powerful tool for solving problems through data-driven learning [
6]. Feng et al. [
7] demonstrated how deep learning can significantly improve the accuracy of phase demodulation from single fringe patterns. Compared with existing single-frame methods, this deep learning-based technique provides a framework for fringe analysis by rapidly predicting the background image and estimating the numerator and denominator of the arctangent function, thereby achieving high-precision, edge-preserving phase reconstruction without any human intervention. With the rapid advancement of deep learning, convolutional neural networks (CNNs) have demonstrated exceptional capabilities in image recognition and parameter estimation [
8]. Recent studies indicate that CNN architectures, such as the Visual Geometry Group (VGG) [
9] and Residual Neural Network (ResNet) [
10], can directly analyze Newton’s rings images to estimate centers and radii of curvature simultaneously with high precision [
11]. These models exhibit superior noise resistance and lower computational latency compared to traditional algorithms, offering a more practical framework for fringe analysis [
12].
According to the research of Wu et al. [
13], the measurement of the ROC in Newton’s rings images has traditionally been analyzed using methods such as LSM, CPA, or FFT. These techniques mainly involve numerical fitting of the center and radius of the interference fringes, or extracting the fringe frequency information through frequency domain analysis, and then deducing the physical parameters related to curvature [
14]. Li et al. [
12] studied the application of CNN to the task of predicting the physical parameters of Newton’s rings interference fringes. This study developed an end-to-end learning model based on the fusion of VGG structure and U-Net architecture, which can simultaneously complete the localization of the fringe center and the regression prediction of the ROC. The researchers trained and tested the model through a large number of simulated and real interference images to verify that the model can still maintain extremely high prediction accuracy under various common noise conditions (including −5 dB Gaussian noise and 60% salt and pepper noise). Furthermore, Zhang et al. [
15] proposed a practical-oriented CNN architecture for prediction optimization of Newton’s rings images, and introduced a lightweight convolutional module and a multi-scale feature fusion strategy to balance model inference speed and anti-interference capability.
In this work, we present a deep learning-based approach for direct prediction of curvature radius from concentric circular interference fringe images. The proposed method demonstrates improved accuracy, strong noise tolerance, and significantly reduced computational latency compared with traditional methods. The main objective of this study is to develop a fast prediction method for analyzing circular interference fringes. We combine an improved Twyman–Green interferometer with deep learning models (CNN, VGG-16, and ResNet-18) and a self-developed MATLAB analysis environment to build a non-destructive and rapid measurement system. Unlike traditional methods that rely on manual interpretation of fast Fourier transform (FFT) spectra [
16,
17,
18,
19], this study automatically evaluates the ROC of the flat substrates. A commonly used method in the past was the curvature sensing technique, which estimates wavefront shape by solving the transport-of-intensity equation (TIE) from defocused intensity measurements [
20]. By contrast, the classical Hartmann or Shack–Hartmann sensor reconstructs the wavefront from local slope measurements using a lenslet array [
21]. Both approaches rely on explicit physical models and multi-step numerical reconstruction. In contrast, the proposed method leverages deep learning models to directly infer the ROC from interferometric fringe patterns, bypassing traditional procedures such as fast Fourier transform (FFT)-based phase extraction and iterative phase unwrapping [
16,
17]. As a result, it preserves the high accuracy and spatial resolution of interferometry while significantly improving measurement speed. Therefore, we propose a semi-automatic optical measurement system that incorporates fully automated deep learning inference to predict the ROC. Compared to curvature sensing, the system avoids computationally intensive differential equation solving, and compared to the Shack–Hartmann sensor, it achieves finer spatial detail, making it particularly suitable for rapid and high-precision optical inspection.
2. Materials and Methods
2.1. Optical Measurement Setup
The optical path diagram of the substrate surface measurements is shown in
Figure 1. It illustrates the modified Twyman–Green interferometer used to measure fringes of equal thickness. The setup is described as follows: A helium–neon laser (wavelength λ = 632.8 nm) serves as the light source. It possesses excellent monochromaticity and a long coherence length, enabling the production of stable interference fringes. The laser beam passes through a spatial filter to remove stray modes, resulting in a nearly uniform parallel beam. An iris aperture further controls the diameter of the transmitted beam. The beam then passes through a beam splitter and is divided into two optical paths. One path travels to the reference mirror, forming the reference beam; the other is incident upon the test sample and reflects back to form the test beam. Finally, the beams from both arms are recombined and pass through an imaging lens at the receiving end. A CCD camera captures the interference image, displaying either circular or straight fringes, facilitating subsequent computer-aided analysis.
2.2. Coated Substrates
In this study, two types of substrates with different materials and thicknesses, namely, B270 glass substrates and sapphire substrates, were selected to measure the ROC. The Young’s modulus, Poisson’s ratio, and thickness of the substrates were compared, taking into account their representative mechanical performance differences.
Figure 2 shows the B270 glass substrate used in the experiment along with the equal-thickness circular interference fringes corresponding to a large ROC. The B270 glass substrate has a thickness of 1.6 mm and a diameter of 25.2 mm, providing moderate mechanical stiffness. Its Young’s modulus and Poisson’s ratio are 71.5 GPa and 0.22, respectively. Owing to its stable mechanical properties, good surface flatness, and excellent optical quality, the glass substrate is well-suited for precise observation of interference fringes and curvature variations.
Figure 3 presents the sapphire substrate, which has a thickness of only 0.44 mm and a diameter of 50.4 mm. Sapphire exhibits a very high Young’s modulus of 335 GPa and a Poisson’s ratio of 0.25, offering excellent rigidity and thermal stability. As a result, it is widely used in optoelectronic devices and is particularly suitable for high-precision thin-film stress analysis. Prior to deposition, both substrates were subjected to rigorous cleaning and polishing processes to ensure uniform film deposition and to eliminate measurement interference caused by surface contamination. From the observed interference images, the glass substrate exhibits fewer equal-thickness interference fringes, indicating a larger ROC and a smaller overall bending deformation after deposition. In contrast, although the sapphire substrate has a higher Young’s modulus, its thickness is approximately one quarter that of the glass substrate. Consequently, its overall flexural rigidity is lower, leading to a significant increase in the number of interference fringes and a relatively smaller ROC, which indicates more pronounced bending deformation. This phenomenon is consistent with classical plate theory, in which the flexural rigidity of a plate is proportional to the cube of its thickness. As a result, the glass substrate is able to maintain better flatness after thin film deposition, whereas the sapphire substrate is more susceptible to curvature variation. These findings also support the sensitivity and accuracy of the optical measurement platform developed in this study.
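The rigidity comparison above can be checked numerically. The following is a minimal sketch that assumes the standard thin-plate expression D = E h³ / [12(1 − ν²)], which is not written out in the text; the function name is hypothetical, and the material values are those quoted for the two substrates.

```python
def flexural_rigidity(youngs_modulus_pa, thickness_m, poisson_ratio):
    """Thin-plate flexural rigidity D = E h^3 / (12 (1 - nu^2)), in N*m."""
    return youngs_modulus_pa * thickness_m**3 / (12.0 * (1.0 - poisson_ratio**2))


# Substrate properties quoted in the text, converted to SI units.
d_glass = flexural_rigidity(71.5e9, 1.6e-3, 0.22)     # B270 glass, 1.6 mm thick
d_sapphire = flexural_rigidity(335e9, 0.44e-3, 0.25)  # sapphire, 0.44 mm thick

print(f"B270 glass: D = {d_glass:.2f} N*m")
print(f"Sapphire:   D = {d_sapphire:.2f} N*m")
print(f"Ratio glass/sapphire = {d_glass / d_sapphire:.1f}")
```

Under this assumption the glass substrate comes out roughly ten times stiffer in bending than the sapphire substrate (about 25.6 N·m versus about 2.5 N·m), despite sapphire's higher Young's modulus.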
2.3. Interference Fringe Measurements
To comprehensively evaluate the residual stress of thin films after the manufacturing process, this study utilizes and cross-verifies three complementary optical analysis techniques. These include circular fringe analysis of equal thickness, fast Fourier transform (FFT) analysis of straight fringes of equal thickness, and a deep learning (DL) approach trained on the ROC (R) derived from circular fringes to predict and validate R values.
Method 1: Circular fringe analysis. The first method utilizes a modified Twyman–Green interferometer coupled with a CCD camera to capture circular interference fringes (similar to Newton’s rings) of the sample. Through edge detection technology, the radii ($r_m$) of various interference orders are automatically measured. Based on the interference conditions for bright or dark fringes, a linear relationship between the order m and $r_m^2$ is established to calculate the overall radius of curvature R.
Method 2: Straight-line fringe FFT analysis. The second method also employs the modified Twyman–Green interferometer to obtain straight-line interference fringes of equal thickness. Following grayscale conversion, denoising, and background subtraction, multiple interference fringes are selected for FFT analysis, as shown in
Figure 4. By selecting the sideband carrier from the frequency spectrum, the phase image is reconstructed and fitted to determine the substrate’s curvature radius
R.
Method 3: Deep learning prediction. The third method involves using the R values obtained from circular fringes to train various deep learning models. This process identifies the optimal predictive model, which is then used to generate predicted R values. Finally, the results from all three methods are compared and validated.
2.4. Radius of Curvature Determination Using Different Interference Fringes
This study developed a MATLAB-based program designed for the rapid and precise calculation of the ROC from circular interference fringes of coated substrates captured via a modified Twyman–Green interferometer. To optimize fringe clarity, the program initially applies Gaussian filtering and brightness enhancement to the captured images. Following the manual selection of the geometric center by the user through a dedicated MATLAB interface, the image undergoes a polar coordinate transformation. The physical radii ($r_m$) of each fringe order are then extracted by analyzing the one-dimensional intensity distribution curve using the findpeaks function. In accordance with Newton’s rings theory [22], the calculation of R is achieved by a linear regression fitting of the relationship between the fringe order (m) and $r_m^2$. The radius of the mth dark ring is $r_m$. The formula can be expressed as:
$$r_m = \sqrt{m \lambda R}$$
where λ is the wavelength of the light source, R is the radius of curvature, and m is the fringe order (m = 0, 1, 2, 3, …). This means that the radius of the ring is proportional to the square root of a natural number multiplied by λR. Thus, the rings become closer to each other as the radius increases.
Fringe spacing (Δr) increases with the curvature radius. This means that the higher the fringe density, the smaller the ROC. It can be expressed as:
$$\Delta r = r_{m+1} - r_m = \sqrt{\lambda R}\left(\sqrt{m+1} - \sqrt{m}\right) \approx \frac{1}{2}\sqrt{\frac{\lambda R}{m}}$$
The spatial frequency (f) of fringes is inversely proportional to the square root of the ROC, and the relationship is as follows:
$$f = \frac{1}{\Delta r} \approx 2\sqrt{\frac{m}{\lambda R}}$$
Therefore, the ROC is physically encoded in the fringe spacing, fringe density, and radial frequency distribution. Specifically, the program identifies bright fringe centers as $r_m^2 = (m + 1/2)\lambda R$ and dark fringe centers as $r_m^2 = m\lambda R$. If identifiable fringes are insufficient, a single-fringe formula is utilized for estimation. The final R value is derived from the regression slope, with the root mean square error (RMSE) calculated to serve as a critical indicator of data quality and fitting accuracy, ensuring the reliability of the automated verification process.
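The regression step can be illustrated with a short sketch. It assumes the dark-ring relation in which the squared ring radius grows linearly with the order m with slope λR, fits that line by ordinary least squares, and reports the RMSE of the fit in r². The authors' program is written in MATLAB; this Python version, its function name, the synthetic radii, and the choice to define the RMSE on r² are all illustrative assumptions.

```python
import math

WAVELENGTH = 632.8e-9  # He-Ne wavelength from the text, in metres


def fit_roc(orders, radii_m, wavelength=WAVELENGTH):
    """Fit r^2 = slope * m + intercept by least squares; R = slope / wavelength."""
    r2 = [r * r for r in radii_m]
    n = len(orders)
    mean_m = sum(orders) / n
    mean_r2 = sum(r2) / n
    sxx = sum((m - mean_m) ** 2 for m in orders)
    sxy = sum((m - mean_m) * (y - mean_r2) for m, y in zip(orders, r2))
    slope = sxy / sxx
    intercept = mean_r2 - slope * mean_m
    residuals = [y - (slope * m + intercept) for m, y in zip(orders, r2)]
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    return slope / wavelength, rmse


# Synthetic dark-ring radii for a hypothetical surface with R = 50 m.
true_roc = 50.0
orders = list(range(1, 9))
radii = [math.sqrt(m * WAVELENGTH * true_roc) for m in orders]

roc, rmse = fit_roc(orders, radii)
print(f"fitted R = {roc:.3f} m, RMSE of the r^2 fit = {rmse:.2e} m^2")
```

Keeping a free intercept in the fit means a constant offset in the assigned fringe order shifts the intercept but not the slope, so R does not depend on identifying the absolute order of the innermost ring.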
The accuracy and reliability of the center point selection and calculated results are validated using the root mean square error (RMSE). To streamline the workflow for subsequent ROC analysis and deep learning research, the program automatically compiles key metadata, including filenames, radii of curvature, and fitting errors, into an Excel report, providing a robust dataset for further investigation. For the calculation of the ROC using straight interference fringes of equal thickness, images acquired via the semi-automatic Twyman–Green interferometer and a CCD camera are first subjected to grayscale conversion and double-precision transformation, followed by normalization to the range [0, 1]. Users specify a center and radius to isolate the effective interference region, effectively excluding edge noise and background artifacts. A fast Fourier transform (FFT) is then applied to the preprocessed image to obtain the amplitude spectrum |F(u, v)|. The frequency spectrum obtained by fast Fourier transform of the straight-line interference fringes is illustrated in Figure 4. By manually selecting the sideband carrier center, the frequency region is extracted and filtered using a cosine window to enhance signal quality. The extracted sideband, which excludes the direct current (DC) component, is shifted to the spectral center to remove the carrier frequency before performing an inverse fast Fourier transform (IFFT). Phase information is then retrieved from the resulting complex data. To obtain a continuous phase map, a phase unwrapping method is employed to resolve 2π phase jumps. Based on interference conditions, the unwrapped phase is converted into optical thickness variations, giving the optical surface profile, and residual tilt is removed through plane fitting. The central region is subsequently used for the spherical surface reconstruction and curvature radius fitting. Furthermore, one-dimensional curve fitting is performed on the X-axis and Y-axis phase profiles to determine the directional radii of curvature, $R_x$ and $R_y$. Finally, the average of $R_x$ and $R_y$ is calculated as the ROC.
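The sideband-demodulation chain described above can be sketched in one dimension. This is a simplified illustration, not the authors' MATLAB implementation: it works on a single synthetic intensity profile instead of a 2D image, picks the sideband from a known carrier frequency instead of a manual selection, and assumes a reflection phase model of 2πx²/(λR), chosen so that a phase of 2πm falls where x² = mλR. All function and variable names are hypothetical.

```python
import numpy as np


def roc_from_fringe_profile(intensity, x, wavelength, carrier_freq, half_width):
    """Estimate R from a 1D straight-fringe profile by FFT sideband demodulation."""
    n = x.size
    dx = x[1] - x[0]
    freqs = np.fft.fftfreq(n, d=dx)
    spectrum = np.fft.fft(intensity)

    # Cosine-shaped window around the +carrier sideband. It is zero at DC and
    # at the negative sideband as long as half_width < carrier_freq.
    window = np.zeros(n)
    band = np.abs(freqs - carrier_freq) < half_width
    window[band] = 0.5 * (1.0 + np.cos(np.pi * (freqs[band] - carrier_freq) / half_width))

    sideband = np.fft.ifft(spectrum * window)
    # Multiplying by exp(-i 2 pi f0 x) is the spatial-domain equivalent of
    # shifting the selected sideband to zero frequency (carrier removal).
    baseband = sideband * np.exp(-2j * np.pi * carrier_freq * x)
    phase = np.unwrap(np.angle(baseband))

    # Fit a parabola to the central 60% of the aperture to avoid edge effects.
    # The linear and constant terms absorb residual tilt and piston.
    centre = np.abs(x) < 0.3 * (x[-1] - x[0])
    quad_coeff = np.polyfit(x[centre], phase[centre], 2)[0]
    return 2.0 * np.pi / (wavelength * quad_coeff)


# Synthetic test profile: 50 carrier fringes across a 10 mm aperture.
wavelength = 632.8e-9
true_roc = 50.0                       # hypothetical surface, in metres
aperture = 10e-3
x = np.linspace(-aperture / 2, aperture / 2, 2048, endpoint=False)
carrier = 50 / aperture               # cycles per metre
phase_true = 2 * np.pi * x**2 / (wavelength * true_roc)
intensity = 1.0 + 0.8 * np.cos(2 * np.pi * carrier * x + phase_true)

roc = roc_from_fringe_profile(intensity, x, wavelength, carrier, half_width=carrier / 2)
print(f"estimated R = {roc:.2f} m")
```

Applying the same fit to profiles taken along the X and Y axes would give two directional estimates, which can then be averaged as the text describes.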
2.5. Selection of Deep Learning Models
This study evaluates and compares three deep learning models (CNN, ResNet-18, and VGG-16) to determine which architecture offers the highest accuracy in predicting the ROC from interference images. The implementation follows a comprehensive five-stage workflow: (1) data labeling and grouping, (2) preprocessing and augmentation, (3) model construction, (4) training and validation, and (5) testing, evaluation, and result export. Initially, a Python-based workflow using Pandas is employed to read curvature labels from Excel files, mapping each circular interference fringe image to its corresponding numerical R value. The dataset is categorized into two groups based on the radius size: “Small ROC” (20–60 m), as shown in Figure 5a, and “Large ROC” (over 100 m), as shown in Figure 5b. Each group consists of 500 samples. To examine the influence of substrate material on model performance, B270 glass substrates are assigned to the large-radius group, while sapphire substrates constitute the small-radius group. During the data loading phase, several augmentation and normalization techniques are applied to enhance model robustness. Images are rescaled and center-cropped to 224 × 224 pixels, followed by random horizontal flipping, random rotation (±10°), and color jittering. The data is then converted into tensors and normalized using the mean and standard deviation of the ImageNet dataset. The total pool of 1000 images is split into training, validation, and testing sets using a 70%/15%/15% ratio. A batch size of 32 is utilized for iterative parameter updates. Since this experiment treats curvature prediction as a regression task, the standard classification heads of the backbone networks (CNN, VGG-16, or ResNet-18) are replaced with regression heads. The architecture extracts high-level features from the interference fringes, compressing the feature map from C × H × W to C × 1 × 1 via global pooling, which is then flattened into a feature vector z of length C. A final linear fully connected layer with an output dimension of 1 is appended, where the prediction is calculated as ŷ = Wz + b. The training process minimizes either MSE (Mean Squared Error) or Huber loss to reduce the discrepancy between the predicted value ŷ and the measured value y obtained from the circular fringe analysis. This regression design ensures the output of a single continuous numerical value for the ROC.
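The regression head described above reduces to two array operations. The sketch below uses NumPy in place of a deep learning framework and assumes that "global pooling" means global average pooling; the feature map, weights, and bias are random placeholders, not trained values.

```python
import numpy as np


def regression_head(feature_map, weight, bias):
    """Global average pooling (C x H x W -> C) followed by y_hat = W z + b."""
    z = feature_map.mean(axis=(1, 2))   # feature vector z of length C
    return float(weight @ z + bias)     # single continuous output


rng = np.random.default_rng(0)
features = rng.normal(size=(512, 7, 7))   # e.g. a 512 x 7 x 7 backbone output
w = rng.normal(size=512) * 0.01           # hypothetical weights of the final layer
b = 40.0                                  # hypothetical bias

y_hat = regression_head(features, w, b)
print(f"predicted ROC (arbitrary example): {y_hat:.3f}")
```

Because pooling collapses the spatial axes, the length of z depends only on the channel count C, so the same head can follow backbones with different output resolutions.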
In terms of model architecture, this study first implements a custom CNN consisting of three sequential blocks. Each block includes a 3 × 3 convolutional layer, a ReLU activation function, and Max Pooling, with filter counts set at 32, 64, and 128, respectively. Following feature extraction, the data is flattened and passed through a fully connected (FC) layer with 256 nodes, utilizing ReLU activation and a 50% dropout rate to mitigate overfitting, before finally outputting to a single node for the prediction. To provide a comprehensive comparison, two pre-trained models (ResNet-18 and VGG-16) are also evaluated. For ResNet-18, weights pre-trained on ImageNet are loaded, preserving the backbone structure while replacing the final classification layer with a single-neuron output. Similarly, the VGG-16 model has its original classification head removed and replaced by two 4096-node FC layers with ReLU activations followed by a single-node output layer: 4096→ReLU→4096→ReLU→1. All computations are assigned to a GPU to maximize processing efficiency.
The training phase for all three models employs Huber loss (with a tolerance threshold δ = 0.7) and the Adam optimizer (Learning Rate = 2 × 10⁻⁵, Weight Decay = 1 × 10⁻⁴). A ReduceLROnPlateau [23] scheduler monitors the validation loss; if the loss does not improve for 15 consecutive epochs, the learning rate is automatically reduced by a factor of 0.5. The training is conducted for 1000 epochs, during which the model with the lowest validation loss is automatically saved.
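To make the loss and the learning-rate rule concrete, the sketch below implements the Huber loss for a single sample with δ = 0.7 and a simplified plateau rule that follows the wording above (halve the rate after 15 consecutive epochs without improvement). It is a plain-Python illustration and not the library scheduler, which has further options such as an improvement threshold and a cooldown period; the function names and the loss sequence are hypothetical.

```python
def huber_loss(prediction, target, delta=0.7):
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error**2
    return delta * (error - 0.5 * delta)


def plateau_schedule(val_losses, lr=2e-5, factor=0.5, patience=15):
    """Replay validation losses and return the learning rate used after each epoch."""
    best = float("inf")
    stale = 0                      # consecutive epochs without improvement
    history = []
    for loss in val_losses:
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:  # simplified rule taken from the text
                lr *= factor
                stale = 0
        history.append(lr)
    return history


# Small errors are penalised quadratically, large ones only linearly.
print(huber_loss(10.0, 10.5), huber_loss(10.0, 12.0))

# Hypothetical validation curve: two improving epochs, then a 15-epoch plateau.
losses = [1.0, 0.8] + [0.9] * 15
rates = plateau_schedule(losses)
print(f"first lr = {rates[0]:.1e}, last lr = {rates[-1]:.1e}")
```

The linear tail of the Huber loss limits the influence of occasional badly labelled or noisy fringe images on the gradient, which is one common reason for preferring it over MSE in regression.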
Upon completion, the best-performing models are evaluated on the test set using RMSE and MAE metrics. The predictive performance is visualized through scatter plots of “Experimental Measurement vs. Deep Learning Prediction” and line graphs of “Training/Validation Loss.” Finally, the test results, loss data, and predictions for unknown circular interference images are exported to an Excel file for further analysis. This deep learning-based framework, as illustrated in
Figure 6, successfully automates the derivation of the ROC directly from raw interference fringe images.
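The evaluation metrics can be computed as in the following sketch. RMSE and MAE are named above; MAPE is included because it is reported in the conclusions. The function name and the three measured/predicted pairs are hypothetical values chosen only to exercise the code.

```python
import math


def regression_metrics(measured, predicted):
    """Return MAE, RMSE, and MAPE (in percent) between measured and predicted values."""
    errors = [p - m for m, p in zip(measured, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e) / abs(m) for e, m in zip(errors, measured)) / n
    return {"MAE": mae, "RMSE": rmse, "MAPE_percent": mape}


measured = [25.0, 40.0, 55.0]    # hypothetical ROC values from fringe analysis, in m
predicted = [26.0, 38.0, 55.0]   # hypothetical model outputs, in m

metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

RMSE weights large errors more heavily than MAE, so a gap between the two indicates that a few samples dominate the error; MAPE normalises by the measured value and therefore allows the small- and large-ROC groups to be compared.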
2.5.1. CNN Model Architecture
This study introduces three model designs in sequence, starting with the CNN architecture, as illustrated in
Figure 7. From left to right, the input layer accepts standardized circular interference fringe images with a resolution of 224 × 224 pixels in a single-channel (grayscale) format. The initial processing stage, Conv1, utilizes thirty-two 3 × 3 convolutional kernels with a padding of 1 to maintain spatial dimensions. This is followed by Batch Normalization (Batch Norm) to stabilize training, a ReLU activation function to enhance non-linear representation, and Max Pooling for spatial downsampling, resulting in an output dimension of 112 × 112 × 32. The second stage, Stage 2, repeats this convolutional sequence but increases the filter count to 64, with the output size reduced to 28 × 28 × 128 after pooling. To extract deeper features, the Stage 3 module also employs 3 × 3 kernels with 128 filters. Unlike the preceding stages, Stage 3 utilizes Adaptive Average Pooling instead of Max Pooling to compress the feature maps into a 1 × 1 dimension (resulting in an output dimension of 1 × 1 × 256), which effectively fixes the input size for the subsequent layers. The fourth stage is a dropout layer (
p = 0.5), which randomly omits half of the neurons during training to prevent overfitting and improve the model’s generalization capabilities. Finally, the 256-dimensional feature vector is flattened and passed through a fully connected (FC) layer, which outputs the final predicted ROC as a continuous numerical value. In summary, this four-layer convolutional network integrates Batch Norm and ReLU after each convolution to ensure stable learning. To visualize the training progression and predictive performance, this study utilizes the Matplotlib library to generate training/validation loss curves and scatter plots comparing experimental measurements with predicted values. These visualizations provide an intuitive assessment of the model’s accuracy and training state, serving as a robust foundation for subsequent analysis and discussion.
2.5.2. ResNet-18 Architecture
Based on the ResNet-18 architecture, this study developed a deep learning model designed to automatically predict the ROC (R) of thin-film surfaces from circular interference fringe images through a streamlined six-module process, as illustrated in Figure 8. The architecture initiates feature extraction with a Conv1 layer utilizing a 7 × 7 convolutional kernel with a stride of 2, which reduces the 224 × 224 input images to 112 × 112 with 64 channels, followed by Max Pooling, which further reduces the feature map to 56 × 56. This stage isolates low-level features such as edges and brightness distribution. The core of the network is structured into four sequential stages, each containing two Basic Blocks that employ Residual Connections to mitigate the common issues of gradient vanishing and degradation in deep networks. Stage 1 utilizes 3 × 3 convolutions and Identity Shortcuts to produce a 56 × 56 × 64 feature map that captures the directional and structural characteristics of the fringes, while Stage 2 and Stage 3 transition to Projection Shortcuts (1 × 1) to enhance the identification of fringe widths and deepen feature abstraction, resulting in output dimensions of 28 × 28 × 128 and 14 × 14 × 256, respectively. The final feature extraction in Stage 4 generates high-level semantic features with an output dimension of 7 × 7 × 512 through channel alignment. Although the original ResNet-18 was optimized for 1000-class classification, this study adapts the framework for a regression task by replacing the final fully connected (FC) layer with a single-neuron output layer. Following the training phase, the model is evaluated on a test set using RMSE and MAE metrics, with performance further visualized through scatter plots of experimental measurements versus predicted values and line graphs representing training and validation loss. This integrated methodology culminates in an automated measurement workflow that applies the best-performing model to unknown image data and automatically exports all results and loss curves into separate Excel sheets for efficient data management.
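The feature-map sizes quoted for ResNet-18 follow from the usual output-size formula for convolution and pooling layers. The sketch below traces them for a 224 × 224 input; the padding values (3 for the 7 × 7 convolution, 1 for the 3 × 3 layers) and the 3 × 3 pooling kernel are those of the standard ResNet-18 design and are assumptions here, since the text does not state them.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


trace = {}
size = 224

size = conv_out(size, kernel=7, stride=2, padding=3)   # Conv1
trace["conv1"] = (size, size, 64)
size = conv_out(size, kernel=3, stride=2, padding=1)   # Max Pooling
trace["maxpool"] = (size, size, 64)

# Only the first 3 x 3 convolution of each stage changes the spatial size.
stages = [("stage1", 64, 1), ("stage2", 128, 2), ("stage3", 256, 2), ("stage4", 512, 2)]
for name, channels, stride in stages:
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    trace[name] = (size, size, channels)

for name, shape in trace.items():
    print(f"{name:8s} -> {shape[0]} x {shape[1]} x {shape[2]}")
```

The stride-2 convolution at the start of stages 2 to 4 is also where the channel count doubles, which is why those stages need a projection shortcut rather than an identity shortcut.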
2.5.3. VGG-16 Model
The VGG-16 model is utilized as the deep learning model for the curvature radius prediction task to evaluate the performance of deep convolutional architectures in image analysis, with the architectural diagram shown in
Figure 9. The input layer receives RGB images with a resolution of 224 × 224 × 3, and for grayscale images, the single channel is replicated three times to meet the standard input format requirements. Conv Block 1 comprises two 3 × 3 convolutional layers with 64 channels, followed by a MaxPool2d layer for downsampling to an output size of 112 × 112 × 64, primarily responsible for extracting low-level features such as edges and intensity distributions. Conv Block 2 similarly consists of two 3 × 3 convolutional layers with the channel count expanded to 128, paired with a Max Pooling layer to produce an output size of 56 × 56 × 128, further capturing mid-level features like fringe orientation and center distribution. Conv Block 3 includes three convolutional operations with the channel count increased to 256, along with a MaxPool2d layer, resulting in an output dimension of 28 × 28 × 256, enabling the module to identify variations in the amplitude and distribution of interference rings. Conv Block 4 contains three convolutional layers with 512 channels and, after Max Pooling, output size reaches 14 × 14 × 512, possessing high abstraction capabilities to integrate different fringe ranges and bending patterns. Conv Block 5 also consists of three convolutional layers and one MaxPool2d layer, with the output size reduced to 7 × 7 × 512 to form the final feature map. For the regression task, the 7 × 7 × 512 feature map is flattened into a one-dimensional vector, followed by two 4096-dimensional FC layers and a single-output FC layer (FC-1) for curvature radius prediction; although the original VGG-16 was designed for a 1000-class classification task, this study modifies the output layer from FC-1000 to FC-1 to address the regression problem. 
To ensure accuracy, the classification output layer of the original model is replaced with a single FC layer, enabling the model to output continuous values corresponding to the thin film’s curvature radius while retaining pre-trained weights to preserve high-level feature extraction capabilities. Finally, the model’s input resolution remains 224 × 224, and preprocessing steps, including random rotation and color jittering, are applied to enhance generalization performance.
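As a rough check on the size of the modified VGG-16 head, the sketch below computes the flattened feature length after five 2 × 2 max-pooling steps and counts the trainable parameters of the FC sequence described above. It is plain arithmetic under the stated layer sizes; the helper name is hypothetical.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


size = 224
for _ in range(5):          # five conv blocks, each ending in a 2 x 2 max-pool
    size //= 2
flat = size * size * 512    # flattened 7 x 7 x 512 feature map

head = [(flat, 4096), (4096, 4096), (4096, 1)]
counts = [linear_params(i, o) for i, o in head]

print(f"flattened length: {flat}")
for (i, o), c in zip(head, counts):
    print(f"FC {i} -> {o}: {c:,} parameters")
print(f"total head parameters: {sum(counts):,}")
print(f"original FC-1000 output layer: {linear_params(4096, 1000):,} parameters")
```

Almost all of the head's parameters sit in the first FC layer, so switching the output from FC-1000 to FC-1 changes the model size very little; most of the head's capacity comes from the two 4096-dimensional layers.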
4. Conclusions
This study successfully developed a semi-automated and highly efficient measurement system for curvature radius by integrating circular interference fringes and deep learning models. Among the three models compared, ResNet-18 demonstrated the most outstanding overall performance in curvature radius prediction. For small-curvature-radius samples (sapphire substrates), ResNet-18 achieved the lowest error metrics with an MAE of 1.66 m, RMSE of 2.43 m, and MAPE of 5.44%, indicating high precision in learning subtle fringe variations. In the large-curvature-radius samples (glass substrates), ResNet-18 yielded an MAE of 5.02 m, RMSE of 6.55 m, and a MAPE of only 3.40%, significantly outperforming both the CNN and VGG-16 models. ResNet-18 not only balances accuracy and stability but also exhibits superior generalization across different curvature ranges, making it the most promising predictive model in this work.
Furthermore, the accuracy of the circular interference fringe prediction via deep learning models was validated against the conventional FFT method using straight interference fringes. Across five sets of samples, the relative error remained below 1.7%, confirming that the results of both methods are highly consistent, accurate, and reproducible. This shows that the model can precisely perform regression for curvature radius and be effectively applied to physical property estimation as an alternative tool for optical measurement. Finally, a graphical user interface (GUI) was developed to integrate image uploading, model inference, and ROC calculation. Operating in a GPU environment, the inference time per image is only 5–20 ms, improving overall measurement efficiency by over 90% while significantly reducing human error and time costs. In conclusion, the results of this study not only scientifically validate the feasibility of applying deep learning to interference image analysis but also provide an innovative, accurate, and practical solution for the non-destructive testing of coated substrates.