Measurement of Film Structure Using Time-Frequency-Domain Fitting and White-Light Scanning Interferometry

Abstract: A new technique is proposed for measuring film structure based on the combination of time- and frequency-domain fitting and white-light scanning interferometry. The approach requires only a single scan and employs a fitting method to obtain the film thickness and the upper surface height in the frequency and time domains, respectively. The cross-correlation function is applied to obtain the initial value of the upper surface height, thereby making the fitting process more accurate. Standard films (SiO2) with different thicknesses were measured to verify the accuracy and reliability of the proposed method, and the three-dimensional topographies of the upper and lower surfaces of the films were reconstructed.


Introduction
Fabrication processes in the semiconductor and display industries invariably involve continuous deposition and etching to integrate thin films on silicon substrates or transparent glass, and measuring the film thickness and topography is essential for increasing yields and monitoring product quality. Spectroscopic reflectometers are used widely to measure the thickness and the refractive index of thin films by observing the intensity of reflected light [1][2][3], and ellipsometry is well-known in thin-film measurement for its high precision and accuracy [4][5][6][7]. However, those methods give only the film thickness, whereas it is also necessary to obtain the upper or lower surface height when measuring the topography. The film thickness and height can be estimated from the phase information of a white-light spectral interferometer (WSI) [8][9][10], but WSIs cannot be applied to full-field measurement and have low lateral resolution.
Instead, the surface topography can be obtained using white-light vertical scanning interferometers (VSIs) [11][12][13]. Limited by the coherence length of the light source, traditional VSI analysis methods such as coherent peak detection (CPD) algorithms can only obtain the thicknesses of relatively thick films [14]. For measuring thin films, frequency-domain analysis is usually required. The phase analysis in the frequency domain determines the film thickness through the nonlinear phase caused by the film [15,16]. This method is not reliable when the thickness is less than 100 nm, and the approach relies on the accuracy of the initial value of the fitting and the removal of the nonlinear phase of the measurement system. de Groot and de Lega proposed a VSI signal model [17], and several researchers have used it to obtain the film reflectivity and thickness from Fourier-transform information [18][19][20]. In those methods, the tested sample is compared with a reference sample to solve the film-thickness issue, but such approaches usually require the tested and reference samples to be scanned separately and the light-source intensity and experimental parameters to remain consistent in the two measurements, thereby setting higher requirements for the experimental equipment and operation [21][22][23][24].
Herein, we propose a method for measuring films in the full field of view based on single VSI scanning. The method exploits the phenomenon that the VSI signals of different film thicknesses have different amplitude shapes in the frequency domain; the thickness of the film is measured via frequency-domain amplitude fitting, and the height of the upper surface of the film is measured through the time-domain fitting of the VSI signal. In the proposed method, the tested sample requires only one vertical scan to measure the film thickness and the height of the upper surface, whereupon the three-dimensional (3D) topographies of the upper and lower surfaces of the film are reconstructed.

System Structure
The experimental configuration is presented in Figure 1. A Michelson-type VSI was used as the optical system. A 5× interference objective lens (CF EPI Plan TI 5×; Nikon, Tokyo, Japan) and a 1× tube lens were used to form an image on the camera, where the lateral resolution was 2.1 µm. A coaxial illumination beam from a halogen lamp was used as the light source, and the camera (acA1300-200um CMOS; Basler, Ahrensburg, Germany) featured 1280 × 1024 pixels with a pixel size of 4.8 µm × 4.8 µm. A high-precision piezoelectric (PZT) objective scanner (P-721; Physik Instrumente, Karlsruhe, Germany) provided precise scanning in the Z direction. The PZT used a capacitive sensor with a stroke of 100 µm in the closed-loop mode and had a resolution of 0.7 nm, a linear error of 0.03 nm, and a repeatability of ±5 nm. The system was placed on an active vibration-isolation table to reduce the influence of external vibrations.
Machines 2021, 9, x FOR PEER REVIEW

VSI Modeling for Film Samples
According to interference theory, the intensity of the interfering light can be expressed as:
$$I = I_r + I_s + 2\sqrt{I_r I_s}\,\cos\delta, \tag{1}$$
where $I_r$ and $I_s$ are the light intensities of the reference and sample light beams, respectively, and $\delta$ is the phase difference between the two beams.
When the sample is a thin film, both the intensity and the phase of the reflected light change. The Fresnel equations are used to describe this model [25]. The total reflection coefficient $R$ can be expressed as:
$$R = \frac{r_{12} + r_{23}\,e^{-j2\beta}}{1 + r_{12}\,r_{23}\,e^{-j2\beta}}, \qquad \beta = \frac{2\pi\, d\, N_2 \cos\theta}{\lambda}, \tag{2}$$
where $r_{12}$ and $r_{23}$ are the Fresnel reflection coefficients of the upper and lower surfaces of the film, respectively, $d$ is the film thickness, $N_2$ is the complex refractive index of the film, $\theta$ is the incidence angle, and $\lambda$ is the wavelength. The squared modulus of the total reflection coefficient, $R_{obj} = |R|^2$, represents the ratio of the intensity of the reflected light to the intensity of the incident light, and the phase $\angle R$ represents the phase caused by the film. Let the beam splitter divide the light intensity in the ratio $m_r : m_s$ with $m_r + m_s = 1$, let the reflectivity of the reference mirror be $R_{ref}$, and let the influence of the numerical aperture of the objective lens be ignored. If the intensity of the light reflected by the reference mirror is $I_{ref}(k)$, then by combining Equations (1) and (2), the VSI signal of a film at wavenumber $k = 1/\lambda$ can be expressed as:
$$I(k, z) = I_{ref}(k)\left[1 + c(k)\,R_{obj} + 2\sqrt{c(k)\,R_{obj}}\,\cos\big(4\pi k\,(z - h) + \angle R\big)\right], \tag{3}$$
where $z$ is the scanning position, $h$ is the height of the upper surface, and $c(k)$ is a coefficient that accounts for the beam-splitter ratio and the reference-mirror reflectivity. The position of the reference mirror in Figure 1 is the zero point of the scanning position, and the scanning direction from bottom to top is defined as the positive direction. When measuring films thinner than 100 nm, the parameter $c(k)$ must be calibrated to ensure more accurate results; we discuss this in Section 3.2.
When the signal is collected by a digital camera with a spectral response of $G(k)$, the white-light interference signal is the integral of the single-wavenumber signal between wavenumbers $k_1$ and $k_2$ and can be written as:
$$I(z) = \int_{k_1}^{k_2} G(k)\, I(k, z)\, dk. \tag{4}$$
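The signal model can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes normal incidence, constant (dispersion-free) refractive indices for SiO2 and Si, $G(k) = c(k) = 1$, and a Gaussian source spectrum standing in for $I_{ref}(k)$. All function names and numerical values are hypothetical.

```python
import cmath
import math

# Assumed, dispersion-free refractive indices (illustrative values only).
N_AIR = 1.0
N_FILM = 1.46            # SiO2
N_SUB = 3.9 - 0.02j      # Si, written as n - j*kappa


def film_reflection(k, d):
    """Total reflection coefficient R of a single film at normal incidence.

    k is the wavenumber 1/lambda in 1/um and d the film thickness in um.
    """
    r12 = (N_AIR - N_FILM) / (N_AIR + N_FILM)
    r23 = (N_FILM - N_SUB) / (N_FILM + N_SUB)
    beta = 2 * math.pi * k * d * N_FILM
    e = cmath.exp(-2j * beta)
    return (r12 + r23 * e) / (1 + r12 * r23 * e)


def mono_signal(k, z, h, d, i_ref=1.0, c=1.0):
    """VSI signal at a single wavenumber: DC term plus one cosine fringe."""
    r = film_reflection(k, d)
    r_obj = abs(r) ** 2
    phase = 4 * math.pi * k * (z - h) + cmath.phase(r)
    return i_ref * (1 + c * r_obj + 2 * math.sqrt(c * r_obj) * math.cos(phase))


def gaussian_spectrum(k, center_um=0.58, fwhm_um=0.14):
    """Gaussian source spectrum in wavelength, evaluated at k = 1/lambda."""
    lam = 1.0 / k
    return math.exp(-4 * math.log(2) * ((lam - center_um) / fwhm_um) ** 2)


def white_light_signal(z, h, d, k1=1.0, k2=3.0, n_k=200):
    """White-light signal as a midpoint-rule sum over wavenumbers (G(k) = 1)."""
    dk = (k2 - k1) / n_k
    total = 0.0
    for j in range(n_k):
        k = k1 + (j + 0.5) * dk
        total += mono_signal(k, z, h, d, i_ref=gaussian_spectrum(k)) * dk
    return total


# Example: a 6 um scan in 30 nm steps around an upper surface at h = -6 um.
z_scan = [-9.0 + 0.03 * i for i in range(201)]
trace = [white_light_signal(z, h=-6.0, d=0.4) for z in z_scan]
```

A useful sanity check on the reflection formula is that a film of zero thickness reduces to the bare air/substrate Fresnel coefficient.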

Time- and Frequency-Domain Fitting
In this section, we use the VSI signal model of the film sample to describe a method of film topography reconstruction based on time- and frequency-domain fitting. The time domain here is the domain of the original VSI signal, and the frequency domain is the domain of the VSI signal after Fourier transform.
In this method, the film thickness d is calculated by frequency-domain fitting, and the height of the upper surface of the film h is obtained by time-domain fitting. The height of the lower surface can be calculated by subtracting the thickness of the film d from the height of the upper surface h. The topography of the film is reconstructed by this process. The flowchart of the proposed method is shown in Figure 2. The time- and frequency-domain fitting and simulation will be described in detail below.

Principle of the Frequency-Domain Fitting
Equations (3) and (4) show that the interference signal for each wavenumber $k$ is a DC signal $G(k)\,I_{ref}(k)\,[1 + c(k)\,R_{obj}]$ plus a cosine signal with an amplitude of $2\,G(k)\,I_{ref}(k)\sqrt{c(k)\,R_{obj}}$. If the interference signal $I(z)$ is Fourier-transformed, then the time-domain signal is transformed into the frequency domain, where its amplitude at wavenumber $k$ is:
$$|F(k)| \propto G(k)\, I_{ref}(k)\sqrt{c(k)\, R_{obj}(k, d)}. \tag{5}$$
The amplitude in the frequency domain can be regarded as the intensity of the light source modulated by the film. Therefore, the film-thickness information can be obtained from the amplitude information. Normalizing the Fourier-transform amplitude over the visible-light band makes this method insensitive to changes in the overall intensity of the light-source spectrum, as long as the spectral shape remains constant.
The frequency-domain analysis of the interference signal was illustrated by simulation. In Figure 3a, Equation (4) was used to simulate the VSI signal of a thin SiO2 film deposited on a silicon substrate with film thicknesses of 100, 300, 400, 500, 1000, and 1500 nm and a scanning step of 30 nm. The white-light source was set as a Gaussian light source (full width at half maximum, 140 nm; central wavelength, 580 nm). The beam splitter was assumed to split the white light into two beams with an intensity ratio of 1:1, and the reflectivity of the reference mirror was taken as 1 for visible light. Figure 3b shows that the normalized amplitudes differed markedly with film thickness.
Therefore, the Fourier-transform-normalized amplitude $S_m(k)$ of the measurement data can be compared with the theoretical Fourier-transform-normalized amplitude $S_t(k, d)$ by minimizing the parameter $e(d)$, which can be expressed as:
$$e(d) = \sum_{k}\left[S_m(k) - S_t(k, d)\right]^2. \tag{6}$$
Because this method is based on the Fourier transform, for discrete signals the accuracy of the scanning step affects the measurement results, especially for extremely thin films. The influences of the step error of the scanner on the measurements of films with different thicknesses were simulated. The simulation generated signals of five different thicknesses for SiO2, i.e., 1000, 300, 100, 80, and 50 nm, with a scanning step of 30 nm. Table 1 lists the mean values and standard deviations of the film thicknesses for 10 measurements after adding Gaussian noise (zero mean and standard deviations of 1 and 5 nm) to the scanning step. Table 1 shows that when the film thickness was less than 100 nm, the error of the scanning step had a greater impact on the measured film thickness. Therefore, when measuring extremely thin films, this method requires a highly accurate scanning step.
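The frequency-domain fit can be illustrated with a small noise-free simulation. The sketch below is not the authors' code and makes several assumptions: normal incidence, constant refractive indices, $G(k) = c(k) = 1$, a Gaussian source, normalization of the amplitude to unit maximum, and an exhaustive 1 nm grid search in place of a nonlinear optimizer. The Fourier amplitude is evaluated directly at chosen wavenumbers, using the fact that a fringe frequency of $2k$ along $z$ corresponds to wavenumber $k$.

```python
import cmath
import math

N_AIR, N_FILM, N_SUB = 1.0, 1.46, 3.9 - 0.02j   # assumed constant indices


def reflection(k, d):
    """Normal-incidence reflection coefficient of the film (k in 1/um, d in um)."""
    r12 = (N_AIR - N_FILM) / (N_AIR + N_FILM)
    r23 = (N_FILM - N_SUB) / (N_FILM + N_SUB)
    e = cmath.exp(-4j * math.pi * k * d * N_FILM)
    return (r12 + r23 * e) / (1 + r12 * r23 * e)


def spectrum(k):
    """Assumed Gaussian source: 580 nm centre, 140 nm FWHM."""
    return math.exp(-4 * math.log(2) * ((1 / k - 0.58) / 0.14) ** 2)


def simulate_scan(z, h, d, k1=1.0, k2=3.0, dk=0.01):
    """White-light signal I(z) as a discrete sum over wavenumbers."""
    terms = []
    for j in range(round((k2 - k1) / dk)):
        k = k1 + (j + 0.5) * dk
        r = reflection(k, d)
        terms.append((k, spectrum(k) * dk, abs(r), cmath.phase(r)))
    return [
        sum(w * (1 + a * a + 2 * a * math.cos(4 * math.pi * k * (zi - h) + p))
            for k, w, a, p in terms)
        for zi in z
    ]


def normalized(values):
    """Scale a list so that its maximum equals 1."""
    peak = max(values)
    return [v / peak for v in values]


def measured_amplitude(z, signal, ks):
    """Normalized Fourier amplitude S_m(k) of the mean-subtracted scan."""
    mean = sum(signal) / len(signal)
    amps = []
    for k in ks:
        f = sum((s - mean) * cmath.exp(-4j * math.pi * k * zi)
                for zi, s in zip(z, signal))
        amps.append(abs(f))
    return normalized(amps)


def theoretical_amplitude(ks, d):
    """Normalized S_t(k, d), proportional to spectrum(k) * |R(k, d)|."""
    return normalized([spectrum(k) * abs(reflection(k, d)) for k in ks])


def fit_thickness(s_m, ks, candidates):
    """Return the candidate thickness that minimizes e(d)."""
    def e(d):
        s_t = theoretical_amplitude(ks, d)
        return sum((m - t) ** 2 for m, t in zip(s_m, s_t))
    return min(candidates, key=e)


step = 0.03                                     # 30 nm scanning step
z = [-5.0 + i * step for i in range(334)]       # roughly -5 um to +5 um
ks = [1.35 + 0.025 * i for i in range(47)]      # wavenumbers 1.35 to 2.5 1/um
signal = simulate_scan(z, h=0.2, d=0.300)       # synthetic "measurement"
s_m = measured_amplitude(z, signal, ks)
d_fit = fit_thickness(s_m, ks, [i * 0.001 for i in range(1501)])
```

Subtracting the mean before the transform keeps the DC term from leaking into the fitted band. A grid search is used here only because it cannot be trapped in a local minimum; with real data, a coarse grid followed by a local optimizer would be a natural alternative.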

Principle of the Time-Domain Fitting
After determining the film thickness $d$, the height of the upper surface $h$ is determined to reconstruct the surface topography of the film. Given that the VSI signal in the time domain is also a function of the upper surface height $h$, nonlinear fitting of the time-domain signal can be used to determine the upper surface height of the film as the value that minimizes the residual $\chi(h)$, which can be described as:
$$\chi(h) = \sum_{z}\left[I_m(z) - I_t(h, z)\right]^2, \tag{7}$$
where $I_m(z)$ is the measured time-domain signal, and $I_t(h, z)$ is the theoretical time-domain signal obtained by substituting the thickness $d$ from the frequency-domain fitting into Equation (4). We simulated a signal with an upper surface height of $h = -6$ µm and a film thickness of 400 nm. Figure 4a shows the residuals $\chi(h)$ of the time-domain fitting, and as shown in Figure 4b, there were many local minima that could cause incorrect results if the initial fitting value was not suitable.

Each $h$ in the search range was substituted into the model to calculate the residual between the theoretical and measured values according to Equation (7). However, Equation (3) shows that the time-domain fitting is a process whereby the theoretical signal is shifted until it coincides with the measured signal to the maximum degree. Therefore, the cross-correlation function can be applied to determine the height value, which can be written as:
$$R(m) = \sum_{n} I_m(z_n)\, I_t(z_{n-m}), \tag{8}$$
where $m$ is the offset in units of the scanning step.
As shown in Figure 5, when processing the experimental data, we calculated one theoretical signal for the initial value of the height. Then, we shifted it and calculated the degree of coincidence (cross-correlation coefficient $R(m)$) between the theoretical and measured values through the cross-correlation function. When the cross-correlation coefficient was maximized, the offset was added to the original initial value of the height to obtain the value of the height. Compared with Equation (7), only one theoretical signal needed to be calculated, thereby accelerating the calculation. However, the accuracy of the height value obtained was limited by the scanning step. Equation (8) can therefore be used to quickly determine an initial height value near the global optimum, and a more accurate height value can then be obtained by Equation (7).
Noise with signal-to-noise ratios (SNRs) of 35, 30, and 25 dB was added to the simulation signal (the height was −6 µm, and the thickness was 300 nm), and the results for 10 simulation measurements are given in Table 2. The SNR during the experiment was typically 30-40 dB. Table 2 shows that this method is insensitive to noise when measuring the height.
After obtaining the film thickness and the upper surface height, the height of the lower surface can be calculated by subtracting the thickness of the film $d$ from the height of the upper surface $h$. The upper and lower surfaces of the film can then be reconstructed.
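The two-stage height search (a coarse cross-correlation estimate followed by refinement of the residual) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the signal model uses normal incidence, constant refractive indices, and a Gaussian source; the "measured" signal is synthetic and noise-free; the correlation is left unnormalized; and the refinement is a 1 nm grid search within one scanning step of the coarse estimate. All names and values are hypothetical.

```python
import cmath
import math

N_AIR, N_FILM, N_SUB = 1.0, 1.46, 3.9 - 0.02j   # assumed constant indices
STEP = 0.03                                      # scanning step in um


def model_terms(d, k1=1.2, k2=2.8, dk=0.02):
    """Per-wavenumber weight, |R| and phase of R for a film of thickness d (um)."""
    r12 = (N_AIR - N_FILM) / (N_AIR + N_FILM)
    r23 = (N_FILM - N_SUB) / (N_FILM + N_SUB)
    terms = []
    for j in range(round((k2 - k1) / dk)):
        k = k1 + (j + 0.5) * dk
        e = cmath.exp(-4j * math.pi * k * d * N_FILM)
        r = (r12 + r23 * e) / (1 + r12 * r23 * e)
        weight = math.exp(-4 * math.log(2) * ((1 / k - 0.58) / 0.14) ** 2) * dk
        terms.append((k, weight, abs(r), cmath.phase(r)))
    return terms


def signal(z, h, terms):
    """Theoretical time-domain signal I_t(h, z) on the scan positions z."""
    return [sum(w * (1 + a * a + 2 * a * math.cos(4 * math.pi * k * (zi - h) + p))
                for k, w, a, p in terms) for zi in z]


def centered(values):
    """Subtract the mean so that the DC level does not dominate the correlation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def best_shift(measured, theory):
    """Shift m (in samples) that maximizes R(m) = sum_n I_m[n] * I_t[n - m]."""
    a, b = centered(measured), centered(theory)
    n = len(a)

    def corr(m):
        return sum(a[i] * b[i - m] for i in range(max(0, m), min(n, n + m)))

    return max(range(-n + 1, n), key=corr)


def refine_height(z, measured, terms, h_init, half_range=STEP, step=0.001):
    """Minimize the residual chi(h) on a fine grid around h_init."""
    def chi(h):
        return sum((m - t) ** 2 for m, t in zip(measured, signal(z, h, terms)))
    n = round(half_range / step)
    return min((h_init + i * step for i in range(-n, n + 1)), key=chi)


z = [-10.0 + i * STEP for i in range(268)]   # scan from -10 um to about -2 um
terms = model_terms(d=0.300)                 # thickness from the frequency-domain fit
h_true = -6.025
measured = signal(z, h_true, terms)          # stands in for the measured I_m(z)

h0 = -5.0                                    # arbitrary starting height
m = best_shift(measured, signal(z, h0, terms))
h_init = h0 + m * STEP                       # coarse, limited by the scanning step
h_fit = refine_height(z, measured, terms, h_init)
```

Because the coarse estimate lands within one scanning step of the true height, the refinement window is much narrower than the fringe period (about half the central wavelength), so the neighbouring local minima of the residual fall outside the search range.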

System Calibration
Before measurement, the system parameters in Equations (3) and (4) must be calibrated. First, the spectrum of the light reflected from the reference mirror (corresponding to $I_{ref}(k)$ in Equation (3)) was measured by a spectrometer (QE Pro; Ocean Optics, Dunedin, FL, USA) with the measuring optical path blocked, and the spectral response of the camera (corresponding to $G(k)$ in Equation (4)) was obtained from the camera's manual; these are shown in Figure 6a.
Second, the calibration coefficient $c(k)$ was obtained from a bare silicon sample as:
$$c(k) = \frac{I_{silicon}(k)}{R_{silicon}(k)\, I_{ref}(k)}, \tag{9}$$
where $I_{silicon}$ is the reflection spectrum of the silicon in the measuring path when the reference optical path is blocked, and $R_{silicon}$ is the reflectivity of silicon.
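The calibration step amounts to a pointwise division of spectra sampled on a common wavenumber grid. The sketch below assumes that, for a bare silicon sample, the sample-arm intensity equals $c(k)\,R_{silicon}(k)\,I_{ref}(k)$; the function name and the three-point spectra are made up purely for illustration.

```python
def calibration_coefficient(i_silicon, i_ref, r_silicon):
    """Pointwise c(k) = I_silicon(k) / (R_silicon(k) * I_ref(k))."""
    if not (len(i_silicon) == len(i_ref) == len(r_silicon)):
        raise ValueError("all spectra must share one wavenumber grid")
    return [s / (r * ref) for s, ref, r in zip(i_silicon, i_ref, r_silicon)]


# Illustrative (made-up) spectra on a three-point wavenumber grid.
i_ref = [0.8, 1.0, 0.9]            # reference-arm spectrum
r_silicon = [0.40, 0.36, 0.34]     # reflectivity of bare silicon
c_true = [0.95, 1.00, 1.05]        # coefficient used to build the test data
i_silicon = [c * r * ref for c, r, ref in zip(c_true, r_silicon, i_ref)]

c_est = calibration_coefficient(i_silicon, i_ref, r_silicon)
```

In practice the spectrometer and reflectivity data would first need to be interpolated onto the same wavenumber grid, which this sketch leaves out.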

Thickness Measurement
A stepped film of SiO2 deposited on a Si substrate (from Ocean Optics) was measured to verify the frequency-domain fitting method, and there were five regions with different thicknesses numbered 1-5 on the sample (Figure 7a). Table 3 lists the calibrated thickness values, the measured average thickness values, and the standard deviations obtained by repeated measurements at pixel (640, 512) of regions 1-5 using the proposed method. The errors between the mean values of the measurement and the calibration values were below 1.9%, and the relative standard deviations were below 1.7%. Thus, the accuracy and reliability of the proposed method for measuring film thickness have been demonstrated.
Taking the measurement process at area 3 (where the calibrated film thickness was 298.65 nm) as an example, Figure 7a shows the VSI signal collected in this area. It is difficult to separate the peaks for the upper and lower surfaces using the CPD algorithms in the time domain. We collected 283 points in the time domain, and the frequency-domain resolution was improved through zero padding. In Figure 7b, the solid red line is the Fourier-transform-normalized amplitude of the VSI signal obtained from the experiment, and the wavenumber range was 1.33-2.5 µm⁻¹ (400-750 nm). The black dashed line is the theoretical Fourier-transform-normalized amplitude, and the coefficient of correlation between the measured and theoretical values was 0.9970.
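The effect of zero padding on the sampling of the amplitude curve can be estimated with simple arithmetic. The sketch below assumes a 30 nm scanning step (the value used in the simulations; the experimental step is not stated here) and a hypothetical padded length of 4096 points. It relies on the fact that one scanning step changes the optical path difference by twice the step, so a fringe frequency $f$ along the scan corresponds to wavenumber $k = f/2$.

```python
def wavenumber_spacing(n_fft, step_um):
    """Spacing in k (1/um) between FFT bins of a scan padded to n_fft points."""
    return 1.0 / (2.0 * n_fft * step_um)


def bins_in_band(n_fft, step_um, k_min=1.33, k_max=2.5):
    """Number of non-negative-frequency bins with wavenumber in [k_min, k_max]."""
    dk = wavenumber_spacing(n_fft, step_um)
    nyquist_bin = n_fft // 2
    return sum(1 for i in range(nyquist_bin + 1) if k_min <= i * dk <= k_max)


STEP_UM = 0.03                           # assumed scanning step
raw = bins_in_band(283, STEP_UM)         # 283 collected points, no padding
padded = bins_in_band(4096, STEP_UM)     # hypothetical zero-padded length
```

Under these assumptions, the unpadded transform places 20 bins in the 1.33-2.5 µm⁻¹ band, and the padded one places 288. Zero padding interpolates the amplitude curve on a finer grid; it does not add independent information beyond what the 283 samples contain.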

We also measured the thickness values of thicker films (from UniversityWafer, Inc., Boston, MA, USA). Table 4 lists the results of repeated measurements at pixel (640, 512) obtained by the normalized amplitude fitting and the CPD algorithm. Because only the refractive index at the center wavelength of the light source was used in the CPD algorithm whereas normalized amplitude fitting used the information of the entire waveband, the thickness value obtained by the proposed method was more accurate.
When the film thickness was less than 100 nm, the signal features were very weak, and the measurement was sensitive to the calibration coefficient c(k) in the signal model. Table 5 lists the results of five measurements at pixel (640, 512) for thinner films (from Filmetrics, Inc., San Diego, CA, USA) with and without calibration. A 5 × 5-pixel averaging filter was applied to each image to reduce the impact of noise as much as possible. In future work, the results will be improved by isolating environmental disturbances and using a more accurate scanning step during measurement.

Surface Topography Reconstruction
For the stepped film, at the junctions between areas of different thicknesses, the film was not as uniform as in other areas. There was a step-height area on the upper surface, and the lower surface was a silicon plane. Thus, we proceeded to reconstruct the upper and lower surfaces of the region of interest by the time-domain fitting method. The cross-sectional morphologies of the junctions of different thicknesses shown in the red squares in Figure 8a were spliced together in Figure 8b, and more details of the junction between areas 1 and 2 (the red line in Figure 8a) are shown in Figure 8c. Because of the high lateral resolution of the full-field-of-view measurement, we can observe the transition of the surface topography at the junction. The range of the film thickness was approximately 400-500 nm, and the maximum and minimum thicknesses of the transition zone were 485.8 and 419.1 nm, respectively. As shown in Figure 8d, pixels in lines 310-1000 were selected for 3D surface topography reconstruction. The lower surface of the reconstructed film was almost a plane, which supported the accuracy of the height obtained by the time-domain fitting.
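The reconstruction of the lower surface and a simple flatness check can be sketched per pixel. The height and thickness maps below are made-up values in nanometres chosen only to illustrate the arithmetic (the thickness extremes reuse the 485.8 and 419.1 nm quoted above); the function names are hypothetical, and peak-to-valley is just one possible flatness measure.

```python
def lower_surface(upper, thickness):
    """Lower-surface height map: upper-surface height minus film thickness."""
    return [[h - d for h, d in zip(h_row, d_row)]
            for h_row, d_row in zip(upper, thickness)]


def peak_to_valley(surface):
    """Difference between the highest and lowest points of a height map."""
    values = [v for row in surface for v in row]
    return max(values) - min(values)


# Made-up one-row maps (nm) across a thickness transition.
thickness = [[485.8, 452.0, 419.1]]
upper = [[485.8, 452.5, 419.0]]

lower = lower_surface(upper, thickness)
flatness = peak_to_valley(lower)
```

A small peak-to-valley value of the lower surface relative to the step height on the upper surface is what would be expected when the substrate is a plane and both fitted quantities are consistent.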

Conclusions
In this study, we have proposed a film measurement method based on time- and frequency-domain fitting of a VSI model. We exploited the fact that the signal shape of the Fourier-transform-normalized amplitude in the frequency domain is determined only by the film thickness. Thus, the film thickness was obtained through nonlinear fitting, and the fitting of the time-domain signal was used to reconstruct the upper surface of the film. Standard SiO2 films of different thicknesses were measured to verify the accuracy and reliability of this measurement method, and the surface topographies of the junctions between film steps with different thicknesses were measured, thereby enabling the 3D topographies of the upper and lower surfaces to be reconstructed.
