Efficient Shape Estimation of Transparent Microdefects with Manifold Learning and Regression on a Set of Saturated Images

This paper provides a convenient method to determine whether a suspected microdefect of a polarizer is real. Abstract: In the industry of polymer film products such as polarizers, measuring the three-dimensional (3D) contour of transparent microdefects, the most common defects, can crucially affect what further treatment should be taken. In this paper, we propose an efficient method for estimating the 3D shape of defects based on regression, converting the problem of direct measurement into an estimation problem using two-dimensional imaging. The basic idea involves acquiring structured-light saturated imaging data on transparent microdefects; integrating confocal microscopy measurement data to create a labeled data set, on which dimensionality reduction is performed; using support vector regression in a low-dimensional, small-sample space to establish the relationship between the saturated image and the defects' 3D attributes; and predicting the shape of new defect samples by applying the learned relationship to their saturated images. In the discriminant subspace, the manifold of saturated images clearly shows the changing attributes of the defects' 3D shape, such as depth and width. The experimental results show that the mean relative error (MRE) of the defect depth is 3.64% and the MRE of the defect width is 1.96%. The estimation time on the Matlab platform is less than 0.01 s. Compared with precision measuring instruments such as confocal microscopes, our estimation method greatly improves the efficiency of quality control and meets the accuracy requirement of automated defect identification. It is therefore suitable for complete inspection of products.


Introduction
Polarizers are core components of thin film transistor-liquid crystal display (TFT-LCD) panels widely used in display screens for products such as computers, mobile phones, digital cameras, and televisions. Typical polarizers are approximately 0.3 mm thick and composed of six transparent polymer films. During production and transportation, polarizers are prone to aesthetic defects such as impurities, scratches, stains, bubbles, dents, and residual glue. These defects may exist in any layer of a film, degrading the quality of the liquid crystal panels or even causing them to fail. Therefore, it is necessary to inspect each polarizer for aesthetic defects before it is attached to a screen panel.
With the increasing application of machine vision and artificial intelligence in factory automation, numerous engineering-based practices for TFT-LCD industry and film defect detection have emerged [1].
Lee et al. [2] reported a three-phase framework for embedding data mining and machine learning techniques in the TFT-LCD manufacturing process. Kuo et al. [3] used a linear-array charge-coupled device camera to detect spot and line defects on polarizers. Kuo et al. [4] reported a neural network method to recognize six kinds of microdefects in color filters. Yen et al. [5] proposed a cost-effective optical detection system to detect tiny bump defects on polarizers; their system inspects an area of 36 mm × 27 mm within 0.3 s. Mei et al. [6] proposed an unsupervised learning-based feature-level fusion approach to detect mura defects in TFT-LCD panels. In our previous work, a structured backlight was used to enhance the imaging contrast, thus facilitating detection of polarizer defects [7].
Despite all these efforts, human visual inspection nonetheless remains the primary method in polarizer production. Besides the detection difficulty and the performance and efficiency requirements of automated inspection, another important reason is that some detected microdefects are not real defects: these minor defects disappear or recover slowly after the polarizer is attached to glass. Even human vision may fail to distinguish real defects from false ones, which leads to a high false-positive rate. A subset of these defective samples is randomly sampled and sent for further inspection with microscopes such as laser confocal microscopes. From the three-dimensional (3D) size information, we can empirically tell whether a microdefect is real. Obtaining the 3D contour of microdefects efficiently is thus a crucial task.
To this end, researchers have come up with solutions. Felice et al. [8] used ultrasonic-array total focusing technology to measure the depth of surface cracks (<1 mm). Hu et al. [9] reported stereoscopic fluorescence profilometry to measure the thickness and morphology of transparent films. More recently, Fu et al. [10] proposed a new method integrating robot arms and a confocal microscope to measure the surface roughness of objects. For high-precision applications, scanning electron microscopy can detect nanoscale defects [11]. Zhu et al. [12] used circle-structured light to measure and inspect inner surfaces. However, the measurement range of these methods is extremely limited, and the exact position of the defects must be known in advance. This is inefficient and unsuitable for industrial automated inspection. As far as we know, few studies have addressed the 3D estimation of polarizer microdefects while taking accuracy and efficiency into account simultaneously.
In our previous studies [7,13-15], we focused on an imaging mechanism for transparent microdefects, such as dents and bumps, and proposed a so-called saturated imaging method to obtain high-contrast defect images. Through simulation and experimentation, we found that the saturated image of a defect varies regularly with its stereo shape. Inspired by research on recovering object shape from image shading [16] and on estimating astronomical object distances from brightness [17], we claimed that a defect's 3D shape and its saturated image are related. By finding this relationship through supervised learning, the defect's 3D attributes may be estimable.
On the basis of the previous discussion, we developed a method based on structured-light saturation imaging and regression on a shape manifold to determine the shape of internal defects of polarizers. The basic idea is first to project the samples of the original saturated image space onto a low-dimensional manifold subspace, from which the varying shape characteristics can be extracted through embedding analysis. Subsequently, a support vector regression model is established in this space to perform supervised learning with labeled samples. The new test samples can be projected onto the manifold subspace and then input into the trained regression model to estimate the defect shape. The results demonstrate the effectiveness of the proposed method for a small number of collected samples.
To our knowledge, this is the first 3D estimation method of microdefects applicable to complete inspection of transparent film products. The content of this article is organized as follows. Section 2 introduces the optical model and saturated imaging system. In Section 3, the proposed framework is described in detail. In Section 4, we explain the results and follow with analysis and discussion. The conclusion is drawn in Section 5.

Defect Optical Model and Saturated Imaging System
The aim of this study is to detect internal transparent microdefects in polarizers. The gray level of convex or concave defects has a consistent relationship with the image distance [15]. One of our previous works [13] indicates that the transparent microdefects of polarizers can be approximated by a microscale planoconvex lens model, as shown in Figure 1a. Using structured-light illumination [7], we can effectively improve the imaging contrast at the position of microdefects and increase the detection accuracy.

The optical simulation software TracePro is used to model and simulate the dents and bumps of a polarizer, as shown in Figure 1b. Further simulations demonstrated that the saturated imaging brightness varies with the defect depth. It is therefore feasible to infer the depth of the defects from information such as the intensity of the images. An example defect is shown in Figure 2.
Based on the aforementioned transparent defect model, we established a corresponding backlit experimental environment to obtain high-quality grayscale images of target defects. The platform is composed primarily of a backlight, a polarizer (as a lens), and a camera (Figure 3). In our previous study [15], the saturated imaging technique was innovatively applied to the detection of transparent microdefects, which could be accurately detected in black stripes by adjusting the light source intensity, exposure time, and camera gain.
We used the KEYENCE VK-X250K confocal microscope to measure the shape of the defects. The measurement time for a defect area of 1.1 mm × 1.1 mm is nearly 60 s. Therefore, this type of measuring instrument cannot meet efficiency requirements for complete inspection of products.
Using saturated imaging and confocal microscopy measurement, we obtained sample data. Some examples and corresponding information are shown in Table 1. Grey level is plotted against defect depth in Figure 4. The experimental results are consistent with the TracePro simulation, indicating that it is appropriate to infer depth or other attributes of defect shapes from the intensity of the image without changing other conditions. In Section 3 we discuss the proposed estimation methods.

Proposed Method for Defect Shape Estimation
This study proposes a new method for estimating defects' 3D contours, as shown in Figure 5. First, embedding analysis is used in the feature space to find a low-dimensional manifold space around the defect shape; then, a regression model is established in the manifold space, and supervised learning is performed. New test samples are projected onto the manifold space, and their defect shape attributes are estimated using the trained regression model.

Preprocessing and Feature Extraction
Saturated images and confocal microscopy images are inevitably affected by noise, so image preprocessing is necessary. The images are mainly contaminated by additive noise, including background and Gaussian noise, as shown in Figure 6. Other abrupt changes in regions are caused mainly by manual marking or dust on the surface. Background can be corrected by subtracting a fitted plane, and Gaussian noise can be reduced by smoothing. As shown in Figure 6, preprocessing effectively filters out most of the noise.
Considering that the useful information in the saturated image is the change of intensity along various directions, this study chose the histogram of gradients (HoG) method for feature extraction. The essence of HoG is the statistical information of the gradient directions of the image. HoG operates on local grid units of the image; thus, optical deformation can be kept reasonably invariant. The resolution of the defective region used in our experiments is 48 × 58. To avoid missing details, we set 4 × 4 cells with step size 2.
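The preprocessing and feature-extraction steps above can be sketched as follows. This is an illustrative NumPy sketch, not the exact implementation used in the experiments: the plane is fitted by least squares and subtracted (the background correction step), and a minimal HoG-style descriptor is built with 4 × 4 cells at step 2 as in the text; the orientation bin count of 9 is our assumption.

```python
import numpy as np

def remove_background(img):
    """Background correction sketch: fit a plane z = a*x + b*y + c to the
    image by least squares and subtract it."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    A = np.column_stack([xx.ravel(), yy.ravel(), np.ones(h * w)])
    coef, *_ = np.linalg.lstsq(A, img.ravel().astype(float), rcond=None)
    return img - (A @ coef).reshape(h, w)

def hog_features(img, cell=4, step=2, bins=9):
    """Minimal HoG sketch: per-cell orientation histograms weighted by
    gradient magnitude, over a grid of cells sampled with the given step."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, step):
        for c in range(0, w - cell + 1, step):
            hist, _ = np.histogram(ang[r:r + cell, c:c + cell],
                                   bins=bins, range=(0, np.pi),
                                   weights=mag[r:r + cell, c:c + cell])
            feats.append(hist)
    v = np.concatenate(feats)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# A 48 x 58 defect crop, as in the experiments (random stand-in data)
crop = np.random.rand(48, 58)
features = hog_features(remove_background(crop))
```

With these settings, the 48 × 58 crop yields a 23 × 28 grid of cells, so the descriptor length is 23 × 28 × 9 = 5796 before any dimensionality reduction.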

Dimensionality Reduction
The defect shape attributes, such as width and depth, of the polarizers are crucial. We expect to obtain these attribute data from the saturated image. The dimensions of the image itself and the feature vector are very high. However, our sample size is extremely limited because of the difficulty of acquiring samples. In the original high-dimensional space, much redundant information and noise are included. The machine learning model is prone to over-fitting, leading to low generalization ability. An effective approach to alleviate this dimensionality problem is to use low-dimensional embedding in high-dimensional space. The variables in the low-dimensional space are likely to be the information desired by the study, such as the refractive index of the polarizers, defect width, and defect depth.
Suppose that the feature space of the defect area is represented by a set of image features X = {x_i : x_i ∈ R^D, i = 1, …, n}, where D is the feature dimension and n is the number of samples. The corresponding set of labels for the true defect shape attribute is denoted by L = {l_i : l_i ∈ R, i = 1, …, n}. The goal of dimensionality reduction is to learn a one-to-one projection in the embedded subspace by projecting X onto a low-dimensional manifold subspace, say Y = {y_i : y_i ∈ R^d, i = 1, …, n}, where d is the new dimension after dimensionality reduction, and d ≪ D. This projection model can be formulated as Y = P(X, L), where P(·) represents a linear or nonlinear projection function. The projection function can be found through unsupervised or supervised learning methods [18].
Because of the small sample size, manifold embedding techniques including principal component analysis (PCA), isometric mapping (Isomap), locally linear embedding (LLE), and Laplacian eigenmaps (LE) are feasible dimensionality reduction approaches. PCA [19] is a linear method that maps data along the directions of maximum variance in the target space; although linear, it is compared with the nonlinear methods because it is so frequently used. Isomap [19] is a nonlinear extension of multidimensional scaling that keeps the distances between samples in the low-dimensional space as close as possible to their geodesic distances in the original space. LLE [20] is also a nonlinear method that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs, which helps the reduced data retain its original structure more effectively. Similar to LLE, the basic concept of LE [19] is to use spectral techniques to obtain an embedding in which adjacent points remain as close as possible in the lower-dimensional space.
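As a concrete illustration of the projection Y = P(X, L), the linear case (PCA) can be sketched in a few lines; the nonlinear methods (Isomap, LLE, LE) replace this projection with a neighborhood-based embedding. The sample count of 113 matches the paper, while the feature dimension here is an arbitrary stand-in.

```python
import numpy as np

def pca_reduce(X, d):
    """PCA sketch: project the n x D feature matrix X onto its top-d
    principal components (directions of maximum variance)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of the centered data; rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                         # n x d embedding Y

rng = np.random.default_rng(0)
X = rng.standard_normal((113, 300))              # 113 samples, D = 300 (arbitrary)
Y = pca_reduce(X, 20)                            # reduce to d = 20
```

Note that the embedding of the centered data has zero mean in every reduced dimension, which is convenient for the regression step that follows.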
According to many researchers, out-of-sample estimation can also be employed for projection onto a low-dimensional space; please refer to [21] for details.

Regression
To estimate the defect shape characteristics (e.g., depth) of new samples, we can find a regression function in the manifold space to represent the relationship between the embedded features y_i and the depth l_i. A typical regression method is a polynomial model. The linear model is too simple, and third-order or higher-order models are prone to overfitting. Therefore, the quadratic model is a feasible choice; the estimated depth is l̂ = w_0 + w_1^T y + w_2^T y^2, where w_0, w_1, and w_2 are respectively the offset and the first- and second-order coefficient vectors. They can be obtained by minimizing the distance between the true depth and the estimated depth through least squares, that is, min over w_0, w_1, w_2 of Σ_{i=1}^n (l_i − l̂_i)^2. However, methods such as least squares minimize an empirical risk function that is sensitive to noise.
Particularly with the small sample set in this study, these methods can easily lead to overfitting and poor generalization capacity. Support vector regression (SVR) [22] is preferable for a more robust regression of defect shape. SVR is based on statistical learning theory, combining the Vapnik-Chervonenkis dimension with structural risk minimization. Given a training sample set {(y_i, l_i), i = 1, …, n}, SVR finds a function f(·) that makes f(Y) and L as close as possible. Based on an ε-insensitive loss function, SVR tolerates a deviation of at most ε between f(y_i) and l_i. Thus, SVR is less sensitive to outliers than other methods and is suitable for our small sample set. To address the underfitting problem of a small data set, we use a nonlinear kernel to increase the SVR complexity and use more feature dimensions as well. Too many dimensions, on the contrary, may lead to overfitting. This is discussed in Section 4.2.
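The ε-insensitive loss that gives SVR its outlier robustness can be written as a one-liner; deviations up to ε cost nothing, larger ones are penalized only linearly (a minimal sketch; ε = 1 matches the value selected in Section 4).

```python
import numpy as np

def eps_insensitive_loss(pred, target, eps=1.0):
    """Epsilon-insensitive loss used by SVR: max(|pred - target| - eps, 0).
    Deviations within the eps tube are ignored; larger deviations grow
    linearly, unlike the quadratic penalty of least squares."""
    return np.maximum(np.abs(pred - target) - eps, 0.0)

# deviations of 1.0 and 2.5 with eps = 1 give losses of 0.0 and 1.5
loss = eps_insensitive_loss(np.array([1.0, 3.5]), np.array([2.0, 1.0]), eps=1.0)
```

The linear (rather than squared) growth outside the tube is what keeps a few noisy labels from dominating the fit on a small sample set.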

Results and Discussion
In this study, the saturated imaging method was used to image the defect samples in the black stripes of a binary structured backlight, and the saturated images were cropped to obtain defective region samples of size 48 × 58. An example is shown in Figure 7a,b. All these samples form the unlabeled sample set X = {x_i, i = 1, …, n}. Subsequently, a laser confocal microscope was used to measure the defect shape attributes. The main settings were an objective lens magnification of 10×, horizontal and vertical measurement ranges of 0 to 1417 µm and 0 to 1062 µm, respectively, a depth measurement repeatability of 0.1 µm, and a width measurement repeatability of 0.2 µm. The measured values were used as ground truth to analyze our estimation method. The defect shape attributes were labeled manually. Without loss of generality, only depth and width were considered in our experiment. Figure 7c illustrates the manual labeling used to obtain the label set L = {l_i, i = 1, …, n}. Finally, 113 samples were obtained (the data set is available at https://doi.org/10.4121/uuid:9b562ad4-6f92-48ba-8883-95921d3b0efb). The minimum and maximum depths of defects in these samples are 2.0 µm and 18.5 µm, respectively; the minimum and maximum widths are 250 µm and 500 µm, respectively.

Manifold Visualization
We applied dimensionality reduction to the 113 samples by using the PCA, Isomap, LLE, and LE methods. Figures 8 and 9, respectively, depict the two-dimensional (2D) and 3D visualizations; each data point represents a saturated image, and the marker values for depth and width are indicated by color from dark blue (small value) to yellow (large value). The positions of the data points are consistent between the depth and width plots because the unsupervised dimensionality reduction methods do not use the labels; only the values of the data points differ.
Regardless of the dimensionality of the visualization, the results of each algorithm exhibit a certain trend and structure for depth and width values despite the small dataset. This clear discriminant pattern is favorable for the subsequent regression. We find that after using 40 features, as shown in Figure 10, the regression performance is hindered by the addition of more predictors. Therefore, dimensionality reduction is necessary on the saturated image feature space.

Estimation of Defect Depth and Width
In this subsection, we explain the application of the support vector regression model described in Section 3.3 for estimating the depth and width of the defect, and we analyze the error.
To evaluate the performance of a support vector regression model, leave-one-out cross-validation [23] is applicable in situations where the sample set is small: each of the n samples is held out in turn, and the model is trained on the remaining n − 1 samples. The input features of the regression model are the points in the low-dimensional manifold space Y = {y_i : y_i ∈ R^d, i = 1, …, n} resulting from reduction of the sample space X = {x_i : x_i ∈ R^D, i = 1, …, n} by using PCA, Isomap, LLE, or LE (see Figure 5 for the flowchart). The choice of the insensitivity parameter ε in the SVR model greatly influences performance. If ε is too small, there are too many support vectors and the model can overfit; if ε is too large, too few support vectors can lead to underfitting. According to the results in Figure 11, ε = 1 is a suitable choice for all four dimensionality reduction methods: they all obtain a lower MAE when the first 20 dimensional features are input. We therefore fixed this parameter in later experiments.
Considering the depth and width of the defect respectively, as shown in Figure 10, we compared the MAE and SAE of the SVR model with different dimensionality reduction methods and different numbers of features. These comparisons almost uniformly reflect the following rules. For both defect depth (Figure 10a) and defect width (Figure 10c), all MAEs of the four combined methods (i.e., PCA + SVR, Isomap + SVR, LLE + SVR, and LE + SVR) show a downward trend as the number of features grows from 2 to 20, because the more features that are used, the richer the discriminant information is. Conversely, beyond 60 features, excessive redundancy of information hinders the regression performance, and the MAEs of the four combined methods increase slowly. Therefore, it is ideal to select 20 to 60 dimensional features. As shown in Figure 10b,d, by contrast, the SAEs of the four methods present a gradual downward trend; when the feature dimensions exceed 20, the SAEs of the four methods are relatively small and stable.
Isomap + SVR has a comparative advantage over the other three methods when estimating defect depth. Figure 10a shows that the MAE of the defect depth can be as small as 1.5 µm. When estimating the defect width, Isomap + SVR is slightly more effective than LLE + SVR and LE + SVR, retains a substantial advantage over PCA + SVR, and achieves an MAE as small as 12 µm. The relative error is discussed in the following paragraphs.
Figure 12 illustrates the curves of cumulative accuracy score versus error level for estimating defect depth and width. The cumulative accuracy score is defined as n_{err < el} / n, where n is the total number of test samples and n_{err < el} indicates the number of test samples whose absolute error err is less than the error level el. The curve reveals the cumulative ratio of the estimated results below a certain error level; accordingly, the closer the curve is to the upper left corner of the axes, the more accurate the prediction.
Figure 12a,b compares the performance of dimensionality reduction to 5, 20, 40, 70, and 110 dimensions when using the Isomap + SVR method. The performance is greatly improved from 5 to 20 dimensions; starting from 70 dimensions, however, it gradually degrades. The results indicate that the performance at 20 dimensions and at 40 dimensions is similar. Thus, the range from 20 to 40 is preferable, which is in agreement with the MAE versus dimension curve in Figure 10.
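The leave-one-out evaluation loop can be sketched generically; here a simple ridge regressor stands in for the SVR model purely for illustration (the 113 samples and 20-dimensional embedding match the experiments, but the data and regressor are stand-ins).

```python
import numpy as np

def loocv_mae(X, y, fit, predict):
    """Leave-one-out cross-validation sketch: each sample is held out once,
    the model is fit on the remaining n - 1 samples, and the mean absolute
    error over all held-out predictions is returned."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        model = fit(X[mask], y[mask])
        errs.append(abs(predict(model, X[i:i + 1])[0] - y[i]))
    return float(np.mean(errs))

# Ridge regression as an illustrative stand-in for the SVR model
def ridge_fit(X, y, lam=1e-2):
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ y)

def ridge_predict(w, X):
    return X @ w

rng = np.random.default_rng(0)
X = rng.standard_normal((113, 20))   # 113 samples in a 20-dim embedding
y = X @ rng.standard_normal(20) + 0.1 * rng.standard_normal(113)
mae = loocv_mae(X, y, ridge_fit, ridge_predict)
```

Because every sample serves as a test point exactly once, this scheme makes full use of a small labeled set, at the cost of training the model n times.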
Figure 12a,b respectively shows that the depth estimation error is less than 1 µm (approximately 5.4% relative to the maximum depth) in 85% of the samples when the dimensions are reduced to 40, and that the width estimation error is less than 30 µm (approximately 5.5% relative to the maximum width) in 80% of the samples. Figure 12c,d displays the performance of four dimensionality reduction + SVR methods and SVR alone when the data are reduced to 20 dimensions. If dimensionality reduction is not executed, the number of support vectors obtained through SVR training is approximately 80. The number of support vectors is approximately 60 when we reduce the sample dimensions to 20. The performance of SVR alone without dimensionality reduction is poorer than with dimensionality reduction. This is because more support vectors probably lead to overfitting. For both depth estimation and width estimation, Isomap achieves the optimal results among the four dimensionality reduction methods. PCA is a linear dimensionality reduction method that cannot represent the internal structure of data. Isomap is a typical nonlinear dimensionality reduction method, and it could more effectively learn the intrinsic variables, such as defect depth and width, in a set of saturated images. Figure 12e,f compares the performance of the four dimensionality reduction methods plus quadratic regression (QR) in the case of data dimensionality reduction to 20 dimensions. All combination methods in Figure 12c,d are clearly more effective than those in Figure 12e,f. This is a result of the small sample size. The QR model is extremely sensitive to outliers and noise, which leads to serious overfitting of the model and poor generalization ability.
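The cumulative accuracy score plotted in these curves reduces to a one-line computation (a sketch; the error values below are hypothetical):

```python
import numpy as np

def cumulative_accuracy(errors, level):
    """Cumulative accuracy score: fraction of test samples whose absolute
    estimation error is below the given error level."""
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors < level))

# e.g. depth errors in micrometres for five hypothetical test samples
score = cumulative_accuracy([0.3, 0.8, 1.2, 0.5, 2.0], level=1.0)  # -> 0.6
```

Sweeping `level` over a range of error values and plotting the score against it reproduces a curve of the kind shown in Figure 12.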
Considering dimensionality reduction to 20 dimensions only, the MREs of defect depth and width are calculated separately; the results are shown in Table 2. All relative errors have a small variance, indicating that the relative and absolute error distributions are concentrated. This is consistent with the curves lying near the upper left corner of the axes in Figure 12a-d. The MRE of Isomap + SVR is approximately 3.64% when estimating the depth of defects and 1.96% when estimating the width. This accuracy can greatly help polarizer and LCD panel manufacturers identify defects. Regardless of the dimensionality reduction technique and regression method chosen, there is always considerable estimation error, for the following two reasons. First, the samples are labeled manually, so the labeled defect shape attributes are somewhat subjective. As shown in Figure 7, determining the defect depth is more difficult than determining the width, which is also why the MRE of the depth estimation is generally larger than that of the width estimation, as shown in Table 2. Second, defective samples are difficult to collect, so a few outliers have a significant influence, which is a typical small-sample problem. Therefore, the estimation performance has room for improvement.
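The MRE and its variance follow the standard definition; a brief Python sketch with made-up depth values (not the paper's data):

```python
import numpy as np

def mean_relative_error(y_true, y_pred):
    """Mean relative error (MRE) as a percentage, with its variance."""
    y_true = np.asarray(y_true, dtype=float)
    rel = 100.0 * np.abs(np.asarray(y_pred) - y_true) / y_true
    return rel.mean(), rel.var()

# Example: hypothetical true vs. estimated defect depths (µm).
mre, var = mean_relative_error([10.0, 20.0, 40.0], [10.5, 19.0, 41.0])
print(round(mre, 2))  # → 4.17
```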

Discussion
The proposed method was implemented in Matlab 2017b (i7-4770, 8 GB RAM, Windows 7). The estimation time for defect depth and width was less than 0.01 s; this speed can be further improved by implementing the method in C or C++. Compared with the measurement time of the laser confocal microscope, approximately 60 s, the proposed method is far more efficient while retaining accuracy that meets manufacturers' requirements for identifying defects. It therefore trades a small loss in accuracy for a substantial increase in efficiency, as well as savings in hardware and labor costs.
However, there is still considerable room for improving performance. The most significant limitation is the small set of defective samples. To mitigate it, we will draw on the data augmentation techniques commonly used for deep neural networks, such as adding noise and applying affine transformations. Generative adversarial networks (GANs) are also a promising tool for generating meaningful samples [24].
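A minimal sketch of the augmentation ideas mentioned above, using NumPy and SciPy (the library choice, parameter ranges, and image size are our illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import rotate, shift

def augment(image, rng):
    """Generate simple augmented variants of one defect image:
    additive Gaussian noise, a small rotation, and a small translation."""
    noisy = np.clip(image + rng.normal(0.0, 5.0, image.shape), 0, 255)
    rotated = rotate(image, angle=rng.uniform(-10, 10),
                     reshape=False, mode='nearest')
    shifted = shift(image, shift=rng.integers(-3, 4, size=2), mode='nearest')
    return [noisy, rotated, shifted]

rng = np.random.default_rng(1)
img = rng.random((48, 58)) * 255          # stand-in for a defect image patch
variants = augment(img, rng)
print(len(variants), variants[0].shape)   # → 3 (48, 58)
```

Each variant keeps the original label (depth and width), since small geometric and photometric perturbations do not change the defect's 3D shape.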
Regarding defect imaging with the saturated imaging method, the contrast is very sensitive to the degree of saturation, which currently requires careful manual selection. Developing an easily deployed method that automatically determines the optimal saturation degree is an important piece of future work.
For industrial application, several problems need further exploration. For example, our training samples lie within a 48 × 58 range centered on the defect position, so defect detection must be performed before the 3D shape estimation. Defect detection, in turn, is strongly affected by the contrast of defect areas, and according to our previous work [15] the saturation level must be carefully selected to obtain the optimal contrast. Therefore, we should also clarify the latent relation between the saturation level and the contrast.

Conclusions
This study developed a framework based on manifold learning and pattern regression to estimate the shape characteristics, such as depth and width, of internal transparent microdefects in polarizers. Low-dimensional visualization shows that dimensionality reduction can effectively extract the trends and structure of defect shape changes, and the estimation performance demonstrates the effectiveness of the SVR model in a low-dimensional space. The proposed method trades some accuracy for a substantial increase in efficiency and savings in production costs. It assists with defect detection in comprehensive inspection during polarizer production, improves the qualified rate of products, and reduces the costs of materials and labor. The method can also be extended to other kinds of transparent films. In future work, we will develop a high-definition saturated imaging technique for more reliable feature estimation. Besides data augmentation, we are also interested in acquiring more defective samples and using neural networks for a possible improvement in estimation performance.