Feature Extraction with Discrete Non-Separable Shearlet Transform and Its Application to Surface Inspection of Continuous Casting Slabs

Featured Application: Continuous casting slabs move slowly on the production line, so surface defect inspection does not demand a very fast feature extraction method. The main inspection difficulty is that the defects appear against complex backgrounds, which some common feature extraction methods cannot handle. DNST (discrete non-separable shearlet transform) is a new multiresolution analysis method with moderate computing speed that can extract image information at multiple scales and in multiple directions. This paper therefore proposes the DNST-GLCM-KSR (discrete non-separable shearlet transform, gray-level co-occurrence matrix, kernel spectral regression) feature extraction method, which suits surface defect inspection against complex backgrounds on production lines of moderate running speed.
Abstract: A new feature extraction technique called DNST-GLCM-KSR (discrete non-separable shearlet transform, gray-level co-occurrence matrix, kernel spectral regression) is presented, based on the direction and texture information of surface defects of continuous casting slabs with complex backgrounds. The discrete non-separable shearlet transform (DNST) is a new multi-scale geometric analysis method that provides excellent localization properties and directional selectivity. The gray-level co-occurrence matrix (GLCM) is a texture feature extraction technique. We combine DNST features with GLCM features to characterize defects of continuous casting slabs. Since the combined feature is high-dimensional and redundant, the kernel spectral regression (KSR) algorithm is used to remove redundancy. The resulting low-dimensional features and the label data are input to a support vector machine (SVM) for classification. The proposed scheme was tested on samples collected from an industrial continuous casting slab production line, including cracks, scales, lighting variations, and slag marks. The test results show that the scheme raises the classification accuracy to 96.37%, which provides a new approach for surface defect recognition of continuous casting slabs.


Introduction
Machine vision-based surface inspection is now widely used to detect and identify surface defects of various industrial products because it is non-contact and works in real time [1]. Such methods capture images of the product under a high-intensity light source and analyze the surface images with image processing and pattern recognition algorithms [2]. For each kind of industrial product, one needs to consider the defect image features of the product itself and then adopt an appropriate recognition method.
In the production of continuous casting slabs, defects often occur because of factors such as raw materials and preprocessing technologies. The defects have a negative impact on the subsequent rolling process, and severe defects can even lead to the scrapping of entire slabs [3,4]. Defect feature extraction plays an important role in defect inspection and is one of the hotspots in research on surface defect recognition algorithms. The most important characteristic of surface defects of continuous casting slabs is their complex background, which makes recognition difficult.
At present, research is more active on strip steel products, which have a simple image background [5,6,7]. The defect recognition of continuous casting slabs with complex backgrounds has received comparatively little attention. Wei SY et al. [3] extracted shape feature values of the image to classify and recognize defects. Yun [8] proposed a surface defect recognition algorithm based on the Gabor wavelet that can detect fine cracks and angular cracks on the surface of slabs by minimizing the cost function of an energy separation criterion between the defect area and the defect-free area. Pan E [9] proposed an engineering-driven rule-based detection (ERD) method based on the formation mechanism of deep longitudinal cracks and transverse cracks on slabs. Xu K et al. [10] decomposed the surface image with a non-sampled wavelet, calculated the scale co-occurrence matrix and the grayscale co-occurrence matrix, and used an AdaBoost classifier to distinguish cracks from water marks, slag marks, scales, and vibration marks. Subsequently, the authors proposed combining the discrete shearlet transform (DST) with the kernel locality preserving projection (KLPP) algorithm to extract surface defect features [2]. Y. Ai combined the Contourlet transform with the KLPP algorithm to extract defect features [1], and Xu K [11] then improved this method by introducing a texture feature. Si Yang [12] improved the local binary pattern and proposed a multi-block local binary pattern (MB-LBP) feature extraction method.
Of the methods above, wavelet-based feature extraction (for example, references [1,2,8,10,11]) is the most effective and most studied technology. Although these methods have achieved some results, the recognition accuracy for surface defects of continuous casting slabs needs further improvement as users' quality requirements become increasingly strict.
The discrete nonseparable shearlet transform (DNST) [13,14] is a new kind of wavelet-based method. It is a compactly supported shearlet transform with excellent localization properties in the spatial domain and excellent directional selectivity. DNST has been successfully applied in compressed sensing magnetic resonance imaging [15,16]. Because defect images of continuous casting slabs exhibit scale and directionality traits, we introduce DNST into the surface defect feature extraction of continuous casting slabs. The gray-level co-occurrence matrix (GLCM) is an effective texture feature extraction method that reflects comprehensive information about the image gray levels in terms of direction, adjacent pixel interval, and gray-level variation [17]. Some defect images of continuous casting slabs also have texture traits, so we introduce GLCM into the feature extraction as well. Since the features extracted by DNST and GLCM are redundant, we use kernel spectral regression (KSR) [18,19] to remove the redundant features. KSR is a manifold learning dimensionality reduction technique that casts the problem of learning an embedding function into a regression framework, which facilitates efficient computation. The proposed feature extraction approach, named discrete nonseparable shearlet transform gray-level co-occurrence matrix kernel spectral regression (DNST-GLCM-KSR), combines the multi-scale and multi-directional features of DNST with the texture features of GLCM and uses KSR to remove redundant features. The DNST-GLCM-KSR approach improves the defect recognition accuracy of continuous casting slabs and achieves better performance than traditional methods.
The novelty of our work lies in introducing DNST into the surface defect recognition of continuous casting slabs with the complex backgrounds, fusing GLCM texture features, and using a suitable dimensionality reduction algorithm KSR, which makes defect recognition easier and more effective. The rest of this paper is organized as follows. In Section 2, the surface defects information of continuous casting slabs is depicted. Section 3 introduces the basic principles of the DNST-GLCM-KSR feature extraction approach. The surface defect recognition algorithm is presented in Section 4. Section 5 describes the experimental results and discussions, followed by conclusions in Section 6.

The Characteristics of Defect Images
The surface temperature of continuous casting slabs is very high during the production process. The temperature can reach about 800~900 °C, so the surface oxidizes and forms a large number of scales of various shapes [1]. The scales seriously interfere with the detection and recognition of defects. At the same time, because of insufficient illumination at the production site and the rough surface of the slabs, the slab surface appears uneven. In addition, the quality of the collected surface images is degraded by splashing cooling water, rolling mill vibrations, and other factors. Figure 1 shows several common surface images of continuous casting slabs, including cracks, scales, lighting variations, and slag marks. The cracks are true defects, while the scales, lighting variations, and slag marks are interference factors that may lead to misclassification. The interference factors are labeled as false defects and are also treated as a type of recognition object. The main task of surface inspection of continuous casting slabs is to recognize crack defects against the complicated background and interference factors.
Appl. Sci. 2019, 9, x FOR PEER REVIEW
Figure 1a is a longitudinal crack defect sample, abbreviated as cracks. The cracks mainly run along the longitudinal direction of the slabs, are curved in shape, and range in length from a few centimeters to dozens of centimeters. The cracks usually have a certain depth, so under intense light they show grayscale values different from those of the surrounding pixels. The occurrence of cracks is mostly related to the high-temperature steel and various mechanical behaviors in the solidification process. A crack is a very serious defect. Figure 1b is a scales sample. The shape of scales is irregular and varies greatly. Some scales are warped, but most are attached to the surface of the slabs and show texture features. Because the scales cover the surface of the slabs, some fine crack defects are hard to identify. Figure 1c shows lighting variations. The bright and dark areas are caused by irradiation from multiple light sources or by changes in illumination intensity. The boundaries between the bright and dark areas appear as straight lines, and the gray values on the two sides are noticeably different. Therefore, these boundaries are sometimes misclassified as cracks. Figure 1d shows slag marks, which are mainly formed by residual slag. Slag marks are also distributed along the longitudinal direction of the slabs with a certain width, and their gray values are lower than those of the surrounding pixels. These images are easily misclassified as crack defects.

Basic Principles of the Proposed Method
The feature extraction method is the core of the defect recognition algorithm. The quality of feature extraction directly affects the results of defect recognition. The proposed feature extraction scheme DNST-GLCM-KSR in this paper utilizes three technologies, including DNST, GLCM, and KSR, which are described in detail as follows.

Discrete Nonseparable Shearlet Transform
The wavelet analysis method has been applied in many fields because of its multi-scale decomposition and fast computation. However, the drawback of the wavelet is its insufficient directional decomposition: it can only decompose an image in the horizontal, vertical, and diagonal directions. To compensate for this deficiency, multi-scale geometric analysis (MGA) methods were proposed. Typical MGA methods include the Ridgelet transform [20], Curvelet transform [21], Contourlet transform [22], Shearlet transform [23], and Bandelet transform [24]. The most widely used are the Contourlet transform and the Shearlet transform, while the other methods are limited by their slow computational speed. The advantage of the Contourlet transform is its fast computation, but its direction representation is limited. The Shearlet transform is slower than the Contourlet transform, but its direction representation is more flexible. Continuous casting slabs run very slowly on the track, at less than 1 m/s, so the defect recognition algorithm is not required to be fast. The discrete nonseparable shearlet transform (DNST) [13,14] is a new kind of shearlet transform.
W.Q. Lim [13] proposed the discrete nonseparable shearlet transform (DNST) based on the discrete frame. DNST is constructed from a 2D nonseparable fan filter, which improves directional selectivity, and a separable compactly supported shearlet generator, which provides excellent localization properties. It is a directional representation system that extends the wavelet frame. DNST shares a key advantage of wavelets, namely a unified treatment of the continuum and digital settings.
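A two-dimensional discrete shearlet transform is usually obtained from a cone-adapted discrete shearlet system defined by scaling functions $\phi_m$ and shearlet functions $\psi_{j,k,m}$ and $\tilde{\psi}_{j,k,m}$ (the latter obtained by swapping the order of the two variables of $\psi_{j,k,m}$). Following [13],
$$\psi_{j,k,m}(x) = 2^{\frac{3}{4}j}\,\psi\left(S_k A_{2^j} x - M_c m\right), \qquad A_{2^j} = \begin{pmatrix} 2^j & 0 \\ 0 & 2^{j/2} \end{pmatrix}, \qquad S_k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix},$$
where $j$ is the scaling parameter, $m$ is the translation parameter, $k$ is the shear (direction) parameter, and $M_c$ is a sampling matrix for the translation. The non-separable generator $\psi^{non}$ is defined as follows:
$$\hat{\psi}^{non}(\xi) = P\left(\frac{\xi_1}{2}, \xi_2\right)\hat{\psi}(\xi),$$
where the trigonometric polynomial $P$ is a 2D fan filter, which improves directional selectivity in the frequency domain at each scale, $\psi$ is the 2D separable shearlet generator, which provides excellent localization properties, and $\hat{\psi}$ is the Fourier transform of $\psi$. The nonseparable shearlets $\psi^{non}_{j,k,m}$ are generated from $\psi^{non}$ by setting
$$\psi^{non}_{j,k,m}(x) = 2^{\frac{3}{4}j}\,\psi^{non}\left(S_k A_{2^j} x - M_{c_j} m\right),$$
where $M_{c_j}$ is the sampling matrix at scale $j$. The same procedure can be applied to compute the shearlet coefficients associated with $\tilde{A}_{2^j}$ and $\tilde{S}_k$. After faithfully digitizing $\psi^{non}_{j,k,m}$, the digital formulation of the discrete nonseparable shearlet transform (DNST) of a digital image $f_J$ is given by
$$\mathrm{DNST}_{j,k}(f_J)(m) = \left(f_J * \overline{\psi^{d}_{j,k}}\right)\left(2^{J} A_{2^j}^{-1} M_{c_j} m\right), \qquad \psi^{d}_{j,k} = S^{d}_{k 2^{-j/2}}\left(p_j * W_j\right),$$
where $S^{d}$ is the discrete shear operator, $p_j$ are the Fourier coefficients of $P$, and $W_j = g_{J-j} \otimes h_{J-j/2}$, with $g_{J-j}$ and $h_{J-j/2}$ being 1D filters. Please refer to reference [13] for details.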
The frequency tiling induced by such a discrete shearlet system is shown in Figure 2a, where $\hat{\phi}$, $\hat{\psi}^{non}$, and $\hat{\tilde{\psi}}^{non}$ are associated with the square in the center, the horizontal cone (white), and the vertical cone (yellow), respectively. Each scale corresponds to a ring of tiles, and each shear is associated with a pair of tiles in a certain direction within the ring. With a proper choice of the parameters associated with the translation, the DNST is obtained as a series of filtering operations. Each shearlet function has two symmetric tiles. The magnitude of the shearlet filter is shown in Figure 2b,c. Figure 2b shows the frequency tiles of a DNST filter corresponding to the first scale, and Figure 2c shows those of a DNST filter corresponding to the second scale.
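To make the roles of the scale and shear parameters concrete, the following minimal sketch builds the parabolic scaling matrix and the shear matrix and applies their product to a point. It assumes the standard definitions from the shearlet literature, diag(2^j, 2^(j/2)) for the scaling and [[1, k], [0, 1]] for the shear; the function names are hypothetical, and the sketch does not implement the transform itself.

```python
def parabolic_scaling(j):
    """A_{2^j} = diag(2^j, 2^(j/2)): anisotropic scaling (assumed standard form)."""
    return ((2.0 ** j, 0.0), (0.0, 2.0 ** (j / 2)))


def shear(k):
    """S_k = [[1, k], [0, 1]]: the shear that selects a direction."""
    return ((1.0, float(k)), (0.0, 1.0))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][i] * b[i][c] for i in range(2)) for c in range(2))
        for r in range(2)
    )


def apply(m, x):
    """Apply a 2x2 matrix to a 2D point."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])


# The argument of the generator at scale j = 2 and shear k = 1 is S_k A_{2^j} x.
M = matmul2(shear(1), parabolic_scaling(2))
point = apply(M, (1.0, 1.0))
```

The first coordinate is stretched by 2^j and the second only by 2^(j/2), which is what makes the supports elongated at fine scales, while k tilts them to a chosen direction.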

Gray-Level Co-Occurrence Matrix
The gray-level co-occurrence matrix (GLCM) is an effective texture feature extraction approach. GLCM considers not only the distribution of intensities but also the relative positions of pixels in an image [25]. Let $Q$ be an operator that defines the position of two pixels relative to each other, and consider an image $I_f$ with a finite number of possible intensity levels. Let $G$ be a matrix whose $(v, h)$-th element is the number of times that pixel pairs with intensities $z_v$ and $z_h$ occur in $I_f$ in the position specified by $Q$. A matrix formed in this manner is referred to as a gray-level co-occurrence matrix. Generally, the GLCM itself is not used directly as a texture feature; instead, it is summarized by descriptors such as energy, contrast, entropy, homogeneity, and correlation.
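The construction above can be sketched in a few lines of Python. This is an illustration, not the implementation used in the experiments: the operator Q is assumed to be "one pixel to the right" (distance 1), the function names are hypothetical, and the descriptor formulas follow common textbook definitions, which vary slightly between sources.

```python
import math


def glcm(image, levels, offset=(0, 1)):
    """Count pixel pairs (z_v, z_h) whose positions differ by `offset` (rows, cols)."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    G = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                G[image[r][c]][image[r2][c2]] += 1
    return G


def descriptors(G):
    """Summarize a GLCM by five common texture descriptors."""
    total = sum(map(sum, G))
    n = len(G)
    # Normalize counts to joint probabilities p(v, h).
    cells = [(v, h, G[v][h] / total) for v in range(n) for h in range(n)]
    mu_v = sum(v * q for v, _, q in cells)
    mu_h = sum(h * q for _, h, q in cells)
    sd_v = math.sqrt(sum((v - mu_v) ** 2 * q for v, _, q in cells))
    sd_h = math.sqrt(sum((h - mu_h) ** 2 * q for _, h, q in cells))
    return {
        "energy": sum(q * q for _, _, q in cells),
        "contrast": sum((v - h) ** 2 * q for v, h, q in cells),
        "entropy": -sum(q * math.log2(q) for _, _, q in cells if q > 0),
        "homogeneity": sum(q / (1 + abs(v - h)) for v, h, q in cells),
        # Undefined for a constant image (zero standard deviation).
        "correlation": sum((v - mu_v) * (h - mu_h) * q for v, h, q in cells)
        / (sd_v * sd_h),
    }


# Toy 3x3 image with 3 gray levels (hypothetical data).
image = [[0, 0, 1],
         [1, 2, 2],
         [2, 2, 0]]
G = glcm(image, levels=3)
features = descriptors(G)
```

For the toy image, six horizontal pairs are counted, and the pair (2, 2) occurs twice, so G[2][2] is 2.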

Kernel Spectral Regression
Kernel spectral regression (KSR) is a dimensionality reduction method based on manifold learning and subspace learning [18]. KSR assumes that the original data lie on a low-dimensional manifold embedded in the high-dimensional observation space. The manifold learning algorithm keeps each sample close to its neighbors, so that the low-dimensional manifold structure contained in the high-dimensional data can be recovered. KSR only needs to solve a set of regularized least squares problems, which yields large savings in both time and memory. KSR makes efficient use of label and local neighborhood information to discover the intrinsic discriminant structure in the data. The algorithmic procedure is stated below.
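(1) Constructing the adjacency graph. Let $G$ denote a graph with $m$ nodes, where the $i$-th node corresponds to the sample $x_i$. If $x_i$ shares the same label with $x_j$, put an edge between nodes $i$ and $j$.
(2) Choosing the weights:
$$W_{ij} = \begin{cases} 1/l_k, & \text{if } x_i \text{ and } x_j \text{ both belong to the } k\text{-th class} \\ 0, & \text{otherwise} \end{cases}$$
where $W_{ij}$ is the weight of the edge joining vertices $i$ and $j$, and $l_k$ is the number of samples in the $k$-th class.
(3) Responses generation. Find $y_0, y_1, \cdots, y_{c-1}$, the $c$ generalized eigenvectors with the largest eigenvalues of the eigenproblem
$$W y = \lambda D y,$$
where $D$ is a diagonal matrix whose $(i, i)$-th element equals the sum of the $i$-th column of $W$, and $c$ is the number of classes.
(4) Regularized kernel least squares. Find $c - 1$ vectors $\alpha_1, \cdots, \alpha_{c-1} \in \mathbb{R}^m$, where $\alpha_k$ ($k = 1, \cdots, c - 1$) is the solution of the linear system
$$(K + \delta I)\,\alpha_k = y_k,$$
where $K$ is the $m \times m$ kernel matrix with $K_{ij} = K(x_i, x_j)$, $I$ is the identity matrix, and $\delta \geq 0$ is the regularization parameter. The function $f(x) = \sum_{i=1}^{m} \alpha_i^k K(x, x_i)$ is the solution of the following regularized kernel least squares problem:
$$\min_{f \in \mathcal{H}_K} \sum_{i=1}^{m} \left(f(x_i) - y_i^k\right)^2 + \delta \|f\|_K^2.$$
Finally, a sample $x$ is embedded into the $(c - 1)$-dimensional subspace by
$$x \rightarrow z = \Theta^T K(:, x), \qquad \Theta = [\alpha_1, \cdots, \alpha_{c-1}],$$
where $K(:, x) = [K(x_1, x), K(x_2, x), \cdots, K(x_m, x)]^T$.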
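The procedure can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments. The Gaussian kernel is taken as exp(-gamma * ||x - y||^2), the regularization parameter delta and all function names are hypothetical, and the eigenproblem is not solved numerically: with within-class weights 1/l_k the leading eigenvectors span the class indicator vectors, so the responses are obtained by orthogonalizing those indicators against the all-ones vector.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """K(a, b) = exp(-gamma * ||a - b||^2) for all rows of A and B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def ksr_fit(X, labels, gamma=0.5, delta=0.01):
    """Return Theta = [alpha_1, ..., alpha_{c-1}] for training data X."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    m = X.shape[0]
    # Responses: class indicators orthogonalized against the trivial
    # all-ones eigenvector, which is then dropped (leaves c - 1 vectors).
    indicators = (labels[:, None] == classes[None, :]).astype(float)
    Q, _ = np.linalg.qr(np.column_stack([np.ones(m), indicators]))
    Y = Q[:, 1:len(classes)]
    # Regularized kernel least squares: (K + delta * I) alpha_k = y_k.
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + delta * np.eye(m), Y)


def ksr_transform(X_train, Theta, X_new, gamma=0.5):
    """Embed new samples: z = Theta^T K(:, x). Use the same gamma as in fitting."""
    return gaussian_kernel(X_new, X_train, gamma) @ Theta


# Two synthetic classes (hypothetical data), so the embedding is 1-dimensional.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(3.0, 0.5, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)
Theta = ksr_fit(X, labels)
Z = ksr_transform(X, Theta, X)
```

With two classes the embedding has a single dimension, and the two classes fall on opposite sides of zero in this toy example.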

Defect Recognition Algorithm
The defect recognition algorithm is the core of the surface quality inspection system. Generally, a defect recognition algorithm consists of image preprocessing, image feature extraction, and image classification. To retain more comprehensive information from the surface images of continuous casting slabs, we do not carry out image preprocessing. Figure 3 is the schematic diagram of the defect recognition algorithm. The details are as follows.
(4) Dimensionality reduction. The extracted features are high-dimensional and redundant, which is not conducive to subsequent classification. Therefore, we first use KSR to reduce the dimensionality of the feature matrix of the training set and then use the projection matrix obtained from the training set to reduce the dimensionality of the feature matrix of the test set. The high-dimensional feature matrix is reduced to a (c − 1)-dimensional subspace, where c is the number of classes.
(5) Defect classification. First, the low-dimensional feature matrix is normalized to [−1, 1]. Then, the low-dimensional features and label data are input into the SVM [26] for training and classification. Finally, the surface defects of the continuous casting slabs are recognized.
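The normalization in step (5) can be illustrated with a short sketch. It assumes column-wise min-max scaling whose range is estimated on the training set and reused for the test set, mirroring how the KSR projection is reused; the text does not state which normalization was used, and the function names and data are hypothetical.

```python
def fit_range(train_rows):
    """Column-wise minimum and maximum of the training features."""
    columns = list(zip(*train_rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def scale_to_unit_interval(rows, lo, hi):
    """Map each feature linearly so the training range becomes [-1, 1]."""
    return [
        [2.0 * (x - l) / (h - l) - 1.0 if h > l else 0.0
         for x, l, h in zip(row, lo, hi)]
        for row in rows
    ]


# Hypothetical low-dimensional features (two columns for illustration).
train = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
test = [[5.0, 35.0]]

lo, hi = fit_range(train)
train_scaled = scale_to_unit_interval(train, lo, hi)
test_scaled = scale_to_unit_interval(test, lo, hi)
```

Test values that fall outside the training range map outside [−1, 1], as the second test feature does here; the scaled features would then be passed to the SVM.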

Experiments and Discussions
This section is organized as follows. Section 5.1 introduces the sample database used in the experiments. Section 5.2 explains some important parameter settings. In Section 5.3, the feature extraction results of DNST are presented and compared with three commonly used multi-scale methods and one texture extraction method. In Section 5.4, the experimental results of the proposed scheme are presented and compared with other feature combination schemes. The advantages of the proposed scheme in classification time and accuracy are discussed in detail in Section 5.5. Finally, we analyze the specific classification and visualization results of the proposed scheme.

Sample Database
The samples were collected by an online surface inspection system on a continuous casting slab production line in a steel plant. The defect database consists of 496 samples of two types: positive samples and negative samples. The 222 positive samples contain crack defects. The 274 negative samples comprise three types of images: scales, lighting variations, and slag marks. The cracks are defects, and the other three types of samples are pseudo-defects. Because pseudo-defects are the main cause of false classification, they are labeled as a sample type of their own. The odd-numbered samples were used for the training set, and the even-numbered samples were used for the test set. All sample images were cropped to 128 by 128 pixels for classification.
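The split described above can be sketched as follows, assuming the samples are numbered from 1 to 496 (the numbering scheme is an assumption). The last line is only a consistency check on the reported figures, not a result taken from the paper: with 248 test samples, 239 correct classifications would correspond to an accuracy of 96.37%.

```python
# Class sizes stated in the text.
n_positive, n_negative = 222, 274
n_samples = n_positive + n_negative

# Assumed numbering 1..496: odd numbers train, even numbers test.
sample_ids = list(range(1, n_samples + 1))
train_ids = [i for i in sample_ids if i % 2 == 1]
test_ids = [i for i in sample_ids if i % 2 == 0]

# Consistency check: accuracy (in percent) for 239 correct out of the test set.
accuracy_if_239_correct = round(100 * 239 / len(test_ids), 2)
```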

Parameter Setting
To test the feature extraction performance of the proposed scheme, it is compared with wavelet, Contourlet, DST, GLCM, etc. Some important parameters are listed as follows.
DNST: The parameter was chosen from [1 2], [0 1 2], etc. For example, when the parameter is set to [1 2], the first scale has eight shearlet direction filters and the second scale has 16 direction filters, which produces a total of 24 high-frequency subbands and one low-frequency subband. The number of DNST features is (8 + 16 + 1) × 2 = 50.
GLCM: The gray levels of the original image were compressed to 8, 16, 24 levels, etc. The distance parameter was set to 1.
Wavelet: The wavelet type was chosen from "Haar," "db2," etc. The decomposition level was set to 2, 3, or 4.
KSR: The kernel type was "Gaussian," and the kernel parameter was set to 0.001.
SVM: We chose the radial basis function (RBF) kernel as the kernel function of the SVM classifier, and the kernel parameter γ was iterated over all values from 0 to 4 with a step length of 0.01. The other parameters took their default values.
Our experiments were run on a ThinkPad E440 PC equipped with a 2.29 GHz Intel i7 processor and 8 GB of RAM. The application software is MATLAB, published by MathWorks.
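The DNST feature count in the example above can be reproduced with a short sketch. Two points are assumptions rather than statements from the text: the rule that a shear level k yields 2^(k+2) direction filters is extrapolated from the single example (level 1 gives 8, level 2 gives 16), and the factor of 2 is read as two values per subband. The function name is hypothetical.

```python
def dnst_feature_count(shear_levels, values_per_subband=2):
    """Number of DNST features for a list of per-scale shear levels."""
    # Assumed rule: 2^(k + 2) direction filters for shear level k.
    directional_subbands = sum(2 ** (k + 2) for k in shear_levels)
    # One additional low-frequency subband.
    return (directional_subbands + 1) * values_per_subband


count_two_scales = dnst_feature_count([1, 2])       # (8 + 16 + 1) * 2
count_three_scales = dnst_feature_count([0, 1, 2])  # value under the assumed rule
```

For [1 2] this gives the 50 features stated in the text; the value for [0 1 2] holds only if the assumed rule is correct.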

Comparison of DNST Feature
Numerous experiments were carried out for each method, and the best and average values were taken as the objective basis of comparison. The experimental results are shown in Figure 4. The DNST feature achieves the highest classification results: the best accuracy is 89.92%, and the average accuracy is 89.36%. The Contourlet and DST schemes obtained the same recognition result. The average accuracy of DNST is 1.34 percentage points higher than that of DST and Contourlet, 3.15 percentage points higher than that of wavelet, and 5.97 percentage points higher than that of GLCM. This is because DNST has excellent spatial localization properties and directional selectivity, so it captures the defect features of continuous casting slabs more accurately. The classification accuracy obtained by GLCM is the lowest, which indicates that texture information alone is not sufficient to represent the features of continuous casting slabs. Figure 4 also shows that the four multi-scale, multi-directional feature extraction methods are superior to GLCM. In addition, the difference between the best accuracy and the average accuracy of the wavelet is 88.31% − 86.21% = 2.1 percentage points, which indicates that the wavelet is less robust to parameter changes. The corresponding difference for DNST is 89.92% − 89.36% = 0.56 percentage points, the lowest value, which shows that DNST is more robust to parameter changes.
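The robustness comparison is simple arithmetic on the accuracies quoted above, sketched here for reference. The average accuracies for DST/Contourlet and GLCM are derived from the stated differences and are not quoted directly in the text.

```python
# (best accuracy, average accuracy) in percent, as quoted in the text.
quoted = {"DNST": (89.92, 89.36), "wavelet": (88.31, 86.21)}

# Gap between best and average accuracy: smaller means less parameter sensitivity.
gaps = {name: round(best - avg, 2) for name, (best, avg) in quoted.items()}

# Average accuracies implied by the stated differences from DNST.
dnst_avg = quoted["DNST"][1]
implied_avg = {
    "DST/Contourlet": round(dnst_avg - 1.34, 2),
    "GLCM": round(dnst_avg - 5.97, 2),
}
```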

Comparison of Feature Combination
In addition to classification accuracy, other commonly used evaluation metrics include recall, false positive rate (FPR), F-measure, precision, and the area under the receiver operating characteristic (ROC) curve (AUC) [27]. A lower FPR indicates better feature extraction performance, while higher values of the other metrics indicate better feature extraction performance.
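For reference, the count-based metrics above can be computed from a binary confusion matrix as in the following sketch. The counts are hypothetical toy values and are not taken from Table 1; AUC is omitted because it requires classifier scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (cracks as the positive class)."""
    recall = tp / (tp + fn)        # share of true cracks that are found
    fpr = fp / (fp + tn)           # share of pseudo-defects flagged as cracks
    precision = tp / (tp + fp)     # share of flagged samples that are cracks
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "recall": recall,
        "fpr": fpr,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```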
To verify the superiority of the proposed DNST-GLCM-KSR scheme, we compared it with the Contourlet-KLPP scheme in reference [1], the DST-KLPP scheme in reference [2], the Contourlet-GLCM-KLPP scheme in reference [11], and several similar feature combination schemes. Table 1 lists the classification results for several evaluation metrics. The DNST-GLCM-KSR scheme achieves the lowest FPR and the highest precision, F-measure, AUC, and accuracy, which indicates that it has the best overall performance of the seven schemes. In addition, when only DNST features are extracted, reducing the dimensionality with KLPP gives better metrics than KSR, except for AUC. When DNST-GLCM features are extracted, reducing the dimensionality with KSR gives better metrics than KLPP. These results show that both dimensionality reduction techniques are effective, and which one is better must be determined experimentally. The two techniques share the same principle but differ in their computation.
Taking accuracy as an example, the DNST-GLCM-KSR scheme achieved the highest accuracy of 96.37%, which is 2.82 percentage points higher than that of reference [1] and 2.02 percentage points higher than that of references [2] and [11], indicating that the proposed scheme is superior to the traditional ones. With the same dimensionality reduction technique, the accuracy of Contourlet-KLPP is 93.55%, that of DST-KLPP is 94.35%, and that of DNST-KLPP is 95.16%, which indicates that DNST extracts more discriminative features of continuous casting slabs than Contourlet and DST. These results show that DNST-GLCM-KSR is an excellent feature fusion approach for continuous casting slabs.

In order to verify the superiority of the proposed scheme DNST-GLCM-KSR, we compared it with the Contourlet-KLPP scheme in reference [1], the DST-KLPP scheme in reference [2], Contourlet-GLCM-KLPP in reference [11], as well as with several similar combination feature schemes. Table 1 lists classification results of some evaluation metrics. From Table 1, the DNST-GLCM-KSR scheme achieves the lowest value of FPR, the highest values of precision, F-measure, AUC, and accuracy, indicating that the proposed scheme obtained the best comprehensive performance in the seven schemes. Besides, when only extracting DNST features, using the KLPP algorithm to reduce dimension can achieve better metrics than that of using KSR except AUC metrics. When extracting DNST-GLCM features, using KSR algorithm to reduce dimension can achieve better metrics than that of using KLPP. The above shows that both dimensionality reduction technologies are effective, and which one is better to use depends on experiments. The principle of the two technologies is the same, but the calculation technic is different. Taking the accuracy metrics as an example, DNST-GLCM-KSR scheme achieved the highest accuracy of 96.37%, which is 2.82% higher than that of reference [1] and 2.02% higher than that of [2] and [11], indicating that the proposed scheme is superior to the traditional ones. When using the same dimensionality reduction technology, the accuracy of Contourlet-KLPP is 93.55%, DST-KLPP is 94.35%, and DNST-KLPP is 95.16%, which indicates that the DNST can extract more discriminant features of continuous casting slabs than that of Contourlet and DST. The above results show that DNST-GLCM-KSR is an excellent feature fusion approach for continuous casting slabs. Table 2 lists the results of different combined features of DNST. 
As Table 2 shows, the training-set and test-set accuracies of DNST-GLCM are 96.77% and 90.73%, respectively, whereas those of DNST are 93.55% and 89.92%. Both DNST-GLCM results are higher, which indicates that combining DNST features with GLCM texture features improves recognition accuracy. In addition, the accuracy of DNST-KSR is 94.35% − 89.92% = 4.43 percentage points higher than that of DNST, and the accuracy of DNST-GLCM-KSR is 96.37% − 90.73% = 5.64 percentage points higher than that of DNST-GLCM. This shows that KSR effectively removes redundant and interfering features and improves recognition accuracy. We also observed that KSR shortened the classification time from tens of seconds to several seconds. This is because KSR reduces the feature dimensionality to c − 1, where c is the number of classes. The continuous casting slab samples consist of positive and negative samples, so c = 2 and the features are reduced to a single dimension. Finally, note that the number of DNST subbands at which the highest recognition accuracy is reached differs between feature combinations.
Figure 5 shows the recognition accuracy of the training set and test set with and without KSR. As the SVM kernel parameter γ increases, the training-set accuracy of DNST-GLCM gradually rises and finally approaches 100%, while the test-set accuracy first rises and then falls. This indicates that the DNST-GLCM features are sensitive to the SVM kernel parameter; in other words, the features are complex and not conducive to classifier learning. For DNST-GLCM-KSR, the training-set and test-set accuracies are both high, and the curves fluctuate little.
The above results show that the DNST-GLCM-KSR scheme makes the extracted features more discriminative and easier to learn and classify, and that it is strongly robust to the SVM kernel parameter.
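One way to see why a large γ can push training accuracy toward 100% while hurting test accuracy is to look at the kernel itself. The sketch below assumes the SVM uses the Gaussian (RBF) kernel k(x, y) = exp(−γ‖x − y‖²), which is the usual kernel with a γ parameter but is not stated in this section. The sample points and the helper name `rbf_kernel` are purely illustrative.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2). Assumed kernel form."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two illustrative feature vectors (hypothetical values, not slab data).
x = [0.2, 0.5]
y = [0.6, 0.1]

for gamma in (0.1, 1.0, 10.0, 100.0):
    # k(x, x) stays at 1, while k(x, y) shrinks toward 0 as gamma grows.
    print(f"gamma={gamma:>6}: k(x, x)={rbf_kernel(x, x, gamma):.3f}, "
          f"k(x, y)={rbf_kernel(x, y, gamma):.6f}")
```

As γ grows, each sample resembles only itself, so the kernel matrix approaches the identity and the classifier can memorize the training set without generalizing. This is consistent with the behavior reported for the DNST-GLCM curves in Figure 5.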

Comparisons of Dimensionality Reduction
Table 3 lists the detailed test-set classification results of the DNST-GLCM-KSR scheme. The accuracy for positive samples (crack defects) is 95.50%, and the accuracy for negative samples is 97.08%. Four pseudo-defects are misclassified as crack defects, and five crack defects are misclassified as pseudo-defects, because defects of different classes can look similar. The false alarm rate for cracks is 4/110 = 3.64%, and the missing alarm rate is 5/111 = 4.50%. These figures meet the needs of engineering applications for continuous casting slabs.
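The percentages quoted from Table 3 can be cross-checked with a few lines of arithmetic. The sketch below uses confusion counts inferred from the text: 111 crack samples with 5 missed, and 4 pseudo-defects flagged as cracks. The count of 133 correctly classified negative samples is our inference from the stated 97.08% negative accuracy, not a figure given explicitly, and the variable names are our own.

```python
# Confusion counts for the DNST-GLCM-KSR test set.
tp = 106  # cracks classified as cracks (111 cracks - 5 missed)
fn = 5    # cracks classified as pseudo-defects (stated in the text)
fp = 4    # pseudo-defects classified as cracks (stated in the text)
tn = 133  # assumption: inferred so that tn / (tn + fp) matches 97.08%

def pct(numerator, denominator):
    """Return a ratio as a percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)

positive_accuracy = pct(tp, tp + fn)               # accuracy on crack samples
negative_accuracy = pct(tn, tn + fp)               # accuracy on pseudo-defects
total_accuracy = pct(tp + tn, tp + fn + fp + tn)   # overall accuracy
false_alarm_rate = pct(fp, tp + fp)                # 4 / 110, as in the text
missing_alarm_rate = pct(fn, tp + fn)              # 5 / 111, as in the text

print(positive_accuracy, negative_accuracy, total_accuracy)
print(false_alarm_rate, missing_alarm_rate)
```

Under these counts the overall accuracy works out to the reported 96.37%. Note that the false alarm rate as computed in the text divides by the 110 samples predicted to be cracks, so it is the complement of precision rather than the false positive rate (FPR), which would divide by all negative samples.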
Figure 6 visualizes the DNST-GLCM-KSR features of the test set. The samples lie on a straight line because the feature dimensionality is 1. Pink circles represent positive samples, blue circles represent negative samples, and the upper left corner shows a locally enlarged view. Figure 6 is consistent with Table 3: four pseudo-defects are misclassified as cracks, and five crack defects are misclassified as pseudo-defects. The plot shows that the DNST-GLCM-KSR features reflect the similarity between defect images, with small intra-class scatter and large inter-class scatter. This analysis shows that the proposed method can effectively recognize crack defects of continuous casting slabs in images with complex backgrounds and interference factors.
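The single feature dimension behind Figure 6 follows from how spectral regression builds its regression targets. The sketch below is our own reading of the standard construction, not code from the paper: class indicator vectors are orthogonalized against the all-ones vector with Gram-Schmidt, the all-ones direction is discarded, and c − 1 target vectors remain. The function name `sr_responses` and the toy labels are hypothetical, and the subsequent kernel regression step is omitted.

```python
def sr_responses(labels, tol=1e-12):
    """Build spectral-regression targets: Gram-Schmidt on class indicators
    against the all-ones vector, keeping the c - 1 non-trivial directions."""
    classes = sorted(set(labels))
    n = len(labels)
    basis = [[1.0] * n]  # all-ones vector goes first and is dropped at the end
    for c in classes:
        v = [1.0 if lab == c else 0.0 for lab in labels]
        for b in basis:
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if sum(vi * vi for vi in v) > tol:  # skip vectors that became ~zero
            basis.append(v)
    return basis[1:]  # discard the all-ones vector

# Toy binary labels: 1 = crack, 0 = pseudo-defect (hypothetical sample).
labels = [1, 1, 0, 0, 0]
responses = sr_responses(labels)
print(len(responses))  # number of target vectors, c - 1
print(responses[0])    # one constant value per class
```

With two classes only one target vector survives, so the learned projection maps each sample to a scalar, which is why the samples in Figure 6 fall on a line.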

Figure 6. Visualization results.

Conclusions
According to the direction and texture information of surface defects of the continuous casting slabs with complex backgrounds, a new feature extraction approach DNST-GLCM-KSR is proposed, which combines multi-scale and multi-directional DNST features with GLCM texture features and uses KSR technology to reduce dimensionality. The experimental results are as follows.
(1) The DNST feature achieved both the highest average accuracy and the highest best accuracy, and it characterizes defects of continuous casting slabs better than Contourlet, DST, wavelet, and GLCM features.
(2) The training-set and test-set accuracies of the DNST-GLCM feature were 96.77% and 90.73%, respectively, both higher than those of the DNST feature alone. Combining DNST and GLCM features therefore improves the recognition accuracy for continuous casting slabs.
(3) The recognition accuracy of the DNST-GLCM-KSR scheme is 5.64 percentage points higher than that of DNST-GLCM, and its classification time is shorter. KSR therefore both improves recognition accuracy and shortens classification time.
(4) The proposed scheme extracts more discriminative defect features, reaching a recognition accuracy of 95.50% for crack defects and a total accuracy of 96.37%. It provides a new method for surface defect recognition of continuous casting slabs.
(5) Future work should collect more defect samples, establish a complete sample database, and further improve the recognition accuracy for crack defects.
Author Contributions: K.X. contributed to the methodology of the study and investigation. X.L. contributed significantly to formal analysis and manuscript writing. P.Z. and H.L. helped perform the analysis with constructive discussions.