A Systematic Review of Machine-Vision-Based Leather Surface Defect Inspection

Abstract: Machine-vision-based surface defect inspection is one of the key technologies to realize intelligent manufacturing. This paper provides a systematic review on leather surface defect inspection based on machine vision. Leather products are regarded as the most traded products all over the world. Automatic detection, location, and recognition of leather surface defects are very important for the intelligent manufacturing of leather products, and are challenging but noteworthy tasks. This work investigates a large amount of literature related to leather surface defect inspection. In addition, we also investigate and evaluate the performance of some edge detectors and threshold detectors for leather defect detection, and the identification accuracy of the classical machine learning method SVM for leather surface defect identification. A detailed and methodical review of leather surface defect inspection with image analysis and machine learning is presented. Main challenges and future development trends are discussed for leather surface defect inspection, which can be used as a source of guidelines for designing and developing new solutions in this field.


Introduction
Leather and its products are regarded as the most traded products all over the world, with an annual international trade of more than USD 80 billion [1]. To produce leather products with novel design and comfort, the choice of leather has become the key factor to determine the success or failure of manufacturers. This inspection process mainly includes leather defect detection, location, identification, unavailable area division, and quality grade determination. Reliable and effective inspection including detection and classification of leather surface defects is very important for the leather industry with leather as the main raw material, such as leather footwear and handbag manufacturers [2]. The traditional detection and classification of leather surface defects are performed by human inspectors who tend to miss considerable numbers of defects because human beings are basically inconsistent and ill-suited for such simple and repetitive tasks [3]. Furthermore, manual inspections are slow and labor-intensive tasks. These factors have become bottlenecks restricting the leather industry [4].
In the past decades, amazing progress has been made in applying intelligent systems to solve practical problems in the fields of medicine, telecommunications, finance, medical diagnosis, transportation, information retrieval, energy, and so on [5]. The requirements of automation have revolutionized the production mode of the manufacturing industry. From resource optimization to industrial inspection, experts and intelligent systems have been applied in almost all types of industrial processing. Automatic defect inspection of industrial products is one of the important application scenarios of such intelligent systems, and it is also one of the key technologies to realize intelligent manufacturing [6]. Some research has been carried out on automated inspection of metal surfaces [7], textile fabrics [8][9][10], structural health monitoring, and so on [11][12][13]. With the rapid development of intelligent manufacturing, leather product manufacturing has also entered a new stage of development [3][4][5].
Since the 1990s, scholars and suppliers of automatic inspection equipment have paid increasing attention to the automatic inspection of leather surface defects. However, we investigated relevant enterprises in regions with a developed leather products industry, such as Guangdong and Zhejiang provinces in China (the highest producer, importer, and exporter of leather products around the world [1]), and found that many enterprises still rely on traditional manual defect inspection of leather. Some enterprises have realized partly automatic, partly manual defect inspection, but a truly fully automatic defect inspection system has not yet been realized. Relatively few works have been conducted on automated leather surface defect inspection, mainly because of the difficult nature of the problem [3]. It is very difficult to construct exact inspection models because the appearance and size of defects vary greatly [3][4][5]. It is almost impossible to find two defects with the same shape and size, even if they belong to the same defect class [3]. Automatic detection, location, and recognition of leather surface defects are interesting but challenging problems. Automatic leather defect inspection systems are expected to make rapid progress in the near future.
In this work, we systematically reviewed a large amount of literature over the past three decades, and provided an extensive overview of the research on automatic detection and recognition of leather defects based on image processing and machine learning. In doing so, we investigated and evaluated the performance of some edge detectors and threshold detectors for leather defect detection, as well as the accuracy of the SVM-based leather surface defect identification, and we strive to provide a clear direction for researchers and engineers to select, design, or implement the architecture of visual detection and recognition of leather surface defects.

Vision-Based Leather Surface Defect Inspection System
The requirements for leather surface defect inspection can be divided into three different levels: "what is the defect" (classification), "where is the defect" (location), and "what is the defect shape and how large is the area" (segmentation). The inspection technology of leather surface defects is mainly based on machine vision inspection methods [14].
As shown in Figure 1, similar to other visual surface defect inspection systems, the basic components of a machine vision system for leather defect automatic inspection include leather surface image acquisition, image processing, image analysis, data management, and human-machine interface [2]. Based on the defect location, shape, and area detected by the defect detection module, as well as the defect type detected by the defect identification module, combined with the location and various contextual characteristics, the applications of automatic grading of leather quality and intelligent layout of leather are realized with the assistance of the leather quality expert system. Stable, reliable, and effective automatic detection and recognition of leather surface defects are the key techniques to realize intelligent manufacturing of leather products.
In the last decade, many machine-vision-based techniques were developed in surface defect inspection, not limited to the leather surface. These methods can be mainly divided into two categories, namely, the traditional image processing method and the machine learning method, which is based on handcrafted features or shallow learning techniques. Machine-learning-based methods generally include two stages of feature extraction and pattern classification. By analyzing the characteristics of the input image, a feature vector describing the defect information is designed, and the feature vector is then fed into a classifier model trained in advance to determine whether the input image contains a defect. In recent years, deep neural network methods have achieved excellent results in many computer vision applications, such as natural scene classification, face recognition, fault diagnosis, and target tracking. This review focuses on the application of the above methods in the field of leather surface defects. Taking "leather defect detection", "leather defect identification", "leather surface", and "defect inspection" as keywords, we retrieved more than 65 English documents and more than 20 Chinese documents published since 1990 from the ScienceDirect, IEEE Xplore, and CNKI databases. Figure 2 presents the methods of leather surface defect inspection used in this literature. In the next few sections, we will analyze and compare the relevant technologies and their applications in this field.

Image Acquisition
A leather surface image has three characteristics [15]: (i) large imaging area, i.e., a whole skin can reach 2 × 3 m; (ii) small defect size, i.e., the defect area can be as small as 150 μm × 150 μm, the maximum average diameter of the thin spots is about 0.98 mm, and the minimum average circular spot diameter is about 1.20 mm; and (iii) a textured surface, where the defects are usually hidden in the irregular texture background of the leather. Therefore, leather image acquisition requires a large camera field of view and high resolution. The key factors affecting the acquisition of leather surface defect images are the camera and the illuminant.
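As a rough back-of-the-envelope check of why these figures call for very high resolution, the sketch below estimates the pixel count needed to cover a 2 × 3 m hide while still resolving a 150 μm defect. The sampling margin of two pixels per smallest defect and the function name `pixels_needed` are assumptions made for illustration; they are not taken from [15].

```python
def pixels_needed(field_um, defect_um, pixels_per_defect=2):
    """Pixels along one axis so the smallest defect spans `pixels_per_defect` pixels.

    Lengths are integer micrometres so that the ceiling division is exact.
    """
    return -(-field_um * pixels_per_defect // defect_um)  # ceiling division


# Figures quoted in the text: a 2 m x 3 m hide and a 150 um x 150 um defect.
width_px = pixels_needed(2_000_000, 150)
height_px = pixels_needed(3_000_000, 150)
total_megapixels = width_px * height_px / 1e6

print(width_px, height_px, round(total_megapixels))
```

Under these assumptions a single exposure would need on the order of a gigapixel, which is one way to see why line-scan acquisition, multi-camera stitching, or very high-resolution sensors are considered in the following subsection.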

Camera
In actual production, the leather is usually moved uniformly in a single direction to a designated location before processing. Therefore, online inspection often adopts the line-scan camera for image acquisition. At present, online surface defect inspection for leather and other flat, wide, and continuous products mainly adopts the line-scan mode, which can detect most defects. However, some leather surface defects, such as stamping and ink marks, are anisotropic. When images of such anisotropic defects were acquired with a line-scan camera alone, the missed-detection rate reached 22% [16].
The area-scan camera can obtain two-dimensional information and allows images to be measured intuitively. Hence, many researchers chose the traditional CCD camera for collecting leather images. To overcome its small field of view while retaining high resolution, one scheme is to move the CCD camera with a control system and scan the effective area of the whole leather; an alternative is to image with multiple cameras. Both schemes require image fusion to obtain the entire leather image. He et al. [17] proposed an image splicing technique based on Gabor Zernike moments of geometric summary triangle texture block. They sought to overcome the complexity and slow speed of image mosaicking algorithms and to realize rapid and accurate splicing of sequence images in large-area leather visual inspection. Ho et al. [18] presented a real-time image capturing system using four cameras at 30 fps and stitched their views together to create a panoramic video with a resolution of 1280 × 960 pixels. However, image acquisition based on image fusion increases the complexity of the image processing algorithm and requires a complex control system.
With the development of ultra-high definition (UHD) CCD area-scan camera technology, UHD-based whole leather imaging has emerged. Deng et al. [19] used a UHD CCD area-scan camera to image the whole leather at one time. The system features fast imaging, a simple imaging process, no multi-view image fusion, and good image quality. As the cost of UHD CCD cameras falls, this approach is likely to become the main method of leather image acquisition. However, it is still necessary to solve the problems of uneven lighting and the overlapping of leather edge shadow and background [20].
Chen et al. [5] carried out a pilot research study in which they used hyperspectral imaging (HSI) to implement surface inspection in pixel level detection, which employed the spectral information of leather defects instead of the spatial information processing techniques to effectively identify leather defects. Hyperspectral image has become an emerging technology and has been extensively used in the domains of geology, agriculture, global change, and national defense, with highly promising industrial potential [5]. Since hyperspectral data volume is very large, high data storage capacity is required, and reducing data volume is also a topic worth exploring. Their work [5] is a pilot study and guideline for HSI in the detection of wet blue leather to design appropriate algorithms.

Illuminant
The light source and its illumination mode directly affect the image acquisition quality and inspection efficiency. The illumination uniformity and brightness of the target surface are important indicators of the light source. Due to the influence of texture, it is difficult to accurately identify the printing and dyeing or indentation defects of textured leather under conventional lighting. Fan et al. [20] found that the brightness varied with the distance between the imaging plane and the light source, resulting in uneven illumination.
In leather defect inspection, common lighting sources include high-frequency fluorescent lamp, energy-saving lamp, and LED array lamp. High-frequency fluorescent lamps and energy-saving lamps are suitable for large-area lighting with relatively poor uniformity. LED light source has high luminous efficiency and good stability, especially the small luminous surface, which makes it easy to carry out secondary optical design. At present, the method of uniform lighting using LED is mainly LED with array distribution, whose uniformity can reach more than 90%. Ring LED, plane and strip light source, and arch light source can well realize uniform and high illumination, but they all belong to the coaxial lighting system, that is, the illumination light is generally symmetrically distributed. Wang et al. [21] suggested that the printing and dyeing defects of some textured leather can be highlighted only through unilateral asymmetric uniform lighting, i.e., off-axis lighting, and they designed a set of off-axis LED curved surface array lighting for leather defect inspection, which provides a new idea for improving the image acquisition quality of leather surface defects. Unfortunately, in most of the literature on leather surface defect inspection, the lighting design of the collected image was not described in detail.

Traditional Image-Processing-Based Leather Visual Inspection
As shown in Figures 1 and 2, the early leather visual inspection technology was mainly based on traditional image processing methods. These methods use the primitive attributes reflected by local anomalies to detect and segment defects, which can be further divided into the structural method [22][23][24], threshold method, spectral method, texture analysis method [25], and some other segmentation methods based on specific theories (such as fuzzy clustering method [26], saliency method), etc. These methods have been applied to leather surface defect inspection in different scenarios.

Structure Method
The structural method includes edge and morphological operations. Edge detection is a commonly used image segmentation technique, using a series of mathematical methods to determine the presence of edges or lines (formally known as discontinuities) and to outline them in digital images in an appropriate manner. In the early 1990s, Limas-Serafim [22][23][24] applied a multi-resolution pyramid algorithm to segment leather defects. The main idea is to enhance the edges of the object through a multi-resolution method and to eliminate most of the edges that arise from the background texture. Limas-Serafim et al. [23] built three pyramids to obtain characterization images. The first pyramid was constructed based on the mean of the two highest values in the neighborhood. The second pyramid had a Rosenfeld's cone with 16 directions from the first pyramid. The third pyramid was built with a small number of edges, which had to satisfy certain directional consistency and strength requirements. Defect segmentation was performed by connecting the nodes of the edge pyramid, and an edge-weighted function was defined for linking the nodes at different resolutions. Edges at different resolutions are linked if they belong to the same object and rejected if they belong to the random background. The algorithm was applied to calf leather defects (segmentation of calfskin venules and scar defects caused by animal disease). In this application scenario, neither the threshold-based nor the ordinary edge segmentation algorithm can successfully segment the leather defect. Limas-Serafim [22][23][24] only briefly verified the proposed method, which showed promise for reconstructing the boundaries of the object, but did not make a thorough and detailed evaluation of its effectiveness.
In the field of leather defect detection, Kasi et al. [24] evaluated conventional edge detectors such as Sobel, Canny, Prewitt, and Roberts. With these conventional methods, many of the detected edges are false, so they can hardly meet practical needs. The Sobel operator gives relatively better output, but it still does not produce clear, well-defined edges for a given input image and is therefore not suitable for leather samples. Kasi et al. [24] presented a technique for identifying leather defects by using an auto-adaptive edge detection algorithm. The edges were detected with the Sobel operator, and the maximum and minimum values of the absolute gradient were taken as the thresholding conditions. If the threshold is above the actual value, the edges are maximum, and if the threshold is below it, there are no edges. Finally, the edges were refined to obtain clear, continuous image edges; during this refinement, interpolation was used to obtain the local maxima. The algorithm was reported to detect hundreds of leather surface image defects, with clearer and more continuous edges than the traditional edge detectors. However, edge detection was shown for only one kind of defect, so the method again lacks broader validation.
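To make the idea of a gradient-derived threshold concrete, the following is a minimal sketch rather than the algorithm of Kasi et al. [24]: it computes the absolute Sobel gradient and places the threshold between the minimum and maximum gradient values. The `fraction` parameter, the function names, and the synthetic step-edge patch are assumptions introduced for illustration, and the edge refinement by interpolation is omitted.

```python
import numpy as np


def sobel_magnitude(img):
    """Absolute Sobel gradient |gx| + |gy| with edge-replicated borders."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Column difference (right minus left), weighted 1-2-1 across rows.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Row difference (bottom minus top), weighted 1-2-1 across columns.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.abs(gx) + np.abs(gy)


def adaptive_edges(img, fraction=0.5):
    """Threshold the gradient at a point between its minimum and maximum.

    `fraction` is a hypothetical tuning parameter, not a value from [24].
    """
    g = sobel_magnitude(img)
    g_min, g_max = g.min(), g.max()
    threshold = g_min + fraction * (g_max - g_min)
    return g >= threshold, threshold


# Synthetic patch: flat background with a vertical step (a crude "scratch" edge).
patch = np.full((8, 8), 100.0)
patch[:, 4:] = 160.0
edges, threshold = adaptive_edges(patch)
```

Because the threshold is derived from each image's own gradient range, it adapts to overall contrast, but on a strongly textured leather background the texture gradients themselves can dominate that range.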
Liong et al. [14] utilized edge detectors and statistical approaches as feature extractors and obtained a classification accuracy rate of 84% from a sample of approximately 2500 pieces of 400 × 400 leather patches. Qingyuan et al. [27], Popov et al. [28], Lovergine et al. [29], and Kwak et al. [3] applied morphological operations to leather defect inspection, which were often combined with other graphic segmentation algorithms.
In this work, we evaluated the Sobel, Canny, Prewitt, and Roberts detectors combined with morphological operations for the inspection of four kinds of leather defects (scratch, rotten surface, holes, and needle eye), as shown in Figure 3. The code was implemented with the Halcon toolkit, a machine vision software development kit widely used in industry. The results for the four edge detectors are shown in Table 1, where 20 images were tested for each kind of defect. As shown in Table 1, the edge detectors with morphological operations cannot detect leather defects very well. Among the four defects, only holes can be completely detected, with a success rate between 60 and 75%. For the other three defects, only part of the defect information can be detected from the image. We therefore conclude that traditional edge detection algorithms are only suitable for the less challenging cases of leather surface defect detection.

Threshold Method
Threshold-based segmentation has been extensively used as a tool for image segmentation. The method assumes that defect pixels and background (normal leather) pixels can be distinguished by their grayscale values. Since pixels in a defect region are most likely darker or brighter than the background, the defect can be separated from sound leather by thresholding. Theoretically, the distribution density function of the pixel grayscale values of a leather surface image can be approximately expressed as a combination of three normal distributions, given by [30]:

$$p(x) = \sum_{i \in \{0, d, b\}} \frac{w_i}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right)$$

where $\mu_b > \mu_0 > \mu_d$, $(\mu_0, \sigma_0^2)$ are the mean and variance of the background, $(\mu_d, \sigma_d^2)$ and $(\mu_b, \sigma_b^2)$ are the means and variances of the darker and brighter parts of the defects, respectively, and $w_i$ are the mixing proportions of the three components. However, owing to the small population of defects, the part of the distribution in the histogram reflecting the defects is not significant enough to form independent peaks [30]. The threshold methods include the Otsu method [31], the histogram method [3,32], quadtree decomposition [33], etc. The Otsu method is the optimal threshold method based on discriminant analysis. Yeh et al. [31] used the Otsu method to detect defects when establishing a compensation standard for leather trading. However, the Otsu method may fail when the ratio of background pixels to defect pixels in an image is too large [31], so it is not well suited to leather surface defect inspection, where defects usually occupy only a small fraction of the image.
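The sensitivity of the Otsu method to the proportion of defect pixels can be illustrated with a small sketch. The implementation below is the textbook between-class-variance formulation, not the code used in [31], and both pixel populations are synthetic assumptions: one is balanced, the other has a handful of dark "defect" pixels against a broad background.

```python
import numpy as np


def otsu_threshold(gray):
    """Grey level t maximising between-class variance; pixels <= t form class 0."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)               # class-0 probability for each candidate t
    mu_cum = np.cumsum(prob * levels)  # cumulative first moment
    mu_total = mu_cum[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros(256)
    sigma_b[valid] = ((mu_total * w0[valid] - mu_cum[valid]) ** 2
                      / (w0[valid] * w1[valid]))
    return int(np.argmax(sigma_b))


# Balanced case: equal numbers of dark (50) and bright (200) pixels.
balanced = np.array([50] * 500 + [200] * 500, dtype=np.uint8)
t_balanced = otsu_threshold(balanced)

# Rare-defect case: 6 dark pixels (20) against 6000 background pixels in 100..159.
background = np.repeat(np.arange(100, 160, dtype=np.uint8), 100)
rare = np.concatenate([background, np.full(6, 20, dtype=np.uint8)])
t_rare = otsu_threshold(rare)
```

In the balanced case the threshold falls between the two populations, whereas in the rare-defect case it falls inside the background range, so the background is split in two and the defect pixels are not isolated.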
Most studies on automatic threshold methods involve bimodal or multimodal distribution histograms. In practice, the global information cannot accurately describe the local region because of uneven illumination and color changes on the leather surface. The small neighborhoods of the pixels of interest are usually considered. However, due to the small proportion of defect regions to the entire leather surface, most of the histograms of the small sub-images remain unimodal even though these small neighborhoods contain defects. Some thresholding methods take advantage of the fact that the histograms in many sub-images become bimodal or multimodal for leather defect segmentation [3].
The grayscale distributions of leather surface defects and noise often overlap, and the only two distinct differences between noise and defects are their density and size. This complicates the separation of defects from noise using only traditional histogram-based threshold methods (such as fixed or adaptive thresholds). Since a single histogram-based threshold technique could not meet the requirements of leather defect inspection, Kwak et al. [3] used a two-step segmentation procedure based on thresholding and morphological processing. After thresholding the gray-level image, the resulting binary image is processed by a combination of binary morphological erosion and dilation operations along with median filters to remove noise and fill the holes in detected defects. A binary connected component analysis is then applied to the processed binary image.
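The general shape of such a threshold-then-morphology pipeline can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the procedure of Kwak et al. [3]: the 3 × 3 structuring element, the order of closing before opening, the 4-connectivity, the fixed threshold, and all function names are choices made here, and the median filtering step is omitted.

```python
from collections import deque

import numpy as np


def _neighbourhood_stack(mask, pad_value):
    """Stack the nine 3x3-neighbourhood shifts of a boolean mask."""
    p = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])


def erode(mask):
    return _neighbourhood_stack(mask, True).all(axis=0)


def dilate(mask):
    return _neighbourhood_stack(mask, False).any(axis=0)


def label_components(mask):
    """4-connected component labelling by breadth-first flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not labels[nr, nc]:
                    labels[nr, nc] = count
                    queue.append((nr, nc))
    return labels, count


def segment_defects(gray, threshold):
    binary = gray < threshold        # assumes defects darker than the background
    closed = erode(dilate(binary))   # closing fills small holes inside defects
    opened = dilate(erode(closed))   # opening removes isolated noise pixels
    return label_components(opened)


# Synthetic patch: a 5x5 dark defect with a one-pixel hole, plus one noise pixel.
gray = np.full((14, 14), 150)
gray[3:8, 3:8] = 40
gray[5, 5] = 150
gray[11, 11] = 40
labels, n_defects = segment_defects(gray, threshold=100)
```

On this patch the hole is filled, the noise pixel is discarded, and a single connected component remains, from which the defect area and bounding box can be read.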
Histogram-based image analysis is invariant to image rotation and scaling, is little affected by perspective, and is fast to compute, but classification may go wrong because the histogram carries no information about the spatial distribution of color. Many classification criteria exist: the χ2 test, histogram intersection, correlation coefficients, the Kolmogorov-Smirnov distance, divergence, etc. Georgieva et al. [32] discussed the application of the χ2 criterion to the image analysis of leather surfaces and to obtaining their standard histograms, and considered the χ2 criterion one of the most applicable criteria for large image sizes.
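A χ2 comparison against a standard (defect-free) histogram can be sketched in a few lines. The symmetric form of the χ2 distance, the bin count, and the synthetic patches below are assumptions for illustration; the exact statistic used by Georgieva et al. [32] is not specified here.

```python
import numpy as np


def grey_histogram(gray, bins=32):
    """Normalised grey-level histogram over the range 0..255."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist / hist.sum()


def chi_square_distance(h, h_ref, eps=1e-12):
    """Symmetric chi-square distance between two normalised histograms."""
    return 0.5 * float(np.sum((h - h_ref) ** 2 / (h + h_ref + eps)))


# Reference patch: a regular two-level "texture" of grey values 120 and 130.
reference = np.tile([120, 130], 32).reshape(8, 8)

# Test patch: the same texture with a quarter of the pixels darkened to 30.
defective = reference.copy()
defective[:2, :] = 30

h_ref = grey_histogram(reference)
d_same = chi_square_distance(grey_histogram(reference), h_ref)
d_defect = chi_square_distance(grey_histogram(defective), h_ref)
```

A patch would be flagged when its distance to the standard histogram exceeds a chosen limit. As the text notes, the measure ignores where in the patch the dark pixels lie.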
Krastev et al. [33] investigated 12 histogram and statistical features together with quadtree decomposition for the analysis of leather surface images. Quadtree decomposition partitions an image into homogeneous blocks, which makes it possible to investigate how the feature values change with the area size. It is a suitable method for quickly localizing defective regions, but additional local analysis is needed to determine the exact defect contour. The larger the proportion of defective to non-defective pixels in the examined area, the larger the difference in feature values. The most appropriate feature sets for leather surface defect inspection are the histogram ends (left and right borders), the median, and the mean.
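How a quadtree decomposition narrows down a defective region can be shown with a short recursive sketch. The homogeneity test (grey-level range within a tolerance), the tolerance value, the minimum block size, and the mean-based flagging are assumptions made for this example rather than the criteria of Krastev et al. [33], and the image is assumed to be square with a power-of-two side.

```python
import numpy as np


def quadtree(gray, tol=20, min_size=2):
    """Split square blocks recursively until each is homogeneous.

    A block counts as homogeneous when max - min <= tol (an assumed criterion).
    Returns a list of leaf blocks as (row, col, size).
    """
    leaves = []

    def split(r, c, size):
        block = gray[r:r + size, c:c + size]
        if size <= min_size or int(block.max()) - int(block.min()) <= tol:
            leaves.append((r, c, size))
            return
        half = size // 2
        for dr in (0, half):
            for dc in (0, half):
                split(r + dr, c + dc, half)

    split(0, 0, gray.shape[0])
    return leaves


# Synthetic 16x16 patch with a 2x2 dark defect.
gray = np.full((16, 16), 128)
gray[4:6, 4:6] = 30

leaves = quadtree(gray)
# Flag leaves whose mean grey level is far below the background level.
suspect = [(r, c, s) for r, c, s in leaves if gray[r:r + s, c:c + s].mean() < 100]
```

Large homogeneous areas are accepted after very few tests, while blocks keep shrinking around the defect, which matches the observation that the method localizes defects quickly but does not by itself give the exact contour.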
Since color is an important attribute for visual recognition and discrimination, and leathers come in different colors, Kumar et al. [34] presented a color-based thresholding segmentation approach for leather defect identification using a multi-level thresholding function with a given range of color features. In that work, the specific range of values for the color attributes is identified from the color histogram to detect the different leather defects. With defect-specific thresholds, the approach could efficiently detect several types of defects, such as chick wire, heavy grain, and folding marks, for the automated real-time inspection of leather defects.
In this work, we evaluated the local threshold and Otsu methods combined with morphological operations for the inspection of the four kinds of leather defects shown in Figure 3. The code was implemented with the Halcon toolkit. The results for the two threshold detectors are shown in Table 2. As shown in Table 2, both threshold detection methods perform poorly for leather defect detection, in some cases even worse than the edge detectors evaluated above.

Texture Method
Most natural surfaces have rich textural content, and these background macrotextures can be fine and convex, producing many edges that are as valuable as the edges of other objects. Some machine vision systems often require defect inspection from the perspective of texture analysis. In each point of an image with a directional texture, the directional vector field can be evaluated as a 2D vector whose direction corresponds to the main local direction of the gradient and a length proportional to its consistency (isotropic degree).
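One common way to estimate such a direction and its consistency is the structure tensor of the image gradient. The sketch below uses that technique as an assumption; the works reviewed in this section may compute the vector field differently. For simplicity it averages over a whole patch rather than a sliding window.

```python
import numpy as np


def dominant_orientation(patch):
    """Dominant gradient direction (radians) and coherence in [0, 1] of a patch."""
    gy, gx = np.gradient(patch.astype(float))
    jxx = (gx * gx).mean()
    jyy = (gy * gy).mean()
    jxy = (gx * gy).mean()
    angle = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    coherence = np.hypot(jxx - jyy, 2.0 * jxy) / (jxx + jyy + 1e-12)
    return float(angle), float(coherence)


# Directional texture: stripes whose intensity varies along the column axis.
stripes = np.tile(np.sin(np.arange(16)), (16, 1))
angle_s, coherence_s = dominant_orientation(stripes)

# Non-directional pattern: a radially symmetric bowl with no preferred direction.
yy, xx = np.mgrid[-8:9, -8:9]
bowl = (xx ** 2 + yy ** 2).astype(float)
angle_b, coherence_b = dominant_orientation(bowl)
```

The striped patch gives a coherence close to 1 with the gradient along the column axis, while the symmetric pattern gives a coherence close to 0. A defect such as a scar or fold would show up as a local departure from the orientation and coherence of the surrounding texture.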
Some literature [29,35,36] separated defects from a complex nonhomogeneous background by analyzing the leather texture properties and their strongly oriented structure. The patterns to be analyzed were represented in an appropriate parameter space using a neural network [29]; in this way, a parameter vector is associated with each different textured region in the original image. Finally, a filter process, based on knowledge about the parameter vectors representing the leather without defects, detected and classified any abnormality [29]. In the literature [35], Branca et al. developed an algorithm that removes textural background by discriminating the signal singularities through an analysis of wavelet transform maxima indicating the location of edges in images. The presented work [35] integrated an oriented singularity detection framework based on wavelet theory analyzing compositional textures through the vector fields of dominant local gradient orientations. Lovergine et al. [36] presented some results obtained using a defects detector based on oriented texture analysis, which reveals itself to be useful for a few classes of leather defects, such as scars or folds. These kinds of defects can be detected by using a black and white camera running over the leather patch and by classifying textures based on their gradient orientations and local coherence. A morphological segmentation procedure was applied to the regularized oriented texture field to extract probable defective areas. In addition, literature [27] and [37] also utilize the texture properties of leather for leather defect inspection, the former combining mathematical morphology and the latter combining the edge detector with a texture analysis method to extract defects.
The work of Branca et al. demonstrated the effectiveness of defect inspection methods based on leather texture analysis, but these methods have a somewhat high computational cost and poor interference resistance and are not suitable for inspecting minor defects. Extensive texture analysis may be computationally expensive and may fail to meet production requirements. Furthermore, some defects may be too subtle to strongly influence the parameters of the statistical model [30].

Spectral Method
The spectral methods commonly include the Fourier transform, wavelet transform, and Gabor transform. A texture image has a certain periodicity in its spatial distribution, and its power spectrum is discrete and regular. For directional texture, the directivity is well preserved in the Fourier spectrum. For random textures, the response distribution of the spectrum is not limited to specific directions [38]. As a global transform, the Fourier transform reflects the signal as a whole but cannot localize frequency content in space. It is therefore more suitable for detecting global and single defects, and it has difficulty with leather images containing small or multiple defects [39].
The Gabor transform is a kind of short-time Fourier transform. A Gaussian window function is added to extract the local information of the image, which overcomes the inability of the Fourier transform to perform local analysis. It is a multi-scale analysis method in which the time-frequency window can be adjusted and the window changes with the frequency domain. It provides good direction and scale selectivity and is insensitive to illumination changes, and thus it is suitable for texture analysis. The advantage of this transform is that it describes texture well and can be applied to both structural and statistical textures. The disadvantage is that defect-free samples must be obtained in advance to determine the optimal parameters, which limits portability and robustness. The Gabor transform is mainly used to detect large defects, but it performs poorly on small defects and on the segmentation of complex random texture images [38,40]. Yin et al. [39] proposed a leather defect inspection algorithm based on the wavelet transform with the Gabor function as the basis function, exploiting the multi-directional characteristics of the Gabor function and the multi-resolution property of the wavelet transform.
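The direction selectivity described above can be seen in a minimal sketch of a real-valued Gabor kernel, i.e., a cosine carrier under a Gaussian window. The kernel size, wavelength, and sigma are arbitrary assumptions chosen to match a synthetic stripe pattern; as noted above, in practice these parameters would have to be tuned on defect-free samples.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Real Gabor kernel: a cosine carrier at angle theta under a Gaussian window."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero mean, so flat regions give no response


# Synthetic texture: stripes with a wavelength of 4 pixels along the column axis.
_, xx = np.mgrid[-4:5, -4:5]
stripes = np.cos(2.0 * np.pi * xx / 4.0)

aligned = gabor_kernel(9, wavelength=4.0, theta=0.0, sigma=2.0)
orthogonal = gabor_kernel(9, wavelength=4.0, theta=np.pi / 2, sigma=2.0)

response_aligned = float(np.sum(aligned * stripes))
response_orthogonal = float(np.sum(orthogonal * stripes))
```

The kernel tuned to the stripe direction responds strongly, and the orthogonal one responds weakly. A bank of such kernels over several orientations and wavelengths gives the multi-directional, multi-scale description that the Gabor-based methods rely on.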
In the wavelet transform, the frequency components of the image are organized such that the lower and higher frequencies are separated. Because of its multi-resolution analysis, it also captures image variations at different scales, which makes the wavelet transform more suitable for leather defect inspection [41]. Sobral et al. [42] presented a methodology based on the wavelet transform to detect leather defects, where the undecimated Haar wavelet and eight optimized filters were used. The methodology used a bank of optimized filters, where each filter is tuned to one defect type. Filter shape and wavelet sub-band were selected by maximizing the ratio between feature values on defect regions and on normal regions. The methodology was evaluated using a database of about 150 samples. The authors claimed that the method was able to achieve the same recognition rate as an experienced human operator. Adamo et al. [43] presented a two-dimensional wavelet-based denoising technique for high-resolution leather images. This method decomposed the image into a suitable number of levels, carried out a thresholding operation on the details, and finally, using the threshold levels, produced an estimate that takes the actual noise level into account. He et al. [44] developed a wavelet band selection procedure to automatically determine the number of resolution levels and the decomposed sub-images that best discriminate defects and remove repetitive texture patterns in the image. Adaptive binary thresholding was then used to separate the defective regions from the uniform gray-level background in the restored image. The methodology does not rely on textural features to detect local anomalies and thereby avoids the limitations of feature-extraction methods. With proper selection of a smooth sub-image or a combination of detailed sub-images at different multi-resolution levels for image reconstruction, the global repetitive texture pattern can be efficiently removed and only local anomalies are preserved in the restored image.
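The effect of reconstructing from a selected sub-image can be illustrated with a one-level Haar decomposition. This is a simplified sketch, not the band selection procedure of [44]: it uses an unnormalised (averaging) Haar transform, keeps only the smooth sub-image, and applies a fixed rather than adaptive threshold. The checkerboard texture and the threshold of 25 grey levels are assumptions.

```python
import numpy as np


def haar_decompose(img):
    """One-level 2D Haar transform (averaging form) of an even-sized image."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    smooth = (a + b + c + d) / 4.0
    d_cols = (a - b + c - d) / 4.0   # variation across columns
    d_rows = (a + b - c - d) / 4.0   # variation across rows
    d_diag = (a - b - c + d) / 4.0   # diagonal variation
    return smooth, (d_cols, d_rows, d_diag)


def haar_reconstruct(smooth, details):
    """Inverse of haar_decompose."""
    d_cols, d_rows, d_diag = details
    out = np.empty((2 * smooth.shape[0], 2 * smooth.shape[1]))
    out[0::2, 0::2] = smooth + d_cols + d_rows + d_diag
    out[0::2, 1::2] = smooth - d_cols + d_rows - d_diag
    out[1::2, 0::2] = smooth + d_cols - d_rows - d_diag
    out[1::2, 1::2] = smooth - d_cols - d_rows + d_diag
    return out


# Repetitive texture (checkerboard of 100/140) with a local 4x4 dark defect.
rows, cols = np.mgrid[0:16, 0:16]
image = 100.0 + 40.0 * ((rows + cols) % 2)
image[6:10, 6:10] -= 50.0

smooth, details = haar_decompose(image)
round_trip = haar_reconstruct(smooth, details)

# Keep only the smooth sub-image: the period-2 texture cancels out.
zeros = tuple(np.zeros_like(band) for band in details)
restored = haar_reconstruct(smooth, zeros)
anomaly = np.abs(restored - np.median(restored)) > 25.0
```

In the restored image the checkerboard collapses to a uniform grey level and only the defect block stands out, so a simple threshold recovers it. Real leather texture is not strictly periodic, which is why the choice of levels and sub-bands matters in practice.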

Clustering Method
Leather surface defects can also be viewed as textured images spatially composed of some collection of local irregular points, so defect detection can also be seen as a clustering process. The most widely used in practice is the Fuzzy C-Means (FCM) algorithm.
Based on Particle Swarm Optimization (PSO) and fuzzy clustering algorithms, He et al. [17] proposed a leather surface defect detection method. This method makes full use of the global optimization and rapid convergence of PSO to quickly find the attribution of sample points, and combines it with the fuzzy clustering algorithm to cluster the leather surface texture information. The methodology was validated by using a 2000 × 1500 pixel leather defect image for defect segmentation, where it was superior to conventional methods such as Sobel, Canny, Prewitt, and Roberts edge detection. However, the generalization and stability of this methodology require more validation. Cui [45] applied a fuzzy clustering algorithm to realize the automatic detection of defects and automatically determine the optimal cluster number. The texture feature vector at the center of each neighborhood is the average of five measures calculated along the co-occurrence directions of the leather image. However, only a 256 × 256 grayscale leather image was used to verify its effectiveness. Although the reported experimental results are valid, the methodology also lacks generalization. In the experimental verification of FCM-based defect detection for unhealed scars and concave defects, Yan [46] found that the detection accuracy was seriously affected by texture interference: the defects were submerged in the texture interference, and the subsequent post-processing could not separate them. Based on the work of Cui et al. [45], Chen [47] further evaluated the improved FCM algorithm. After image segmentation, the difference between the defect regions and the non-defect regions becomes bigger, but the final result of separating defects cannot be achieved. The defect regions are somewhat disconnected, which may bring less noise in the process of segmentation.
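For readers unfamiliar with FCM, the following minimal sketch shows the standard alternating update of memberships and cluster centers on one-dimensional gray values. It is a textbook version under simplifying assumptions (evenly spaced initial centers, a fixed number of iterations, fuzzifier m = 2), not the PSO-assisted or improved variants discussed above; the function and parameter names are illustrative.

```python
def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    """Standard FCM on scalar features (requires n_clusters >= 2)."""
    lo, hi = min(values), max(values)
    # Assumption: start with centers spread evenly over the value range.
    centers = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            row = []
            for d_i in dists:
                denom = sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                row.append(1.0 / denom)
            memberships.append(row)
        # Center update: mean of the data weighted by u^m.
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(n_clusters)
        ]
    # Hard labels: cluster with the highest membership for each sample.
    labels = [max(range(n_clusters), key=lambda k: u[k]) for u in memberships]
    return centers, memberships, labels


# Toy gray values: dark texture pixels and two bright "defect" pixels.
gray_values = [10, 12, 11, 13, 200, 205]
centers, memberships, labels = fuzzy_c_means_1d(gray_values)
```

In practice the scalar gray value would be replaced by a texture feature vector per pixel or per window, and the absolute difference by a vector distance.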

Visual Salient Method
Image saliency object detection mainly focuses on the prominence of the whole image, the goal of which is to uniformly highlight the object area that can attract visual attention in the image, suppress the background area that cannot attract visual attention, and require the detected object to have clear boundaries; it is widely used in computer vision fields such as image segmentation [48].
Zhu et al. [49] segmented the leather surface defects based on a visual salient map that is fused by extracting the color and brightness salient features of leather images, respectively. The methodology has a good inspection effect for defects with clear boundaries, abrasions, healing and digging, insect spots, and small area, and its performance is better than FCM and threshold-based inspection. For the scattered defects such as unhealed knife wounds, its performance is slightly worse, which is mainly due to the differences in their internal saliency that results in more superficial defects that cannot be highlighted. Although this inspection method is not disturbed by texture and can realize the rapid and effective inspection of texture images, leather is susceptible to the influence of factors such as light source strength and color temperature, and this method cannot meet the versatility of leather defect inspection. In addition, the edge of the defects cannot be well identified, especially for the more scattered defects.
Liu et al. [50] proposed a leather defect detection system based on photometric stereo vision and image saliency. The photometric stereo technology was used to realize image enhancement, which effectively avoids the problem that leather images are easily affected by lighting because of their different colors and textures. At the same time, the image spectral residual algorithm effectively removes the influence of background information, which makes up for the disadvantage that the traditional salient target inspection algorithm cannot effectively extract the foreground. In the inspection of leather surface scratch, hole, fold, and chromatic difference defects, the accuracy rate reached 96.84%. The algorithm proposed by Liu et al. [50] shows a certain degree of robustness, versatility, and noise resistance.
Ding et al. [51] quantitatively classified leather defects by statistical analysis of geometry and grayscale to obtain salient features of each defect. Then, the salient features are combined with those extracted by convolutional neural network for defect inspection, where the features extracted by the convolutional neural network are dominant, which improves the accuracy of defect inspection by using convolutional neural networks.

Heuristic-Algorithm-Based Defect Segmentation
As an alternative to texture analysis, histogram thresholding, clustering, and so on, various biologically inspired algorithms were explored in image segmentation. Jamadar et al. [52] developed a fast convergence Particle Swarm Optimization algorithm (FCPSO) for segmenting defective regions in complex leather images. The Particle Swarm Optimization (PSO) is a heuristic algorithm loosely inspired by birds flocking in search of food. Compared with conventional PSO and other PSO variants, the above algorithm was found to be efficient for various leather defect images. Gray level co-occurrence matrix (GLCM) texture features from the segmented leather were extracted as input to different supervised classifiers, namely, Neural Network, Decision Tree, Support Vector Machine, Naïve Bayes, k Nearest Neighbor, and Random Forest. FCPSO along with the Random Forest algorithm using optimum feature set had good discrimination between defective and non-defective leather.
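To make the role of PSO in segmentation concrete, the sketch below uses a plain global-best PSO to search for a gray-level threshold. It is not the FCPSO variant of Jamadar et al. [52]; the fitness function (Otsu's between-class variance), the inertia and acceleration coefficients, and all function names are assumptions chosen for illustration.

```python
import random


def between_class_variance(pixels, t):
    """Otsu-style fitness of threshold t (higher is better)."""
    low = [p for p in pixels if p <= t]
    high = [p for p in pixels if p > t]
    if not low or not high:
        return 0.0
    w0, w1 = len(low) / len(pixels), len(high) / len(pixels)
    m0, m1 = sum(low) / len(low), sum(high) / len(high)
    return w0 * w1 * (m0 - m1) ** 2


def pso_threshold(pixels, n_particles=20, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO over a single threshold value."""
    rng = random.Random(seed)
    lo, hi = min(pixels), max(pixels)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]  # personal best positions
    best_fit = [between_class_variance(pixels, p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g], best_fit[g]  # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # keep inside the range
            fit = between_class_variance(pixels, pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
            if fit > g_fit:
                g_pos, g_fit = pos[i], fit
    return g_pos, g_fit


# Toy bimodal data: dark background pixels and bright defect pixels.
pixels = [10, 12, 11, 13] * 5 + [200, 205] * 5
threshold, fitness = pso_threshold(pixels)
```

For a single threshold an exhaustive search would be cheaper; swarm-based search becomes attractive when several thresholds or cluster parameters are optimized jointly.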

Summary of This Section
Traditional image processing methods often need multiple thresholds tuned to the various defects. They are very sensitive to lighting conditions and background colors. When a new problem arises, those thresholds need to be adjusted, or it may even be necessary to redesign the algorithms [6]. Wavelet transform, mathematical morphology, Gabor filtering, fuzzy clustering, edge detection, threshold-based segmentation, and other conventional image processing methods have been applied to leather surface defect inspection, and they show some effectiveness on the reported datasets. However, the related literature is sparse, the research is not deep enough, the test datasets are relatively small, the diversity of defects is insufficient, and the dynamic change of leather defects is not considered, so it is difficult to ensure the generalization performance of these algorithms. In addition to the lack of a suitable benchmark, another problem that hinders the thorough comparative evaluation of leather defect inspection methods is the lack of publicly available software or code for the reported methods [2].

Machine-Learning-Based Methods
In recent years, many defect inspection tasks could be solved by designing a set of features for a certain defect and providing these features to a simple classifier; these methods are also called knowledge-based approaches [8]. In this section, we will investigate these machine learning methods based on handcrafted features or shallow learning techniques for leather surface defect inspection. Machine-learning-based methods generally include two stages of feature extraction and pattern classification.

Feature Extraction of Leather Defect
The features of leather surface defect can be divided into statistical features, spectral features, structural texture features, shape features, color features, and so on. These characteristics of color, texture, and defect shape are widely used to identify the leather image to realize defect inspection [51]. As shown in Table 3, the most used features are statistical features and color features.

(1) Statistical features
Leather inspection is considered to be a very complex problem in the field of texture classification. Like most natural textures, the eigenvalues change greatly and it is easy to form a pseudo-random structure, but it still follows the law of statistical distribution. Statistical methods can be used to analyze the distribution of textures. In the texture feature extraction of leather images, the widely used statistical features of texture mainly include histogram feature and gray level co-occurrence matrix (GLCM) feature.
The histogram of an image represents the distribution of its pixel values and provides much information about the image. Histogram features include the maximum, minimum, mean, median, value range, entropy, and variance. These histogram features are simple to calculate, insensitive to the spatial distribution of color pixels, and invariant to translation and rotation. They have therefore been widely used in the field of surface defect inspection [38].
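As a minimal sketch, the histogram features listed above can be computed from a flat list of gray values as follows. The function name is illustrative, and the use of the population variance and of base-2 entropy are assumptions, since the surveyed papers do not all state their conventions.

```python
import math
from collections import Counter
from statistics import mean, median, pvariance


def histogram_features(pixels):
    """First-order statistics of a flat list of gray values."""
    counts = Counter(pixels)
    n = len(pixels)
    # Shannon entropy (in bits) of the normalized histogram.
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return {
        "max": max(pixels),
        "min": min(pixels),
        "mean": mean(pixels),
        "median": median(pixels),
        "range": max(pixels) - min(pixels),
        "variance": pvariance(pixels),
        "entropy": entropy,
    }


# Toy example: half of the pixels are black, half are white.
features = histogram_features([0, 0, 255, 255])
```

Because only the value distribution is used, shuffling the pixel positions leaves every feature unchanged, which is the insensitivity to spatial layout noted above.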
The gray level co-occurrence matrix (GLCM) is a widely used technique in texture analysis. Since texture is formed by the repeated occurrence of a gray distribution in spatial position, there is a certain gray relationship between two pixels separated by a certain distance in the image space, that is, a spatial correlation of gray levels in the image. The GLCM describes this spatial correlation. Several GLCMs must be constructed for each sliding window that scans the image during segmentation. Each GLCM has an associated angle and displacement, related to the direction and frequency that it represents. The most successful and most frequently used handcrafted texture features in the literature are the Haralick features [52] derived from the GLCM. Based on the GLCM, Haralick calculated 14 statistical features [51]: energy, entropy, contrast, uniformity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, correlation information measure, and maximum correlation coefficient. These statistical features capture the spatial correlation of gray level values that contributes to texture perception. The commonly used feature quantities are contrast, correlation, energy, entropy, and autocorrelation.
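The construction of a GLCM and a few of the features derived from it can be sketched as follows. This is a minimal, non-symmetric GLCM for a single offset, assuming the image is already quantized to integer levels `0 .. levels - 1`; the function names are illustrative, and only contrast, energy, and entropy are shown.

```python
import math


def glcm(img, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r][c]][img[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and entropy of a normalized GLCM."""
    n = len(p)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    energy = sum(v * v for row in p for v in row)
    entropy = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    return {"contrast": contrast, "energy": energy, "entropy": entropy}


# Toy 4-level image; offset (0, 1) pairs each pixel with its right neighbor.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(image, levels=4))
```

With this row/column convention, the directions 0°, 45°, 90°, and 135° at distance 1 correspond to the offsets (0, 1), (-1, 1), (-1, 0), and (-1, -1).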

(2) Color features
Color is an important parameter of image external features. Color features are insensitive to the image change of rotation, translation, and scale. Color models mainly include HSV, RGB, HSI, etc. Common color features include color histogram, color set, color moment, and color aggregation vector.
Bong et al. [53] divided the leather RGB image into three color channels (red, green, and blue), calculated the average, standard deviation, and skewness value in each color channel, and then converted the RGB image into a gray image to obtain the gray moment feature. Finally, the color moment and gray moment of each color channel were combined to form the color moment of the image. At the same time, the color core image features in the gray image were extracted as a part of the feature set [54-57]. Amorim et al. [57] extracted the average value of each color component of HSB and RGB and the 3D histogram value of HSB and RGB color space as part of the leather surface defect feature set.
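The per-channel color moments described above can be sketched as follows. The third moment is taken here as the signed cube root of the third central moment, which is one common convention and an assumption with respect to Bong et al. [53]; the function names are illustrative.

```python
import math


def channel_moments(values):
    """Mean, standard deviation, and skewness-type moment of one channel."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    third = sum((v - mu) ** 3 for v in values) / n
    # Signed cube root keeps the moment in the same units as the pixel values.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mu, math.sqrt(var), skew


def color_moments(pixels):
    """Nine-element feature vector from a list of (R, G, B) tuples."""
    features = []
    for ch in range(3):
        features.extend(channel_moments([p[ch] for p in pixels]))
    return features


# Toy example with four pixels.
moments = color_moments([(0, 5, 8), (0, 5, 8), (0, 5, 8), (8, 5, 0)])
```

The same `channel_moments` function can be applied to the gray image to obtain the gray moments that are concatenated with the color moments.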

(3) Spectral features
Filter transformation transforms the image from the spatial domain to the frequency domain or time-frequency domain. Fourier transform, wavelet transform, and Gabor transform are commonly used. Fourier transform transforms the image into the frequency domain and uses spectral energy or spectral entropy to express texture. Periodicity, directionality, and randomness are the three important factors to characterize texture [54]. The output of the Gabor filter can be used as a texture feature, but the dimension is high. To reduce the amount of data in the feature set, post-processing methods such as smoothing, Gabor energy feature, complex moment feature, and independent component analysis are often applied to the output of the Gabor filter. Wavelet transform organizes the frequency components of the image and separates the low frequency from the high frequency. Due to the multi-resolution analysis of wavelet transform, the extracted features vary across scales. A series of high-frequency sub-band images representing different direction information constitutes images with different resolutions. High-frequency sub-band images reflect the texture characteristics of the image. Therefore, wavelet transform is suitable for leather defect recognition. The traditional pyramid wavelet transform only decomposes the low-frequency part, while the high-frequency part of the texture image may also contain important feature information. Wavelet packet decomposition or tree structure wavelet decomposition can overcome this disadvantage. The wavelet transform method has been widely used to extract image features for surface defect inspection [38]. Jawahar et al. [41] used wavelet transform to extract wavelet statistical features and wavelet co-occurrence matrix features from leather images, such as entropy, energy, contrast, correlation, clustering significance, standard deviation, mean value, and local uniformity, which were used as the input of the classifier. Sobral et al. [42] extracted texture features using the Haar wavelet transform and eight optimized filters to obtain the same recognition rate as an experienced human operator.

(4) Structural texture features
The structural analysis method realizes oriented texture analysis according to the characteristics of texture periodicity and spatial geometry [38]. Generally speaking, the defects on the leather surface are characterized by a specific orientation structure, which can be represented by the orientation field. The orientation field of an image comprises the angle image and the coherence image. The former (representing the dominant local orientation) is computed over a neighborhood of each point from the orientations of gradients evaluated on the original image smoothed using a Gaussian filter. With $G_{ij}\exp(\mathrm{i}\theta_{ij})$ as the polar representation of the gradient vector at the point $(i, j)$, the main gradient direction centered at $(m, n)$ with a $w \times w$ neighborhood can be estimated as Equation (2):

$$\hat{\theta}_{mn} = \frac{1}{2}\arctan\left(\frac{\sum_{i,j} G_{ij}^{2}\sin 2\theta_{ij}}{\sum_{i,j} G_{ij}^{2}\cos 2\theta_{ij}}\right), \qquad (2)$$

where the sums run over the $w \times w$ neighborhood, and the dominant local direction is given by $\phi_{mn} = \hat{\theta}_{mn} + \pi/2$.

The commonly used structural analysis methods also include morphology, graph theory, topology, and so on. Literature [27,28] applied mathematical morphology to analyze the texture features of complex structures. Popov et al. [27] extracted local fractal features of a series of scales based on mathematical morphology for texture classification of brushed leather surfaces. Qing et al. [27] also proposed a texture classification method based on mathematical morphology. The global features were supplemented by local features for the classification of leather made of the same material. Branca et al. [29,35,36] used the structure method to extract the edge features of the image for leather surface defect inspection. By analyzing the oriented structure of the defect, the defect was separated from the complex non-uniform background.

(5) Shape features
In terms of geometry, leather defects can be divided into three types: point, line, and surface. Each type of defect is divided into different categories according to geometry shape. Some defects can be distinguished from other defects by four characteristics: roundness, area, linearity, and width [51]. Among them, roundness and area can be used as the salient feature of black spots and rotten surfaces. Linearity and width can be used as salient characteristics of scratches, necklines, and blood tendons. The area of surface defects such as branding is much larger than that of other surface defects, so the area can be used as the salient feature of branding. Point defects have high roundness and small area, while linear defects have the characteristics of small width and high linearity. Ding et al. [51] produced mathematical statistics on the geometric and gray features of defects, summarized the salient features of leather defects, and proposed an inspection method combining convolution neural network and salient features to detect leather defects.
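A minimal sketch of two such shape descriptors is given below for a defect outline represented as a polygon. The definitions are assumptions, since the text does not give formulas: roundness is taken as 4πA/P² (1 for a circle, smaller for elongated shapes), and the bounding-box aspect ratio is used as a simple stand-in for linearity. The function names are illustrative.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1])
    )


def roundness(points):
    """4 * pi * area / perimeter^2 (assumed definition)."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


def aspect_ratio(points):
    """Long side over short side of the axis-aligned bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return max(width, height) / min(width, height)


# Toy outlines: a compact spot and a thin scratch.
spot = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
scratch = [(0.0, 0.0), (10.0, 0.0), (10.0, 0.5), (0.0, 0.5)]
```

Here the compact spot has the higher roundness, while the scratch has the higher aspect ratio, which mirrors the distinction between point defects and linear defects described above.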

Viana et al. [55] used interaction maps [56] as the feature descriptor for leather defect identification, combined with gray co-occurrence matrices and the RGB and HSB color spaces, to extract texture and color features from a given set of raw hide leather images. The term "interaction map" was originally introduced by Gimel'farb in his Markov Gibbs texture model with pairwise pixel interactions [56]; it refers to the structure of the statistical pairwise pixel interactions evaluated through the spatial dependence of a feature of the extended gray-level difference histogram (EGLDH). The basic assumptions of the feature-based interaction map approach are as follows: (1) Pairwise pixel interactions carry important structural information. (2) Both short- and long-range interactions are relevant. (3) Fine angular resolution is essential. (4) Structural information can be obtained through EGLDH features. This can be achieved more efficiently by analyzing the spatial dependence of the features than by selecting the "optimal" features for a limited number of pre-set spacings. (5) Texture orientation can be defined by the axes of maximum statistical symmetry [56].

Feature Selection
Feature extraction of leather surface images implements a transformation from image space to feature space, but not all features are useful for subsequent defect identification. If the number of extracted features is large, they are likely to contain redundant information, which not only fails to improve the inspection accuracy but also increases the complexity of the image processing algorithm. The purpose of feature selection is to find the truly useful features among the original image features, reduce the algorithm complexity, and improve the accuracy of classification and identification. Commonly used feature selection methods include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Fisher Linear Discriminant Analysis (FLDA), Correlation-Based Feature Selection (CFS), evolutionary algorithms, and popular non-linear dimensionality reduction methods [38].
Amorim et al. [57] evaluated five FLDA-based approaches for attribute reduction. The techniques were tested in combination with four classifiers and several attributes based on co-occurrence matrices, interaction maps, Gabor filter banks, and two different color spaces. Principal Component Analysis plays an important role in these methods. Experiments showed that for the blue wet leather defect inspection without singularity, the best case is to use 24 attributes, and for the original animal skin defect inspection without singularity, the best case is to use 16 attributes.
Villar et al. [58] chose features based on the Sequential Forward Selection (SFS) method, which allows a large reduction in the number of descriptors. These descriptors are computed from the grayscale image and the RGB and HSV color models, and there are 2002 features in total. The extracted descriptors can be classified into seven groups: (i) first-order statistics; (ii) contrast characteristics; (iii) Haralick descriptors; (iv) Fourier and cosine transform; (v) Hu moments with information about the intensity; (vi) local binary patterns; (vii) Gabor features. SFS allows one to rank descriptors based on their contribution to the classification. To determine the number of features required for classification, the following procedure is followed: a classifier is linked to each class of interest. Classifiers are trained with a determined number of features and the percentage of success in the classification is calculated. Successive training of the classifiers is performed, incrementing the number of features based on the ranking provided by SFS. Only 10 characteristics, from the universe of 2002 initially computed, are required.
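The greedy ranking step of SFS can be sketched as follows. This is a generic version, not the implementation of Villar et al. [58]: the scoring function is supplied by the caller (in practice it would be the cross-validated accuracy of a classifier trained on the candidate subset), and the toy score, `GROUP`, and `GAIN` below are hypothetical values used only to show the effect of a redundant feature.

```python
def sequential_forward_selection(n_features, score_fn, k):
    """Greedily add the feature that most improves score_fn, up to k features."""
    selected, history = [], []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Evaluate every candidate together with the already selected features.
        best = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append(score_fn(selected))
    return selected, history


# Hypothetical toy score: each feature carries one kind of information,
# and feature 3 duplicates the information of feature 0 (redundant).
GROUP = ["a", "b", "c", "a"]
GAIN = {"a": 0.5, "b": 0.1, "c": 0.3}


def toy_score(subset):
    return sum(GAIN[g] for g in sorted({GROUP[f] for f in subset}))


ranking, history = sequential_forward_selection(4, toy_score, k=3)
```

The redundant feature is never selected because adding it does not raise the score, which is the behavior that lets SFS shrink a large descriptor pool to a handful of useful features.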

Machine-Learning-Based Identification
Leather surface defect identification is essentially a classification problem. Defects should be classified into appropriate classes according to their cause and origin to locate the source responsible for those defects and take corrective action [3]. This classification process is necessary because it plays an important role in providing information for defect prevention. Traditional leather surface defect identification applies a pattern recognition algorithm to extracted image features such as first-order statistical measures, second-order statistical measures, spectral measures, or image-level descriptors (local binary patterns and Gabor features). Algorithms such as k Nearest Neighbor (KNN), Neural Network (NN), Support Vector Machine (SVM), Bayesian Network (Bayes), and Decision Tree (DT) are widely used in the identification of leather surface defects. Based on the results reported in the literature, Table 3 presents the classification accuracy achieved by these algorithms for leather defect identification.
As can be seen from Table 4, the classification accuracy of most methods reached above 90% [59-68], and the KNN method in the literature [59] even achieved 100%. This performance can be partly attributed to all these methods being evaluated on very small local datasets [2,60]. As shown in Tables 4 and 5, most of the leather defect classification methods in the literature only report selected performance metrics on their own custom data [2], which is one of the main reasons for the difficulty in conducting a comprehensive comparative evaluation of them. Notably, these datasets contain at most 10 categories of defects, and most of them include only three to four categories. Although the dataset used by Jawahar et al. [52,61,62] contains 10 categories of defects, it is divided into only two types: defect and no defect. All datasets used in the literature [14,41,63,66,68] contain only one defect, which is essentially a binary classification.

To further evaluate the performance of the above traditional machine learning methods in leather defect recognition, the SVM is selected for evaluation by using the different feature sets listed in Table 6. It is the most commonly used method for leather defect identification, as shown in Table 5. The dataset of literature [19] is used for the evaluation. SVC with Gaussian, linear, and polynomial kernel functions is evaluated, where the optimal parameters of each are selected by cross-validation. The experimental results for three sets of features are presented in Table 7. As shown in Table 7, in the two groups of experiments that use only texture features, the recognition accuracy of SVC with Gaussian, linear, and polynomial kernel functions is not high. When the color features are added, the maximum accuracy reaches 86% and the performance is greatly improved. Feature extraction and selection therefore have a great impact on the performance of the algorithm. Designing a feature extractor requires rich prior knowledge and is usually done manually by experienced engineers on a case-by-case basis, which makes the development cycle relatively complex and time-consuming. The challenge is that such a method can hardly be generalized or reused and may be inapplicable in a real application.

Feature No. | Feature Description
F1 | The mean and variance of the histogram of the gray image
F2 | The contrast, correlation, energy, entropy, and autocorrelation of the GLCM at 0°, 45°, 90°, and 135°
F3 | Wavelet statistical features and wavelet co-occurrence matrix features [41]
F4 | The mean, variance, skewness, and kurtosis of the color histogram of the RGB and HSV images
F5 | The first, second, and third color moments of the RGB image

Leather products come mainly from cattle, crocodile, lizard, goat, sheep, buffalo, and mink skins. Each kind of animal leather has a different texture, and the animals come from different living environments. Yeh [3] collected and categorized a set of calf leather defects into 7 large categories by shape, 24 defects in regular shapes, and 17 defects of irregular types. Even the same type of defect varies greatly in shape, size, and color. More than 10 defects may be present in one image with different contrasts. Therefore, for the algorithms shown in Table 3, both the number of test samples and the types of defects identified are very different from the leather surface defects encountered in practical industrial applications. Although the traditional machine learning methods shown in Table 3 report high recognition accuracy, our experimental results show that the recognition accuracy only reached 86%. The recognition accuracy is greatly affected by the leather surface defect data and the extracted features. These results must be considered with caution, as each defect is only taken from two different pieces of leather and does not represent all possible configurations of defects, for example, different sizes, colors, and orientations [2]. This also means that there is still a lot of work to be done on traditional machine learning methods.

Deep-Learning-Based Leather Defect Inspection
As described in Section 5, the shape of the leather surface defect image is changeable and random. There may be more than ten defects in one image, and even the same defect can look very different from image to image. The texture statistical feature extraction represented by the traditional gray level co-occurrence matrix is computationally expensive, and its effectiveness is also challenged by the high variation of leather surface defects. Deep learning (DL) adopts a hierarchical structure of multiple neural layers and extracts information from the input data through layer-by-layer processing. This "deep" layer structure allows it to learn representations of complex raw data with multiple levels of abstraction and to learn features directly from the original image. DL models perform feature engineering to yield natural features from images by combining the two traditional steps, feature extraction and classification, into an end-to-end paradigm [52]. DL has been widely used in the field of image processing and has achieved remarkable results. Aslam et al. [2] suggested that the deep learning architecture can be used as a source of guidelines for the design and development of new solutions for leather defect inspection. Currently, DL methods are advancing at a rapid pace and have become a promising data-driven learning strategy for leather surface defect inspection [5,19,69-76]. Different DL-based methods have been applied to leather defect inspection tasks such as detection and identification.

Table 8 lists some DL-based applications for leather surface defect detection. Liong et al. [69] developed an automatic tick bite defect inspection system based on the Mask Region-based Convolutional Neural Network (Mask R-CNN), which can automatically mark the boundary of the defect region. A tick bite causes only slight surface damage on animal skin, which is often overlooked by human inspection. Mask R-CNN is a popular image segmentation model built on a feature pyramid network (FPN) [57] with a ResNet-101 [70] backbone. This is an end-to-end defect detection system. A robot arm is used to collect and mark defects automatically. To form a continuous bounding mask for each defect, all the selected points are connected in a counterclockwise direction using the Graham Scan algorithm. A set of optimal coordinates of the irregular shape of defects is obtained by using the mathematical derivation of geometric graphics. The number of sample images in the train and test datasets is 84 and 500, respectively. To make up for the shortage of training data, the Mask R-CNN model was pre-trained extensively on the Microsoft Common Objects in Context dataset (MSCOCO) [71]. Through transfer learning from the pre-trained model, the parameters (i.e., weights and biases) are iteratively adjusted by learning the features of the leather input images to detect and segment the leather defects. The segmentation accuracy of the algorithm is 70.35%. From the perspective of segmentation accuracy, the robustness and effectiveness of the algorithm leave considerable room for improvement, and only one type of defect is automatically identified. Following this work, Liong et al. [74] developed AlexNet and U-Net-based automatic defect detection techniques. U-Net was utilized to highlight the position of the defect, where the defect types considered in this study were black lines and wrinkles. On 250 defective samples and 125 non-defective samples, the mean Intersection over Union (IoU) and the mean pixel accuracy reached 99.00% and 99.82% for the defect segmentation task, respectively. Chen et al. [5] designed three architectures named 1D-CNN, 2D-UNet, and 3D-UNet to segment, at the pixel level, the defect areas of five wet blue leather defects: brand masks, rotten grain, rupture, insect bites, and scratches. This work is the first analytical study using hyperspectral imaging for wet blue leather at the pixel level. With respect to the characteristics of the defects, 1D-CNN emphasizes defects with spectral features, 2D-UNet emphasizes defects with spatial features, and 3D-UNet simultaneously processes spatial and spectral information in hyperspectral imaging. 1D-CNN has the best result in detecting insect bites. 2D-UNet takes advantage of spatial information, so it performs best on brand masks. 3D-UNet considers spatial and spectral information simultaneously and therefore has the best performance on rotten grain, rupture, and scratch defects.

Table 9 lists some DL-based applications for leather surface defect identification. Murinto et al. [72] used a pre-trained AlexNet [73] to extract the image features of tanned leather and used SVM for classification. The dataset used to validate the model contains 1000 flawless tanned leather images of five types of leather: giant lizard, crocodile, sheep, goat, and cow. The classification performance shows that the deep learning method can better capture the characteristics of leather, and the overall accuracy is 99.97%. However, this paper does not involve defect identification.
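The segmentation metrics quoted above (mean IoU and pixel accuracy) can be computed from a predicted and a ground-truth label mask as in the following minimal sketch. The function names are illustrative, and averaging the IoU over the classes that actually occur is an assumption; individual papers may average per image or per class differently.

```python
def _pairs(pred, truth):
    """Flatten two equally sized label masks into (predicted, true) pairs."""
    return [(p, t) for p_row, t_row in zip(pred, truth) for p, t in zip(p_row, t_row)]


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the true label."""
    pairs = _pairs(pred, truth)
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, truth, n_classes=2):
    """Intersection over Union averaged over the classes that occur."""
    pairs = _pairs(pred, truth)
    ious = []
    for k in range(n_classes):
        inter = sum(p == k and t == k for p, t in pairs)
        union = sum(p == k or t == k for p, t in pairs)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2 x 2 masks: 0 = background, 1 = defect.
predicted = [[0, 1], [1, 1]]
ground_truth = [[0, 1], [0, 1]]
accuracy = pixel_accuracy(predicted, ground_truth)
miou = mean_iou(predicted, ground_truth)
```

Pixel accuracy can look high when defects occupy only a small part of the image, which is why IoU is usually reported alongside it.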

Deep Learning for Leather Defect Identification
Based on ResNet-50, Deng et al. [19] carried out research on the identification of leather defects and effectively classified four types of leather defects: scratch, rotten surface, broken hole, and pinhole. The average classification accuracy reached 92.34%, of which the recognition accuracy for pinholes was 87.2%, so there is still considerable room for improvement. This result is significantly better than the recognition accuracy using SVM shown in Table 7. Ding et al. [51] took nine common leather defects as the detection target and fused the features extracted by a convolutional neural network with salient features to form a feature set; the classification accuracy can reach more than 90%.
Liong et al. [74] applied a pre-trained AlexNet to classify three-category (no defect, black line, and wrinkle) leather images with 250 defective samples and 125 non-defective samples. The best performance obtained is 94.67% for the classification task; however, 375 samples are not enough to train a deep learning model. To address this data scarcity issue, and building on the work of Liong et al. [74], Gan et al. [66] adopted a Generative Adversarial Network (GAN) to discover the feature regularities and produce plausible additional training samples. With the help of the GAN data augmentation strategy, the classification accuracy of the AlexNet-based model [66], which is trained with a relatively small amount of readily captured training data, increased from 94.67% to 100%. Another study [75] utilized AlexNet as the feature descriptor and SVM as the classifier for the identification of noticeable open-cut defects, where the dataset contains 560 leather images with a spatial resolution of 140 × 140 × 3. Among them, 280 images have noticeable open-cut defects on the surface, while 280 images do not have defects at all. The accuracy achieved is 100%.

Summary of This Section
As shown in Tables 8 and 9, we retrieved eight pieces of literature on leather surface defect inspection based on deep learning models. Among them, the convolutional neural network plays an important role in feature engineering: features are learned during CNN training rather than designed by hand, which gives deep learning paradigms high adaptiveness. However, deep learning does not work well with small data. With the smaller leather image datasets that are available, classical ML algorithms based on handcrafted features, such as regression, random forest, and SVM, often outperform deep networks. Unfortunately, for these leather defect detection applications, large datasets are not readily available and are expensive and time-consuming to acquire. In addition, in most of the literature we investigated, the datasets cover only 3-5 kinds of leather defects, and only one work covers 9 kinds. The above work shows that deep learning is a promising tool for leather defect detection, but the depth and breadth of research on deep-learning-based leather defect detection are still insufficient. Liong and Gan's team [14,64–66,69,74–76] has conducted relatively in-depth research in this field, but their research is limited to the detection of a few leather defects such as black line, wrinkle, noticeable open cut, and tick bite.
In fact, defect detection based on deep learning has also been widely used in other industrial scenarios in recent years. In the field of metallic surface defect detection, Natarajan et al. [7] proposed a flexible multi-layered deep feature extraction framework based on a CNN via transfer learning to detect anomalies in anomaly datasets. Masci et al. [77] used a CNN-based multi-scale pyramidal pooling network, which can adapt to input images of different sizes, for the classification of steel defects. Tao et al. [6] proposed an architecture based on a CNN and a cascaded autoencoder for metallic surfaces in complex industrial scenarios, which consists of detection and classification modules. In the field of textured surface defect detection, Qiu et al. [8] proposed a cascaded framework based on a fully convolutional network for pixel-wise surface defect detection, which combines a segmentation stage, a detection stage, and a matting stage. Mei et al. [10] proposed a Gaussian pyramid-based multiscale convolutional denoising auto-encoder architecture (MSCDAE) to detect and localize defects using only defect-free samples, which is an unsupervised learning-based defect inspection approach. Hu et al. [78] extended the standard deep convolutional generative adversarial network (DCGAN) and proposed a DCGAN-based unsupervised method for automatically detecting defects in woven fabrics. Huang et al. [79] proposed a U-Net-based real-time model for ceramic tile defect inspection, which consists of three main components: MCue, U-Net, and Push network. In the field of crack detection on construction surfaces, Cha et al. [80] successively developed two structural damage detection models, based on a CNN and a Faster Region-based CNN, to detect five types of surface damage. In other defect detection applications, Li et al. [81] conducted a systematic review of deep transfer learning for machinery defect detection. Chen et al. [82] developed a vision-based system that applies deep convolutional neural networks (DCNNs) to the defect detection of fasteners on the catenary support device. Napoletano et al. [83] applied region-based CNNs to the detection and localization of anomalies in scanning electron microscope images. Tabernik et al. [84] designed a segmentation-based deep learning architecture for surface-crack detection of an electrical commutator. Long et al. [85,86] presented a self-training semi-supervised deep learning method and a deep hybrid learning approach for machinery fault diagnosis. Zhong et al. [87] proposed a weighted residual regression-based index to provide monotonic trends for gear and bearing degradation assessment. Liu et al. [86] constructed Deep Belief Networks combined with a transfer learning strategy for surface defect detection of solar cell and capsule samples.
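The principle behind the unsupervised approaches cited above [10,78], namely learning what normal texture looks like from defect-free samples only and flagging whatever deviates, can be illustrated with a much simpler stand-in. The sketch below is not the auto-encoder or GAN of those works: it assumes grayscale images stored as nested lists and models only the mean gray level of each patch. All function names, the patch size, and the threshold factor `k` are hypothetical.

```python
from statistics import mean, pstdev


def patch_means(image, size):
    """Mean gray level of each non-overlapping size x size patch,
    keyed by the (row, col) of the patch's top-left corner."""
    means = {}
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            values = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            means[(r, c)] = mean(values)
    return means


def fit_normal_model(normal_images, size):
    """Estimate mean and standard deviation of patch means
    from defect-free images only."""
    values = [m for img in normal_images
              for m in patch_means(img, size).values()]
    return mean(values), pstdev(values)


def detect(image, model, size, k=3.0):
    """Return top-left corners of patches deviating more than k sigma."""
    mu, sigma = model
    tolerance = k * max(sigma, 1e-6)  # guard against zero variance
    return [pos for pos, m in patch_means(image, size).items()
            if abs(m - mu) > tolerance]


# Two defect-free training images and one test image with a dark 2 x 2 region.
normal_a = [[100] * 4 for _ in range(4)]
normal_b = [[104] * 4 for _ in range(4)]
model = fit_normal_model([normal_a, normal_b], size=2)

test_image = [[101] * 4 for _ in range(4)]
for r in range(2):
    for c in range(2, 4):
        test_image[r][c] = 60
flagged = detect(test_image, model, size=2)
```

No defective sample is needed for training here. The cited methods replace the patch-mean statistic with a learned reconstruction, which is what lets them cope with textured surfaces.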
In summary, automated surface-anomaly detection using machine learning has become an interesting and promising area of research, with a very high and direct impact on the application domain of visual inspection. Deep learning methods have become the most suitable approaches for this task [84,88]. These works can inspire us to design and develop new solutions for leather surface defect inspection such as detection and identification.

Discussion and Conclusions
So far, we have summarized and evaluated the application of traditional image processing methods and machine learning models in the field of leather surface defect inspection, including detection and identification. In this section, we discuss the various challenges that exist in the design and deployment of machine-vision-based solutions for leather defect inspection. Furthermore, this review will shed some light on how these challenges can be transformed into opportunities, leading to future research directions in this field.

Challenges and Opportunities
Although leather surface defect inspection is an important subject in industrial inspection, it has not received much attention. Among the literature reviewed, about 50% of the retrieved English papers are conference papers, and 60% of the Chinese papers are master's theses. In terms of author distribution, the main researchers come from China, Brazil, Chile, Australia, India, and other places with relatively developed leather industries, and there is only one article from the United States. Apart from Liong and Gan's team [14,64–66,69,74–76] and Jawahar's team [41,52,61], few teams conduct continuous in-depth research. At present, leather vision systems in actual use are not yet fully automated and intelligent, and manual assistance is still needed for discrimination and identification.
In the leather industry, the earliest machine vision system is LeaVis [89], which requires human operators to draw the boundary of the quality defect area, mark the area with specially designed stamps (called quality marks or Q marks), and indicate defects. The Taurus XD leather cutting system launched by Gerber Technology Co., Ltd., Tolland, Conn., USA, realizes four levels of defect inspection through visual inspection, but it still needs experienced technicians to assist in delineating the defect locations. Lectra, a leading cutting technology and supporting service provider in the industry, developed the DigitLeather leather visual inspection system, which can record leather defect information and divide leather into six quality grades for processing. According to the current literature, these vision systems and the proposed technical methods are aimed at specific defect categories, and the types that can be recognized are very limited. In theory, the algorithms shown in Table 5 have achieved good performance, but there is still a gap between them and real applications. Many problems remain in the practical application of automatic leather surface defect inspection and the corresponding machine vision technology. Relatively little work has been conducted on automated leather defect inspection, mainly because of the difficult nature of the problem. We therefore consider that the following challenges may hinder progress in this field of research.
(1) Small sample problem. Leather defect datasets are relatively small, and the types of defects they cover are incomplete, so they can hardly represent leather defects with changeable morphology. As shown in Tables 3-5, the datasets used in most studies are custom-built. The Nelore and Hereford cattle dataset used by Amorim et al. [57] has 50 images of wet blue leather. The Campo Grande team of Dom Bosco Catholic University in Brazil built a dataset as part of the Brazilian national scientific research and technology development project DTCOURO, which envisages the development of a computer-based, fully automated system for the classification and grading of bovine rawhide and leather. All datasets except DTCOURO are relatively small, which limits extensive evaluation of the developed algorithms. To address these issues, Aslam's team [2] is building a relatively large dataset, as is the authors' team; both teams are expanding the defect categories and data scale of their datasets.
(2) High variance of defects across data samples. Leather images vary randomly in morphology and in defects. There may be more than 10 defects in one image, and even the same defect can look very different in different images. It is practically difficult to construct exact models of leather surface defects for classification because their appearance and size vary greatly.
(3) No unified standard for leather defect identification and classification in the industry. At present, the performance evaluation of developed algorithms is inconsistent, and the lack of common benchmark datasets is another obstacle to progress in this field. The difference in how defects are judged between the leather industry and the leather products industry makes the technical indicators of quality inspection inconsistent across enterprises, which seriously affects the quality of leather product manufacturing. Yeh et al. [15] established a compensation standard for leather defects to complete leather trading, and divided leather defects into seven types. Hoang et al. [90] computerized the quarter rule, which is the standard method for evaluating leather grade in the shoemaking industry. These research results provide a good foundation for establishing a unified standard for leather defect identification and classification, but they need to be further refined for practical application.
(4) Real-time problem. Machine-learning-based defect inspection methods include three main stages in industrial applications: data annotation, model training, and model inference. Real-time performance in real industrial applications depends mainly on model inference. Most current defect inspection methods focus on the accuracy of classification or identification, with little attention to the efficiency of model inference.
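Items (3) and (4) above suggest reporting accuracy and inference efficiency with one shared procedure. The following is a minimal sketch of such an evaluation harness, assuming binary defect masks stored as nested lists. The choice of pixel-level precision, recall, and intersection over union (IoU) is an assumption made for illustration, not a standard proposed in the reviewed literature, and `threshold_detector` is a hypothetical placeholder for any detector under test.

```python
import time


def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives
    over two binary masks of equal size."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def evaluate(predict, images, masks):
    """Pixel-level precision, recall, IoU, and mean inference latency."""
    tp = fp = fn = 0
    elapsed = 0.0
    for image, mask in zip(images, masks):
        start = time.perf_counter()
        pred = predict(image)          # only inference is timed
        elapsed += time.perf_counter() - start
        a, b, c = confusion_counts(pred, mask)
        tp, fp, fn = tp + a, fp + b, fn + c
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "iou": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "mean_latency_s": elapsed / len(images) if images else 0.0,
    }


def threshold_detector(image, level=128):
    """Hypothetical detector: mark pixels darker than `level` as defect."""
    return [[1 if px < level else 0 for px in row] for row in image]


images = [[[200, 50], [40, 210]]]
masks = [[[0, 1], [0, 0]]]
report = evaluate(threshold_detector, images, masks)
```

In this toy case the detector finds the one true defect pixel plus one false alarm, so precision and IoU are 0.5 and recall is 1.0. Running every candidate method through the same function on the same benchmark images is what makes the numbers comparable.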

Future Research Directions
(1) Data augmentation. One reason for the lack of large leather datasets is that most companies are reluctant to share their data with researchers. Leather defect classification and quality grading need to adapt to the high variability of leather defects in industrial environments, so sufficient data have to be collected and defect variations have to be captured to evaluate and improve the performance of the algorithms. In the field of leather defect inspection, data augmentation is one option for obtaining large datasets. It not only increases the number of defects in the dataset but also increases defect variation. To address the common small sample problem in surface inspection, algorithms for generating rare defect samples and expanding them randomly need to be constructed. An important research direction will be to design a data augmentation method dedicated to leather surface defect generation. The most commonly used approach is to obtain more samples through image processing operations such as mirroring, rotation, translation, distortion, filtering, and contrast adjustment. Another common approach is data synthesis, in which individual defects are fused with and superimposed on normal (defect-free) samples to form defect samples. These data augmentation methods are worth trying in the field of leather defect inspection.
(2) Network pre-training and transfer learning. Generally speaking, training deep learning networks with small samples can easily lead to overfitting. Therefore, methods based on pre-trained networks or transfer learning are among the most commonly used for the small sample problem. In the field of leather defect inspection, not many pre-trained models are available. The most closely related field is textured surface inspection, such as textile, wood, and ceramic tile inspection. Whether the weights of these models can be used for transfer learning is a research problem that needs to be investigated.
(3) Reasonable network structure design. A well-designed network structure can also greatly reduce the demand for samples. For example, small sample data can be compressed and expanded based on the compressed sampling theorem, and a CNN can then extract features directly from the compressed samples. Compared with the original image input, compressed sampling can greatly reduce the sample demand of the network. In addition, surface defect inspection methods based on a twin (Siamese) network can also be regarded as a special network design that greatly reduces the sample demand.
(4) Unsupervised or semi-supervised learning. In an unsupervised model, only normal samples are used for training, so there is no need for defective samples. Semi-supervised methods can solve the problem of network training with small samples by using unlabeled samples. These strategies have been used for defect inspection in other industrial scenarios and are worth testing in leather defect inspection.
(5) Accurate semantic segmentation. In addition to identifying defects, it is necessary to segment them accurately in order to extract detailed information such as defect shape, size, position, color, and type. Semantic segmentation is an effective strategy for achieving this with deep neural networks. Fully Convolutional Networks (FCNs) have made good progress in semantic segmentation of practical scenes, medical image segmentation, and industrial defect inspection. Most other deep-learning-based semantic segmentation models are developed from the FCN and may be suitable for leather defect segmentation. AlexNet- and ResNet-based networks can be adapted to the task of leather defect segmentation, but they also need to be studied in depth in combination with the actual situation of leather defects.
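The two augmentation routes named in item (1) above, image processing operations and data synthesis, can be sketched as follows. This is a minimal illustration, assuming grayscale images stored as nested lists. The function names are hypothetical, and simple alpha blending is used as a stand-in for whatever fusion method a real pipeline would apply.

```python
def mirror(image):
    """Horizontal flip."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the 8 variants obtained from rotations and mirroring."""
    variants = []
    current = [list(row) for row in image]
    for _ in range(4):
        variants.append(current)
        variants.append(mirror(current))
        current = rotate90(current)
    return variants


def blend_defect(background, patch, top, left, alpha=1.0):
    """Data synthesis: superimpose a defect patch on a defect-free image.

    Returns the synthetic image and its binary ground-truth mask.
    alpha=1.0 overwrites the background; smaller values blend with it.
    """
    image = [list(row) for row in background]
    mask = [[0] * len(row) for row in background]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            r, c = top + i, left + j
            image[r][c] = round((1 - alpha) * image[r][c] + alpha * value)
            mask[r][c] = 1
    return image, mask


sample = [[1, 2], [3, 4]]
variants = augment(sample)

clean = [[200] * 3 for _ in range(3)]
synthetic, mask = blend_defect(clean, [[40]], top=1, left=1, alpha=0.5)
```

When the training labels are masks or bounding boxes, the same geometric transform has to be applied to the label as to the image. Synthesis has the advantage that the mask comes for free, as `blend_defect` shows.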