A Survey of Vision-Based Methods for Surface Defects’ Detection and Classification in Steel Products

Abstract: In the competitive landscape of steel-strip production, ensuring the high quality of steel surfaces is paramount. Traditionally, human visual inspection has been the primary method for detecting defects, but it is limited in reliability, cost, processing time, and accuracy. Automated visual inspection technologies have been introduced to address these shortcomings. This paper surveys vision-based methodologies for detecting and classifying surface defects on steel products. These methodologies encompass statistical, spectral, texture segmentation-based, and machine learning-driven approaches. Furthermore, various classification algorithms, categorized into supervised, semi-supervised, and unsupervised techniques, are discussed. Additionally, the paper outlines future research directions.


Introduction
In the pursuit of maintaining competitiveness, steel manufacturers have prioritized not only high productivity but also superior product quality. A critical aspect of achieving this goal is the quality control of steel surfaces. Compared to automated alternatives, manual inspection methods prove less reliable in high-speed production environments due to their time-consuming nature, high dependency on inspector expertise, and elevated costs [1]. Consequently, automated inspection systems for steel surfaces have garnered significant interest, promising enhanced performance, efficiency, profitability, cost reduction, and product quality improvement. Automated defect inspection primarily aims to identify and characterize image deformities, offering advantages such as reduced labor costs and increased productivity [2]. Numerous studies advocate for automatic defect detection to bolster quality [3]. In automated inspection systems, surface images of industrial products are typically captured using charge-coupled device (CCD) cameras under specific lighting conditions, as shown in Figure 1. Subsequently, various image processing methods are employed, followed by defect detection utilizing structural or statistical techniques [4]. Automated defect inspection generally comprises two tasks: defect classification, which addresses "What is the defect?", and defect detection, which determines "What and where is the defect?" and outputs the respective scores and bounding boxes [5]. Figure 2 illustrates both defect detection and classification tasks on steel surfaces.
Recently, numerous algorithms have been proposed to detect surface defects, with steel surfaces being a focal point [6,7]. Several reviews have investigated detecting and classifying defects on steel surfaces, offering valuable insights for researchers in the field [8][9][10][11][12][13][14][15]. For instance, ref. [8] navigates the complexities of automatic surface inspection and evaluates different detection and classification methods for steel surface defects. Meanwhile, ref. [12] reviews hardware and software aspects of visual detection, categorizing detection methods into statistics, models, filtering, and machine learning, and classification methods into supervised and unsupervised learning. Luo et al. [13,14] have conducted comprehensive surveys on surface defect detection and classification methods over the past two decades, categorizing methodologies into statistical, spectral, model-based, and machine learning approaches for flat steel products. Czimmermann et al. [12] focus on visual inspection techniques for various surfaces in industrial applications, while Zheng et al. [16] delve into deep learning-based methods for surface defect inspection in various manufacturing processes. Additionally, Jin et al. [13,14] explore defect detection in steel products using traditional image processing and deep learning techniques, respectively. From these reviews, a typical visual inspection system for steel-strip surfaces can be divided into three main components: an image acquisition unit, visual processing algorithms, and a control unit.
This study presents modern visual processing methods specifically for detecting and classifying steel-strip defects, building upon previous reviews and providing an updated overview of the field's current status. Given the complex nature of detecting and classifying defects within the industrial settings of steel-strip production, efforts to enhance quality control continue, with researchers exploring diverse methodologies to achieve greater efficacy. Hence, we attempt to provide a comprehensive overview of cutting-edge techniques for detecting and classifying surface imperfections in steel, aiming to contribute to improving quality assurance measures in this domain.
The contributions of this paper are as follows:
1. We provide a comprehensive overview of existing methods for detecting and classifying defects on steel surfaces, encompassing more than 200 studies.
2. We present an analysis of the performance evaluation of various state-of-the-art algorithms for detecting and classifying defects on steel surfaces.
3. We discuss evaluation metrics commonly used in steel surface defect detection. By providing insights into these evaluation metrics, the paper aids in assessing the performance of defect detection methods.
4. We provide an overview of diverse state-of-the-art methods employed in detecting and classifying steel surface defects, emphasizing their strengths and weaknesses.

Steel and Its Surface Defect Types
Steel products can be broadly categorized into flat products and long products, as illustrated in Figure 3 [17]. Of late, hot strips, cold strips, and plates have garnered increased attention from researchers. The steel surface unavoidably accumulates various defects throughout the production process, which involves processing, casting, and rolling. These defects include scratches resulting from the relative movement between a hard, sharp object and the steel surface or between two steel surfaces such as plates and strips. Scales, primarily caused by incompletely removed impurities and greasy dirt on work rollers during temper rolling, are also common. Roll marks, attributed to poor roller shape or excessive curl, not only affect the product's appearance but also diminish key properties such as corrosion resistance, abrasion resistance, and fatigue strength, leading to significant economic losses [18]. Figure 4 showcases examples of typical steel surface defects. According to Neogi et al. [17], no universally agreed standard exists for these defects. The diverse array of steel surface products exhibits a wide range of defects, characterized by both similarities and diversity within these groups. This combination of diversity and similarity complicates the defect classification process [19].

Defect Detection Methods
Researchers have explored numerous methods to enhance the quality of steel products and facilitate automated inspection [20]. These methods can be categorized into four main groups: statistical methods, spectral methods, texture segmentation-based methods, and machine learning-based methods, as outlined in Figure 5. This section provides a concise overview of the approaches and techniques proposed for quality monitoring in steel manufacturing settings.

Statistical Models
Statistical models [21][22][23] leverage probability theory and statistical analysis to formulate mathematical models capable of quantitatively predicting, analyzing, inferring, and summarizing the spatial distribution of pixel gray values. In recent years, statistical methods have gained traction for detecting defects in steel surfaces. Table 1 compares the strengths and limitations of various statistical detection methods for steel defects. This section offers a succinct introduction to seven representative statistical approaches as follows:

Table 1. Strengths and limitations of statistical detection methods for steel surface defects.

| Method | Strengths | Limitations | Applications |
| --- | --- | --- | --- |
| Autocorrelation | Simple to utilize with repetitive textures like textiles | Difficulties in identifying nonlinear relationships; not suitable for textures with randomly arranged textural elements | [24] |
| Thresholding | Easy to understand, simple to implement | A small difference in the background can make defect detection fail | [25] |
| Co-occurrence matrix | Pixels' spatial relationship can be extracted with different statistical computations | Difficult to determine the optimal displacement | [26,27] |
| Local binary pattern | Fast discriminative feature extraction with rotation and gray-scale invariance | Noise and scale change have a significant influence; highly dependent on the gray value of the image's center point | [28][29][30][31][32] |
| Fractal model | Remains unaffected by significant geometric transformations and variations in lighting | Has limitations on images without self-similarity; unsatisfactory detection rate | [33] |
| Edge-based | Easy to extract low-order features of the image and simple to realize | Only suitable for images with low resolution; sensitive to noise | [34] |
| Histogram properties | Clarity; invariant to translation and rotation; simple calculations | Poor detection rate (less than 70%) for irregular textures | [35][36][37][38] |

Autocorrelation
The autocorrelation technique serves as a valuable feature for determining the size of tonal primitives [39]. It aims to establish correlations between textures and their translations using displacement vectors, particularly identifying vertices in cases of high regularity. Notably, this technique demonstrates exceptional robustness against lighting variations and noise. Methods relying on autocorrelation techniques find applications in analyzing textures characterized by repetitive patterns, such as textiles. The mathematical expression for one-dimensional autocorrelation is computed as follows:
$$ r(\tau) = \frac{1}{N} \sum_{t=1}^{N-\tau} x(t)\, x(t+\tau) \tag{1} $$
where $r$ represents the autocorrelation value, $N$ represents the total number of samples, $t$ is the time, $x$ is the normalized signal value, and $\tau$ is the shift value. Zhu et al. [24] combined autocorrelation and the gray-level co-occurrence matrix (GLCM) to detect yarn-dyed fabric defects; however, the results were not reported.
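To make the role of the shift value concrete, the following minimal Python sketch computes the autocorrelation of one row of gray values. It assumes the biased estimator (division by N) and zero-mean, unit-variance normalization, both of which are illustrative choices; the function name `autocorrelation` and the sample row are hypothetical.

```python
def autocorrelation(x, tau):
    """Biased sample autocorrelation of a 1-D sequence at shift tau (0 <= tau < len(x))."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("constant signal has undefined autocorrelation")
    # Normalize to zero mean and unit variance (assumed normalization).
    z = [(v - mean) / var ** 0.5 for v in x]
    return sum(z[t] * z[t + tau] for t in range(n - tau)) / n

# A row of gray values with a texture period of 4 pixels.
row = [10, 20, 30, 20] * 8
profile = [round(autocorrelation(row, tau), 3) for tau in range(9)]
```

In this sketch, the profile peaks again at a shift of 4, the texture period, and is negative at a shift of 2; a defect that breaks the periodicity would lower such peaks.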

Thresholding
The main objective of thresholding methods is to separate objects from the image background. Thresholding is one of the most common approaches to image segmentation. In steel-surface inspection systems, it is often used to separate defective regions on steel surfaces, and it has recently been widely applied in automated visual inspection [40]. Djukic et al. [25] used dynamic thresholding to discriminate true defects from random noise pixels for hot-rolled steel. However, no quantitative results were reported for that study.
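As an illustration of the basic idea, the sketch below flags pixels that are much darker than the rest of the image. The rule used here (mean minus k standard deviations, with k = 2) is an assumed, simple global threshold and is not the dynamic scheme of [25]; the function name and the toy image are hypothetical.

```python
def threshold_defects(image, k=2.0):
    """Flag pixels darker than mean - k * std of the whole image."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    std = (sum((p - mean) ** 2 for p in pixels) / n) ** 0.5
    limit = mean - k * std
    return [[1 if p < limit else 0 for p in row] for row in image]

# Bright, nearly uniform strip with two dark scratch pixels.
image = [[200, 201, 199, 200],
         [200,  40,  45, 201],
         [199, 200, 201, 200]]
mask = threshold_defects(image)
```

Because the limit is derived from global statistics, a slowly varying background can shift it enough to hide or invent defects, which is the weakness noted for thresholding in Table 1.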

Gray-Level Co-Occurrence Matrix
The gray-level co-occurrence matrix (GLCM) is a statistical method that uses spatial gray-level co-occurrence to analyze image texture. First proposed by Haralick et al. [39], it is also known as the gray-level dependency matrix. It constructs a matrix that quantifies the spatial relationship of pixels over an image. As described in [41], given an image $I$ of size $N \times N$, the co-occurrence matrix $P$ is expressed as
$$ P(i, j) = \sum_{x=1}^{N} \sum_{y=1}^{N} C(x, y) \tag{2} $$
where
$$ C(x, y) = \begin{cases} 1, & \text{if } I(x, y) = i \text{ and } I\big(x + \delta_x(d, \theta),\, y + \delta_y(d, \theta)\big) = j \\ 0, & \text{otherwise} \end{cases} \tag{3} $$
and $\delta_x(d, \theta)$ and $\delta_y(d, \theta)$ are used to calculate the position of the neighbor of $(x, y)$ at a distance $d$ and direction $\theta$.
Haralick's features can then be extracted from the gray-level co-occurrence matrix to characterize the steel surface.
Guo et al. [26] proposed a feature extraction method based on the GLCM to characterize four types of defects, including edge crack, pinch, and inclusion. This method was then used to analyze forehead wrinkles extracted from steel strips. The highest recognition rate, obtained for the pinch defect type, was 85.00%. Zaghdoudi et al. [27] presented a method for the classification of steel defects based on machine vision techniques and a support vector machine (SVM). They combined two sets of features, the GLCM and the Histogram of Oriented Gradients (HOG), to extract features from the training database. The recognition accuracy was 90.16%. Nevertheless, the computing time taken by the GLCM was long, and the process also required relatively high storage space.
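The construction of the co-occurrence matrix can be sketched in a few lines of Python. The example below counts non-symmetric co-occurrences for a single pixel offset and derives one Haralick feature (contrast); the function names, the 4-level toy image, and the choice of offset are assumptions made for illustration.

```python
def glcm(image, dx, dy, levels):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    p = [[0] * levels for _ in range(levels)]
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                p[image[y][x]][image[y2][x2]] += 1
    return p

def contrast(p):
    """Haralick contrast: (i - j)^2 weighted by the normalized counts."""
    total = sum(map(sum, p))
    return sum((i - j) ** 2 * p[i][j] / total
               for i in range(len(p)) for j in range(len(p)))

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, dx=1, dy=0, levels=4)  # d = 1, theta = 0 degrees
```

The matrix has one entry per pair of gray levels, so its size grows quadratically with the number of levels; this is one reason for the storage and computing cost mentioned above, and gray levels are often quantized first.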

Local Binary Pattern
The local binary pattern (LBP), introduced by Ojala et al. [42], stands out as a remarkably efficient and straightforward feature descriptor. It operates by relabeling each pixel in an image through a coding mechanism based on comparisons between the original pixel value and those of its neighboring pixels [43]. This approach has garnered significant attention across various applications owing to its computational simplicity and discriminative power [44]. In recent years, several researchers have employed the LBP to detect steel surface defects [45,46]. Abukhait [28] proposed an inspection system combining the histogram of the LBP's uniform patterns and GLCM textural features to construct feature descriptors for surfaces, achieving a recognition accuracy of 95.60%. Makaremi [47] and Tajerip [48] introduced a new technique based on modified local binary patterns for detecting textures and fabrics, achieving detection rates of 91.86% and 95.00%, respectively. Wang et al. [29] introduced the entity sparsity pursuit (ESP) algorithm for identifying steel surface defects, utilizing an intuitive LBP-inspired feature extractor for industrial grayscale images. Luo et al. [30] proposed a generalized completed local binary pattern framework for steel surface defect classification. Additionally, Luo et al. [31] introduced a method to enhance classification accuracy and time efficiency for existing LBP variants in steel defect classification tasks. Gyimah et al. [32] presented the RCLBP method, which combined the completed local binary pattern and an NL-means filter with wavelet thresholding to extract noise-robust features for surface defect detection. However, despite its computational simplicity, the LBP is highly sensitive to noise and scale changes.
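The coding mechanism can be illustrated with the basic 8-neighbor variant. The sketch below is a minimal example, assuming a clockwise bit order starting at the top-left neighbor and a "greater than or equal" comparison; both conventions, as well as the function name and the patch values, are illustrative choices.

```python
def lbp_code(image, y, x):
    """8-neighbour LBP code of the interior pixel at (y, x)."""
    center = image[y][x]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
brighter = [[v + 50 for v in row] for row in patch]  # uniform lighting change

code = lbp_code(patch, 1, 1)
code_brighter = lbp_code(brighter, 1, 1)
```

Both calls return the same code because only the sign of each comparison with the center pixel matters. The same property explains the dependence on the center value: noise that changes the center pixel can flip several bits at once.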

Fractal Model
The fractal model, introduced by Mandelbrot in 1983 [49], serves as a valuable tool in computer vision for recognizing and interpreting various objects. Two key metrics within the fractal model are the fractal dimension and porosity. The formula for calculating the fractal dimension is provided below:
$$ D = \lim_{r \to 0} \frac{\log N(r)}{\log (1/r)} \tag{4} $$
where $r$ represents the ruler scale, and $N$ denotes the number of scales obtained with $r$. Yazdchi et al. [33] proposed a defect detection algorithm based on multifractals, focusing on detecting surface defects in cold strips. The algorithm utilized ten features to reduce classifier complexity and successfully identified five types of defects, including pickled, annealing stain defect, emulsion rust, dirty surface, and sticking, achieving an average accuracy of 97.90%.
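In practice, the fractal dimension of a binary image is often estimated by box counting: the image is covered with boxes of side r, the occupied boxes are counted, and the slope of log N(r) against log(1/r) is fitted. The following sketch assumes a square binary mask and power-of-two box sizes; the function names and test shapes are hypothetical, and this is not the multifractal procedure of [33].

```python
import math

def box_count(mask, size):
    """Number of size x size boxes containing at least one set pixel (square mask)."""
    n = len(mask)
    return sum(
        any(mask[y][x]
            for y in range(by, min(by + size, n))
            for x in range(bx, min(bx + size, n)))
        for by in range(0, n, size)
        for bx in range(0, n, size))

def fractal_dimension(mask, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(r) against log(1/r)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(mask, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

n = 16
line = [[1 if x == y else 0 for x in range(n)] for y in range(n)]
filled = [[1] * n for _ in range(n)]
```

A thin line yields a dimension close to 1 and a filled region a dimension close to 2; irregular defect boundaries typically fall in between, which is what makes the measure usable as a texture feature.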

Edge-Based Features
Edge detection encompasses a collection of mathematical techniques designed to pinpoint locations in a digital image where there are abrupt changes in brightness [50]. These techniques are utilized to delineate regions within the image based on color rendition. Wen and Xia [34] introduced a method to discern and evaluate candidate edges against predetermined conditions for leather surfaces.

Histogram Properties
Histograms provide a visual representation of the distribution of data values within an image. Commonly used histogram statistics include the geometric mean, standard deviation, range, harmonic mean, variance, and median. These properties offer advantages such as low computational cost and invariance to translation, rotation, and scale, particularly in applications involving thresholding and segmentation of grayscale images, color-based image classification, and image retrieval. Kobayashi [51] utilized histograms as feature extractors for classifying defects on steel surfaces. Similarly, Kholief et al. [35] employed a method based on two statistical features derived from histograms and edge detection for detecting defects on steel surfaces. Furthermore, various studies have implemented histogram properties for defect detection on surfaces composed of different materials [36,37]. Although they are low-level processes, these methods have proven effective in defect detection.

Spectral Methods
Spectral methods are suited to images with a uniform structure, such as periodically generated fabric patterns; they are not suitable for non-periodic or random textures. With spectral methods, defect objects can be separated more easily from both the global and local backgrounds. Table 2 compares the advantages and disadvantages of several spectral methods. Eight representative spectral methods are briefly introduced below.

Fourier Transform
The Fourier transform is a mathematical operation used to decompose a signal or a function of time into its constituent frequencies. In image processing, the input signal is typically represented in the spatial domain $(x, y)$. The Fourier transform of an image is expressed by the following equation, where $u$ and $v$ denote frequency variables ranging from 0 to $N - 1$ and 0 to $M - 1$, respectively:
$$ F(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} f(x, y)\, e^{-j 2\pi \left( \frac{ux}{N} + \frac{vy}{M} \right)} \tag{5} $$
where $f(x, y)$ denotes the gray-level intensity of the pixel at position $(x, y)$, and $N$ and $M$ represent the width and height of the image, respectively. Yazdchi et al. [33] proposed a defect detection method based on a multifractal analysis for steel surfaces. In that approach, Fourier analysis was employed to enhance the image, followed by the use of the multifractal dimension to isolate the defective region from the image and specify its location. The achieved accuracy was 97.90%. However, Fourier transform-based approaches may struggle to avoid interference between the frequency-domain components related to defects and those related to the background.
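The decomposition into frequencies can be illustrated with a direct, unoptimized implementation of the two-dimensional discrete Fourier transform. The sketch below assumes the unnormalized forward transform and is written for clarity, not speed; the function name and the striped test image are hypothetical, and production code would use a fast Fourier transform library.

```python
import cmath

def dft2(f):
    """Naive 2-D DFT of an image f[y][x]; the result is indexed as F[v][u]."""
    m, n = len(f), len(f[0])  # M rows (height), N columns (width)
    return [[sum(f[y][x] * cmath.exp(-2j * cmath.pi * (u * x / n + v * y / m))
                 for y in range(m) for x in range(n))
             for u in range(n)]
            for v in range(m)]

# Vertical stripes with a period of 2 pixels.
image = [[1, 0, 1, 0]] * 4
spectrum = dft2(image)
```

For this periodic pattern, all energy is concentrated at u = 0 (the mean) and u = N/2 (the stripe frequency), both at v = 0. A localized defect, by contrast, spreads over many frequencies, which is why its components can overlap with those of the background.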

Wavelet Transform
The wavelet transform emerged as an alternative to the Fourier transform to address potential resolution issues. Unlike the Fourier transform, which decomposes the signal into cosines and sines, the wavelet transform utilizes functions that are localized in both the real and Fourier space [52]. Aarthi et al. [53] employed the discrete wavelet transform to isolate defects by applying a threshold to the transformed image and extracting different statistical features. Imperfections were likewise identified by thresholding the transformed image so that they could be excluded. Selvi [54] introduced wavelet and lifting schemes combined with co-occurrence features to distinguish between normal and defective steel surfaces. Wavelet filters were utilized for noise removal, while co-occurrence features were extracted for further analysis. However, methods based on wavelet transforms can be susceptible to feature correlations between scales.
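The idea of thresholding wavelet coefficients can be sketched with the simplest wavelet, the Haar wavelet, applied to a single image row. The example below assumes one decomposition level and an unnormalized averaging form of the transform; the function names, the sample row, and the limit of 5 gray levels are hypothetical and do not reproduce the method of [53].

```python
def haar_step(signal):
    """One level of the (unnormalized) Haar transform: pairwise averages and details."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail

def detail_outliers(signal, limit):
    """Indices of pixel pairs whose detail coefficient exceeds the limit."""
    _, detail = haar_step(signal)
    return [i for i, d in enumerate(detail) if abs(d) > limit]

# One image row with a dark defect pixel at position 5.
row = [100, 100, 101, 99, 100, 40, 100, 100]
approx, detail = haar_step(row)
suspects = detail_outliers(row, limit=5)
```

Unlike a Fourier coefficient, each detail coefficient refers to a specific position in the row, so a large value both signals and locates the abrupt change.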

Gabor Filter
The Gabor filter is a linear filter widely employed in image processing tasks, including texture analysis, edge detection, and feature extraction. Considering an image $I$, the two-dimensional Gabor residuals $r(x, y)$ are obtained by convolving $I(x, y)$ with the Gabor function $g(x, y)$ [55]:
$$ r(x, y) = \iint_{\Omega} I(\xi, \eta)\, g(x - \xi, y - \eta)\, d\xi\, d\eta \tag{6} $$
where $g(x, y)$ is defined as
$$ g(x, y) = \exp\left( -\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2} \right) \cos\left( 2\pi \frac{x'}{\lambda} + \phi \right) \tag{7} $$
with $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$, and where $\lambda$, $\theta$, $\phi$, $\sigma$, and $\gamma$ represent the wavelength of the Gabor function cosine factor, the orientation of the normal to the parallel stripes of the Gabor function, the phase offset of the Gabor function cosine factor, the standard deviation of the Gaussian factor, and the ellipticity of the Gaussian factor, respectively. The Gabor filter has found wide application in detecting defects on steel surfaces, as evidenced by several studies [56][57][58]. Choi et al. [59] proposed combining morphological features and Gabor filtering to detect pinholes on steel slab surfaces. Similarly, Wankhede [60] used Gabor filters to automatically inspect defects on textured surfaces of metal sheets. Medina et al. [61] developed a method employing Gabor filters to detect defects on flat products in a flat-steel cutting factory, achieving a detection rate of 96.61%. These studies highlight the effectiveness of Gabor filtering in characterizing distinctive texture patterns and its utility in detecting defects on steel surfaces.
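A discrete Gabor kernel can be obtained by sampling the real-valued Gabor function (a Gaussian envelope multiplied by a cosine carrier) on a small grid. The sketch below assumes that standard real form, an odd kernel size, and a grid centered at the origin; the function name and the parameter values are chosen purely for illustration.

```python
import math

def gabor_kernel(size, lam, theta, phi, sigma, gamma):
    """Sample the real Gabor function on a size x size grid centred at 0 (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / lam + phi))
        kernel.append(row)
    return kernel

k = gabor_kernel(size=5, lam=4.0, theta=0.0, phi=0.0, sigma=2.0, gamma=0.5)
```

Convolving an image with a bank of such kernels at several orientations and wavelengths gives one response map per kernel. Choosing these parameters well for a given surface texture is the main practical difficulty.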

Optimized FIR Filters
Finite impulse response (FIR) filters provide remarkable feature separation between non-defective and defective regions in FIR-filtered frames [62]. Kumar [63] introduced a method utilizing FIR filters for defect detection in fabric, yielding significant results within the textile industry. Moreover, Jeon et al. [64] introduced a new filtering technique that utilized lighting methods for identifying different shapes of defects on steel surfaces.
Table 2 entry (limitations): the best filter parameters are hard to determine, and spatial orientation is difficult to maintain [80][81][82][83][84].

Multiscale Geometric Analysis
Multiscale geometric analysis (MGA) offers a versatile approach for detecting edge features and distinguishing lines and surface singularities arising from the finite separable wavelet directions [10]. Ai et al. [66] introduced a novel feature extraction technique leveraging kernel locality-preserving projections (KLPP) and the curvelet transform to detect surface longitudinal cracks in slabs. A sample set classification was performed using an SVM, resulting in a classification rate of 91.89%. However, effectively distinguishing active background textures proved challenging, as they were often mistaken for defective edges, necessitating further investigation into this issue.

Hough Transform
The Hough transform is a feature extraction method that uses a voting system to identify approximate instances of objects belonging to a specific class of shapes [85]. Invented by Hough in 1962 for detecting intricate lines in images [86], it aids in separating features of a specific shape within an image, which is especially useful for generating a comprehensive description of features across the entire image. The Hough transform has found applications in various fields, including vehicle license plate recognition [87] and fingerprint identification [88,89]. Sharifzadeh et al. [73] proposed a Hough transform detection method to detect defects such as scratches, holes, coil breaks, and rust on cold-rolled steel strips. However, the detection rate achieved remained below 90.00%.
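The voting system can be sketched for the classical case of straight lines, where each edge pixel votes for every (theta, rho) pair satisfying rho = x cos(theta) + y sin(theta). The example below assumes an angular resolution of one degree and integer rounding of rho; the function name and the synthetic scratch are hypothetical and unrelated to the setup in [73].

```python
import math

def hough_lines(points, n_theta=180):
    """Accumulate votes in (theta index, rho) space for a list of (x, y) points."""
    votes = {}
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(k, rho)] = votes.get((k, rho), 0) + 1
    return votes

# Pixels of a horizontal scratch at y = 3 plus two isolated noise pixels.
points = [(x, 3) for x in range(50)] + [(7, 20), (30, 41)]
votes = hough_lines(points)
(best_k, best_rho), count = max(votes.items(), key=lambda kv: kv[1])
```

The accumulator cell with the most votes corresponds to theta = 90 degrees and rho = 3, that is, the scratch itself. The two noise pixels contribute votes elsewhere and do not change the maximum.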

Morphological Operations
Morphological operations have been used to enhance images with respect to appearance, shape, and organization. They have also been used in steel-strip analysis [90]. Several studies have explored the combination of mathematical morphology with genetic algorithms to develop an algorithm capable of detecting defects on steel surfaces [91,92]. Using image processing techniques, a defect detection algorithm was developed by Joen [93] to perform corner crack detection on the surfaces of billets. Landstr et al. [76] proposed an online inspection method that used mathematical morphology to reduce the effect of noise and perform edge detection on steel surfaces. Tang et al. [77] used mathematical morphology to detect steel surface defects. Additionally, Yazdchi et al. [78] employed local entropy in combination with morphology to identify defect positions in cold-rolling mill steel. In [94], an algorithm for defect detection on steel billets was introduced, leveraging mathematical morphology to recognize both defects and pseudo-defects. Other papers on morphological methods for defect detection are presented in [95][96][97][98]. Nevertheless, these techniques depend on a structuring element that traverses the pixels of an image to gather image information. Consequently, the computational cost must be carefully considered, especially in online applications for inspecting steel-strip surfaces.
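The role of the structuring element can be illustrated with binary erosion, dilation, and their composition (opening). The sketch below assumes a fixed 3 x 3 square structuring element and treats pixels outside the image as background; the function names and the toy mask are hypothetical.

```python
def erode(mask):
    """Binary erosion with a 3x3 square structuring element."""
    rows, cols = len(mask), len(mask[0])
    return [[int(all(0 <= y + dy < rows and 0 <= x + dx < cols
                     and mask[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(cols)] for y in range(rows)]

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    rows, cols = len(mask), len(mask[0])
    return [[int(any(0 <= y + dy < rows and 0 <= x + dx < cols
                     and mask[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(cols)] for y in range(rows)]

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(mask))

mask = [[0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
        [0, 1, 1, 1, 0, 1],  # the isolated pixel at the right edge is noise
        [0, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0]]
cleaned = opening(mask)
```

Each output pixel requires visiting every position of the structuring element, so the cost grows with both the image size and the element size, which is the computational concern raised above for online inspection.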

Frequency-Domain Analysis
Frequency-domain analysis involves transforming an image into the frequency domain using a mathematical transformation technique, which can help address limitations associated with spatial filtering methods. Wu [74] and Wu et al. [40] introduced two online surface inspection methods for cold- and hot-rolled strips. Additionally, Wu et al. [75] presented a method for detecting defects in hot-rolled strips, utilizing the fast Fourier transform as a feature extractor and combining genetic algorithms and neural networks for steel surface defect recognition. The achieved performance using frequency-domain features was 92.92%.

Spatial Filter
The spatial filtering technique involves determining the value of a pixel at a given coordinate based on the original value of that pixel and the original values of its neighboring pixels [99]. These techniques are broadly classified into two categories: linear filtering operations and nonlinear filtering operations, both of which are integral to image analysis. Gradient filters, for instance, are employed to detect edges, lines, and isolated points. Guan [80] introduced an algorithm based on saliency map construction, which detected defects on steel surface products by performing a Gaussian pyramid decomposition on the discrete frequency information of steel surfaces. Additionally, Alkapov et al. [81] developed a prototype for automatic visible defect detection and classification in metallurgical plants, utilizing the Sobel operator to identify gradients along two axes, achieving an accuracy of 98.10%.
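As a concrete instance of a linear gradient filter, the sketch below applies the standard 3 x 3 Sobel kernels along both axes and combines the two responses. It assumes the |Gx| + |Gy| approximation of the gradient magnitude and leaves border pixels at zero; the function name and the test image are hypothetical and do not reproduce the prototype of [81].

```python
def sobel_magnitude(image):
    """Gradient magnitude |Gx| + |Gy| at interior pixels (border left at 0)."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = sum(kx[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)
    return out

# Dark region on the left, bright region on the right: one vertical edge.
image = [[10, 10, 10, 90, 90]] * 4
edges = sobel_magnitude(image)
```

The response is zero in the flat regions and large on the two columns adjacent to the intensity step, so a subsequent threshold on the magnitude would isolate the edge.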

Texture Segmentation-Based Methods
Spectral-based methods inherently lack local information and struggle to effectively represent various defects, while statistical-based methods tend to be more noise-sensitive. Table 3 provides an overview of the strengths and limitations of model-based methods for detecting steel surface defects.

Markov Random Field Model
The Markov random field (MRF) model, initially used as a texture model by Cross et al. [109], primarily captures the statistical properties of images through non-directional graphs. This model finds applications across various problems in image processing [110]. The Markov random field equation is defined as
$$ P(W \mid S) = \frac{P(S \mid W)\, P(W)}{P(S)} \tag{8} $$
where $P(S \mid W)$ represents the conditional probability distribution based on observations, $P(S)$ denotes the fixed value based on observed values, and $P(W)$ signifies the prior probability. Unsalan [100] applied the GLCM and MRF as texture analysis methods to detect and classify six types of steel surfaces using the k-nearest neighbors (KNN) classifier, with the MRF achieving the higher classification rate of 91.36%. Ozdemir [101] investigated a novel method based on Karhunen-Loeve transforms and a model-based approach with an MRF as the texture model for textile fabric defect detection.

Autoregressive Model
The autoregressive (AR) model discerns correlations between values within a time series and their preceding and succeeding values, enabling the linear prediction of future behavior based on past observations [29]. This method offers notable time savings compared to nonlinear approaches [111]. Hajimowlana et al. [102] introduced a one-dimensional autoregressive method for texture modeling and defect detection in web inspection systems, although its application to two-dimensional textures was limited. Basu and Lin [103] investigated using the AR process for tree texture modeling, employing a multi-scale AR texture model for fabric samples. Serafim [104,105] utilized multiresolution pyramids with 2D autoregressive models for leather defect segmentation in natural images, albeit without providing quantitative results.

Weibull Model
The Weibull distribution, commonly employed in life data analysis, offers distinct advantages in detecting defects that may be challenging to identify with an MRF. Its comprehensive descriptive capabilities regarding texture contrast, shape characteristics, and scale contribute to its efficacy [112]. Timma et al. [106] introduced a novel approach to defect detection in textures, leveraging two distinct features. Liu et al. [107] presented a method based on an unsupervised approach. Their study proposed a novel HWV model for defect detection on steel surfaces, achieving a detection rate of 96.20%. However, the Weibull distribution may struggle to detect defects in samples characterized by low contrast or gradual intensity variations.

Active Contour Model
The active contour model, also known as snakes, is a computer vision framework introduced by Kass et al. [113], designed to delineate object outlines from potentially noisy two-dimensional images. This model finds widespread application in domains including segmentation, shape recognition, object tracking, and edge detection, addressing various segmentation challenges [108]. Notably, it has been observed that this method is adept at detecting nearly all microdefects without generating false objects, even amidst cluttered backgrounds. However, the absence of constraints makes determining the convergence position challenging for the active contour model.

Artificial Neural Networks
Artificial neural networks (ANNs) represent an information processing model inspired by the functionality of biological nervous systems, such as the brain or central nervous system [114]. The multilayer perceptron (MLP) is widely used among various ANN architectures. A simple MLP architecture comprises one input layer, one output layer, and one hidden layer, as illustrated in Figure 6. The classifier based on an MLP for defect classification in steel surfaces is structured as follows: the input layer's size corresponds to the number of components ($m$) in the feature vectors representing the steel surface, $X = (x_i)_{i=1,2,\ldots,m}$, while the output layer's size represents the number of defect types to be detected, $Y = (y_i)_{i=1,2,\ldots,n}$. The size and number of hidden layers depend on the specific application. Equations (9) and (10) are employed to compute the value of the $i$th neuron in the $l$th layer, $u_i^l$, by summing the products of values from the previous layer $l - 1$ and their corresponding weight parameters $W = (W_1, W_2, \ldots, W_L)$, where $W_i = (w_{i1}, w_{i2}, \ldots, w_{iS_i})$, and $S_i$ represents the size of the $i$th layer.
where $w_{i0}$ denotes the bias term, $S_l$ represents the size of the $l$th layer, and $S_L = n$.
In Equation (11), the activation function $g$ compresses outputs to the range from zero to one, yielding probabilities that a given feature vector for a steel surface belongs to a particular class.
Equation (11) determines the predicted class label for each steel surface input feature vector $X$. Training the MLP classifier involves obtaining predicted class labels, $v_1(X_k, W), v_2(X_k, W), \ldots, v_n(X_k, W)$, based on randomly initialized weight parameters $W$ for steel surface feature vector $X_k$, given a training sample representation $(X_k, Y_k)$, $k = 1, 2, \ldots, N$, where $Y_k$ denotes the class of $X_k$, and $N$ is the training sample size.
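The forward computation of such a classifier can be sketched as follows: each neuron sums the weighted outputs of the previous layer plus a bias, applies the activation function, and the class with the largest output activation is predicted. The sketch assumes a logistic sigmoid for g and an argmax decision rule, and it uses made-up weights for a hypothetical network with two features, two hidden units, and two classes; training is not shown.

```python
import math

def sigmoid(u):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def forward(x, layers):
    """Propagate feature vector x through a list of (weights, biases) layers."""
    v = x
    for weights, biases in layers:
        # u_i = sum_j w_ij * v_j + w_i0, then v_i = g(u_i)
        v = [sigmoid(sum(w * a for w, a in zip(row, v)) + b)
             for row, b in zip(weights, biases)]
    return v

def predict(x, layers):
    """Index of the output neuron with the largest activation."""
    out = forward(x, layers)
    return max(range(len(out)), key=out.__getitem__)

# Hypothetical network with made-up weights (not trained on any data).
layers = [
    ([[4.0, -4.0], [-4.0, 4.0]], [0.0, 0.0]),  # hidden layer
    ([[3.0, -3.0], [-3.0, 3.0]], [0.0, 0.0]),  # output layer
]
label_a = predict([1.0, 0.0], layers)
label_b = predict([0.0, 1.0], layers)
```

In a real system, the input would be a texture feature vector (for example, GLCM or LBP features) and the weights would be fitted by backpropagation on labeled defect samples.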
Several attempts have been made to utilize ANNs to control steel surface quality. Zhang [115] proposed an improved backpropagation (BP) algorithm for quality inspection of cold-rolled strips, emphasizing a modification that accelerated convergence. Caleb [116] discussed two adaptive computing techniques, based on supervised and unsupervised learning, to establish a foundation for building a reliable decision support system for classifying steel surface defects. Redmon [117] introduced YOLO9000 for detecting over 9000 object categories. Li et al. [118] proposed a system to detect surface defects in cold-rolled steel, achieving an accuracy of 98.57%.

Moving Center Hypersphere
The Moving Center Hypersphere (MCH) technique, introduced as a method for sample compression [119], operates on the principle of using hyperspheres to represent clusters of points, thereby approximating each sample with a number of hyperspheres. In contrast to traditional methods such as KNN and neural networks, which treat patterns from one class as a set of points, MCH methods represent the patterns of each class in n-dimensional space using a series of n hyperspheres [120]. Chu et al. [121] introduced a novel multiclass classification technique, quantile hypersphere based on machine learning (QH-ML), to identify six types of steel surface defects. However, determining the optimal parameters for the MCH presents challenges. Recently, hyperspheres have gained attention from researchers for detecting defects on steel surfaces [122].

Sparse Coding
Sparse coding refers to a set of algorithms designed to learn a useful sparse representation of given data. These algorithms require only input data to acquire the sparse representation, which makes them particularly useful: they can directly process raw data and automatically uncover the representation without sacrificing essential information. Sparse coding therefore operates as an unsupervised learning method. In its standard form, sparse coding expresses the input as a weighted combination of basis vectors:

$$x = \sum_{i=1}^{k} a_i \phi_i \qquad (12)$$

where $x$ denotes the input vector, $a_i$ represents the weight of the $i$-th basis vector $\phi_i$, and most of the weights are zero.
Huangpeng et al. [123] introduced a method for defect classification in images using transfer learning and sparse coding, aiming to enhance the accuracy of defect classification. Zhou et al. [124] proposed a class-specific and shared-dictionary learning approach to achieve sparse representation, facilitating the classification of surface defects on steel sheets. Furthermore, Liu et al. [125] presented a novel sparse-coding model for object recognition and object feature representation. They also incorporated a flexible data selection mechanism within the photo-receptor layer to enhance the speed and accuracy of detecting defects on steel plate surfaces. Despite these advancements, the computation time remains a limiting factor for real-time defect detection.
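As an illustration of how sparse weights can be computed for a fixed set of basis vectors, the following sketch uses the iterative shrinkage-thresholding algorithm (ISTA), a common solver for the L1-penalized reconstruction problem. The choice of ISTA, the function names, and the default penalty and step size are assumptions made for this example; the surveyed works may use other solvers and also learn the basis itself.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero by the threshold (proximal step of the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reconstruct(basis, weights):
    """Compute the weighted sum of the basis vectors."""
    dim = len(basis[0])
    return [sum(a * phi[d] for a, phi in zip(weights, basis)) for d in range(dim)]


def sparse_code(x, basis, penalty=0.1, step=0.5, iterations=200):
    """ISTA: a gradient step on the squared reconstruction error, then soft-thresholding.

    The step size must be small enough for the given basis (assumed here, not checked).
    """
    weights = [0.0] * len(basis)
    for _ in range(iterations):
        residual = [xi - ri for xi, ri in zip(x, reconstruct(basis, weights))]
        for i, phi in enumerate(basis):
            gradient = sum(p * r for p, r in zip(phi, residual))
            weights[i] = soft_threshold(weights[i] + step * gradient, step * penalty)
    return weights
```

With an orthonormal basis, the result reduces to soft-thresholding the projections of the input, so small components are set exactly to zero while large ones are kept with a slight shrinkage.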

Deep Learning-Based Steel Surface Defect Detection Methods
Deep learning is a specific subset of machine learning techniques concerned with algorithms inspired by the structure and function of the brain. It utilizes multiple layers of artificial neural networks for the machine learning process. Some of the popular deep learning models, like CNNs, generative adversarial networks (GANs), and convolutional autoencoders (CAEs), are widely used in the process of extracting features from images of steel surfaces. This section provides an overview of current research on detecting defects on steel surfaces. The methods employed for this task using deep learning techniques can be categorized into three main approaches: supervised, semi-supervised, and unsupervised methods.

Supervised
Supervised deep learning methods leverage labeled training data to learn complex patterns and characteristics of steel surface defects. By providing the model with pairs of input images and corresponding defect annotations, supervised learning enables the neural network to learn discriminative features and make accurate predictions automatically. Combined with sophisticated neural network architectures, these methods pave the way for more reliable quality control systems in industrial settings. To enhance defect detection accuracy, Zhao et al. [126] integrated a feature pyramid network into the You Only Look Once version 4 (YOLOv4) architecture, achieving an average detection accuracy of 92.50%. Semantic segmentation-based methods have also been explored, with Zhou et al. [127] utilizing semantic segmentation for steel defect detection and Dong et al. [128] introducing the PGA-Net method for pixel-wise defect detection, achieving an accuracy of 82.25%. Gao et al. [129] proposed a hierarchical training CNN with feature alignment to improve the recognition accuracy of steel surface defects. Recent advancements include transfer learning-based approaches, such as Abu et al. [130] using MobileNet, ResNet, and Visual Geometry Group (VGG) models for steel defect detection, achieving an 80.41% detection rate with MobileNet. Additionally, Kateb et al. [131] utilized a pre-trained ResNet-50 network for steel defect classification. Furthermore, Chen et al. [132] proposed an aluminum profile surface defect detection method based on a deep self-attention mechanism, achieving an accuracy of 98.70% using a ResNet model. Litvintseva et al. [133] compared E-Net, DeepLabV3, and U-Net models for metal surface defect recognition, with DeepLabV3 achieving the highest accuracy. Fadli et al. [134] employed VGG-16 and VGG-19 models for image recognition of steel surface defects, achieving performance values of 97.20% and 93.30%, respectively. Gao et al. [135] proposed a lightweight inspection network for multi-class steel plate surface defect detection. Lian et al. [136] utilized an adversarial network to generate numerous exaggerated defect samples, thereby improving the classification accuracy for identifying tiny flaws within single images.

Unsupervised
Unsupervised deep learning methods do not require labeled training data, which makes them advantageous when such data are scarce or costly to acquire. Instead, these techniques use the structure and recurring patterns within the data to obtain meaningful representations through self-organization or clustering. Mujeeb et al. [137] introduced a method based on unsupervised learning utilizing a deep autoencoder network for detecting surface-level defects. Meanwhile, Zhao et al. [138] proposed a combined approach integrating an autoencoder (AE), a generative adversarial network (GAN), and LBP to detect defects on textured surfaces. Notably, labeled samples were not required in these approaches due to their unsupervised nature. Mei et al. [139] proposed an unsupervised learning method involving the construction of a convolutional denoising autoencoder architecture based on a Gaussian pyramid to discern defective and defect-free regions. Additionally, Youkachen et al. [140] presented a model based on unsupervised learning using CAE and image processing for defect segmentation in various steel surface forms.

Semi-Supervised
Semi-supervised learning techniques combine the benefits of supervised learning, which uses labeled training data, with the scalability and flexibility of unsupervised learning, which utilizes unlabeled data to acquire meaningful representations. Yiping et al. [141] proposed a semi-supervised learning technique employing a CNN enhanced by pseudo-labeling to recognize defects on steel surfaces. Meanwhile, Zhang et al. [142] developed a semi-supervised generative adversarial network (SSGAN) for image defect detection. The SSGAN comprised two key sub-networks: a segmentation network and a fully convolutional discriminator (FCD) network. The segmentation network employed a dual attention mechanism to precisely segment defects from both labeled and unlabeled images. In contrast, the FCD network utilized adversarial and cross-entropy loss functions to generate confidence density maps for unlabeled images in a semi-supervised learning fashion. Additionally, Zheng et al. [143] introduced a deep learning method based on a generic semi-supervised approach, which required only a small quantity of labeled data for surface defect inspection. Table 4 summarizes the strengths and weaknesses of various machine learning methods for detecting steel defects.
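Pseudo-labeling, mentioned above as one way to exploit unlabeled images, can be sketched as follows. This is a generic illustration, not the procedure of [141]: the function name, the confidence threshold of 0.9, and the input format (one list of class probabilities per unlabeled sample) are assumptions.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample index, predicted class) pairs for confident predictions only.

    `probabilities` holds one list of class probabilities per unlabeled sample,
    as produced by a classifier trained on the labeled subset.
    """
    selected = []
    for index, class_probs in enumerate(probabilities):
        label = max(range(len(class_probs)), key=class_probs.__getitem__)
        if class_probs[label] >= threshold:
            selected.append((index, label))
    return selected
```

The selected samples would then be added to the training set with their predicted classes as labels, and the classifier retrained; samples below the threshold remain unlabeled.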

Comparison of Some Defect Detection Methods
Many studies on the detection of steel surface defects have been conducted in the literature, and some defect detection methods are listed in Table 5. This table focuses on the detection methods, their relevant references, the type of steel used, the types of defects, the size of the dataset, the reported accuracy of detection, and the advantages and limitations of each method.

Defects Classification Methods
The classification of steel surface defects is the process of categorizing these defects into various classes, such as crazing, patches, scratches, etc., either automatically or manually. Various factors, including grease, dirt, impurities, temperature changes, high humidity, and more, can cause defects on the steel surface. These defects can cover only small regions of the image, a comparatively larger area of the image, or the whole image. Different descriptors are used to describe these defects, requiring different feature extraction techniques and classification methods. Steel surface defect classification aims to ensure the quality of steel products by accurately identifying the type of defect. This requires high efficiency and accuracy from the classification methods, making it challenging for researchers. Three main categories of classifiers are used for this task: supervised, unsupervised, and semi-supervised.

Supervised Classifiers
In supervised classification, the user specifies which pixels in an image represent certain classes by labeling the data. The labeled steel surface images therefore tell the model which image includes a scratch, hole, scarring, etc. When presented with a new steel surface image, the model uses what it has learned from the labeled training data to predict the class to which the new item belongs. Various supervised classification methods have been proposed, including KNN, ANN, SVM, discriminant function (DF), and fuzzy logic (FL).

K-nearest Neighbors Classifiers
The k-nearest neighbors classifier (K-NN) is a machine learning algorithm that operates based on the distance between observations. In the K-NN method, an object is classified by a majority vote among its neighbors. Specifically, the object is assigned to the most common class among its k nearest neighbors. During the training stage, the algorithm stores feature values and target vectors of training data instances [167]. Various researchers in the steel surface defect inspection field have employed the k-nearest neighbors classifier [65,168]. For instance, in their work [165], Boudiaf et al. developed an automatic system to detect surface defects in hot-rolled flat steel. Their method used the HOG for image feature extraction and a k-nearest neighbors classifier for defect classification, achieving a recognition accuracy of 91.12%. Similarly, Cem et al. [100] utilized the nearest neighbor classifier to classify grades on steel surfaces. However, that method failed to meet real-time requirements.
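The majority-vote rule described above can be sketched in a few lines. The example assumes Euclidean distance and uses hypothetical two-dimensional feature vectors; real systems would use descriptors such as HOG features.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Assign the query to the most common class among its k nearest training samples."""
    # "Training" is just storage: distances are computed at prediction time.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

Since every prediction scans the stored training set, the cost grows with the number of training samples, which is one reason such classifiers can struggle with real-time requirements.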

Artificial Neural Network
An artificial neural network (ANN) is an information processing system designed to emulate the functioning principles of the human brain [114]. Neural networks consist of interconnected neurons capable of performing complex computations. In the structure of an artificial neural cell, inputs represent external data, while weights indicate the significance of incoming data and their impact on the cell. The transfer function calculates the net input to a cell, with the weighted sum function being commonly used. The activation function processes the net input and determines the output generated by the cell. Various activation functions, such as sigmoid, threshold, and hyperbolic tangent, can be employed. The output is the value sent to the external world or to another cell. ANNs can mainly be categorized into feedforward neural networks (FFNNs) and feedback neural networks (FBNNs) [169]. FFNNs are made of interconnected layers in which information flows only from input to output; examples are the single-layer perceptron and the multilayer perceptron, which are commonly trained with backpropagation, where the error is propagated backward through the network to adjust the weights and thereby optimize the network's performance. FBNNs contain feedback connections, so information can also flow from later units back to earlier ones; examples are Kohonen's self-organizing map and recurrent neural networks (RNNs). ANNs find wide applications across domains like system diagnostics, pattern recognition, robotics, nonlinear control, and signal processing. Khalifa et al. [35] proposed a method for detecting and classifying surface defects on hot-rolled steel strips. The approach involved using histogram and edge detection for feature extraction while employing two classifiers for defect classification: an ANN as a supervised classifier and a deep autoencoder network (DAN) as an unsupervised classifier. Sarma et al. [170] proposed a method for detecting surface defects on hot-rolled steel sheets, employing a three-level 2D Haar wavelet transform for texture feature extraction and training an ANN classifier. The approach achieved promising results when tested on 45 defect-free and 55 defective images. Similarly, Popat et al. [171] established general guidelines for developing a neural network model for automatically detecting and classifying defects in hot strips, achieving a classification accuracy of 98.75%. In the steel surface defect inspection realm, ANNs have been widely utilized by various researchers [172,173].
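The computation performed by a single artificial neural cell, a weighted-sum transfer function followed by an activation function, can be sketched as below. The function names and the optional bias term are choices made for this illustration.

```python
import math


def net_input(inputs, weights, bias=0.0):
    """Transfer function: weighted sum of the inputs plus an optional bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def sigmoid(z):
    """Smooth activation mapping the net input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def threshold(z):
    """Step activation: the cell fires (1.0) when the net input is non-negative."""
    return 1.0 if z >= 0.0 else 0.0


def neuron_output(inputs, weights, bias=0.0, activation=sigmoid):
    """Output of one cell: the activation function applied to the net input."""
    return activation(net_input(inputs, weights, bias))
```

For example, `neuron_output([1.0, 2.0], [0.5, -0.25])` has a net input of zero, so the sigmoid activation returns 0.5; passing `activation=math.tanh` selects the hyperbolic tangent instead.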

Support Vector Machines
Support vector machines (SVMs), also known as support vector networks, are supervised machine learning algorithms capable of classification and regression tasks [174]. The SVM model was introduced in 1995 by Vapnik et al. [175]. Deep learning models often face the challenge of overfitting due to the many parameters involved. SVMs offer a solution to this issue. Their promising empirical performance and appealing characteristics have attracted significant attention from researchers. In recent years, SVMs have been widely employed in classifying steel surface defects, yielding excellent performance [176-178]. An adaptive classifier with a Bayes kernel (BYEC) was proposed in [179]. That approach involved the introduction of five types of features for the loss problem, followed by the proposal of a series of SVMs. Subsequently, a Bayes classifier was trained as an evolutionary kernel to fuse the results from the base SVMs. Additionally, in [180], an evolutionary classifier with a Bayes kernel (BTEC) was developed for classifying steel surface defects. Furthermore, twin support vector machines with multi-information (MTSVMs) were introduced in [181] for classifying defects in steel surfaces. Chu proposed a multi-class classifier called ELSGTWSVM in [182] for classifying surface defects in strip steel. The classifier was designed to categorize six classes of strip steel surface defects, including cracks, scarring, holes, wrinkles, scratches, and scales. Moreover, Hu et al. [183] proposed an SVM-based model to categorize surface defects on steel strips using grayscale, geometric, and shape features extracted by combining the defect target image and its corresponding binary image. Additionally, Song et al. [184] applied a scattering operator to extract features for defect recognition, enhancing the tolerance of local deformations. The improved scattering convolution network achieved a best average recognition accuracy of 98.60% using the SVM classifier. Furthermore, Hu et al. [185] constructed a classification model based on a hybrid chromosome genetic algorithm to classify surface defects on steel strips using shape, geometric, grayscale, and texture features extracted from defect target images and their corresponding binary images. The SVM classifier achieved an average prediction accuracy of 95.04%.

Discriminant Function
The discriminant function (DF) is a classification method used to discriminate between two or more naturally occurring groups [186,187]. The discriminant function procedure can be divided into two phases: first, it tests the significance of a set of discriminant functions; then, it performs classification [188]. In general, discriminant analysis involves two key steps: identifying the variables that discriminate between groups, and accurately classifying cases into the different classes. It has been applied widely in many fields. Two notable studies on the theory behind discriminant functions have been conducted [1,2]; they are valuable resources for researchers interested in discriminant function theory. Weiwei et al. [189] demonstrated that the discriminant function model was appropriate for feature components extracted from steel surface defects. Cord [190] described a classification approach based on textural information for metallic surfaces exhibiting intricate random patterns.

Fuzzy Logic
The concept of fuzzy logic, pioneered by Zadeh in 1965 [191,192], offers a unique approach to problem-solving by incorporating linguistic expressions rather than solely relying on numerical values. Neural networks, with their capacity to process intricate and imprecise data, excel at detecting and extracting patterns that may be too intricate for conventional systems. Shitole et al. [193] developed a neural-fuzzy classifier specifically for detecting, classifying, and interpreting weld defects. This classifier was compared against two other classification methods: a fuzzy logic classifier and an artificial neural network classifier.
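To make the role of linguistic expressions concrete, the sketch below maps a numerical measurement to a degree of membership in a linguistic term using a triangular membership function. The triangular shape, the term "medium", and the breakpoints used in the usage note are hypothetical choices for illustration and are not taken from [193].

```python
def triangular(x, left, peak, right):
    """Degree of membership (0 to 1) of x in a term defined by a triangle."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)
```

For instance, if "medium defect area" is modeled with breakpoints 0, 5, and 10 (in arbitrary units), an area of 2.5 belongs to "medium" with degree 0.5, rather than being forced into a crisp yes-or-no category.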

Deep Learning
Deep learning has revolutionized the steel surface defect inspection field by enabling computers to learn from large numbers of images without explicitly programmed instructions [194-196]. It involves training artificial neural networks with multiple layers to recognize patterns on the steel surfaces and make decisions. Guan et al. [197] proposed an advanced deep learning algorithm for classifying steel surface defects by incorporating feature visualization and quality evaluation techniques. They utilized the pre-trained VGG19 model for defect classification and employed DVGG19 to extract image features. Decision trees and the Structural Similarity Index (SSIM) were employed to adjust VGG19's parameters and structure and to evaluate the feature image quality. Their VSD network achieved a total accuracy of 93.70%. Kostenetskiy et al. [198] developed a prototype system for Iron-and-Steel Works in the Russian Federation, leveraging convolutional neural networks (CNNs) to automatically detect and classify defects with a classification accuracy of 98.10% on test data comprising six defect types. Additionally, a defect detection system based on deep learning was proposed in [5], achieving an mAP of 82.30 for detection and an accuracy of 99.70% for classification tasks by fusing multilevel features. Konovalenko and Maruschak [199] utilized the ResNet50 neural network classifier to detect and classify three types of technical defects in rolled-metal products, achieving a total accuracy of 95.80%. Wu et al. [200] proposed a method that combined feature transformation, extraction, and nearest neighbors to classify steel surface defects using Residual Net, MobileNet, and Dense Net networks, achieving a classification accuracy of 92.33%. Fu et al. [201] introduced a deep neural network for recognizing and classifying new defect classes on steel surfaces, utilizing a Siamese network and achieving classification accuracies of 85.80% and 100% on the NEU and Xsteel datasets, respectively. They employed a pre-trained SqueezeNet model with a multi-receptive field module to emphasize low-level feature learning. Additionally, Zhou et al. [202] proposed a novel method for surface defect detection based on a bilinear model using the Double-Visual Geometry Group16 feature function, achieving accurate classification and localization of surface defects. Furthermore, Fu et al. [203] proposed an approach that combined a pre-trained VGG16 model for feature extraction and a CNN for classification, while Wang et al. [204] combined object detection and binary classification models to enhance the accuracy and speed of steel plate surface defect detection, achieving an accuracy of 98.20%. Nagy et al. [205] addressed the challenges of adapting existing models to new artifact classes by combining EfficientNet deep neural networks with randomized classifiers. Masci et al. [206] introduced the Max-Pooling Convolutional Neural Network (MPCNN), which achieved a 7% error rate in classifying seven defects in cold strips, outperforming SVM. Similarly, Yi et al. [207] presented a simple CNN model for classifying steel sheet surface defects. Meanwhile, Deshpande et al. [208] proposed a Siamese CNN for one-shot defect recognition on steel surfaces. Prihatno et al. [209] employed CNNs to detect defects in steel sheets, achieving 96.00% and 73.00% accuracy in training and testing, respectively. Ibrahim and Tapamo [210] combined a pre-trained VGG16 model as a feature extractor with a newly designed CNN classifier to address the challenges of diversity and similarity among defect types. He et al. [211] introduced a novel framework leveraging multi-scale receptive fields to extract multi-scale features. In their approach, a set of autoencoders was trained to reduce the dimensionality of the extracted multi-scale features, achieving a classification rate of 97.20%. Niu et al. [212] introduced the surface defect-generation adversarial network (SDGAN), a novel method based on GANs for defect image generation using a large number of defect-free images collected from industrial sites.

Unsupervised Classifiers
Unsupervised classifiers are techniques used to select which pixels are related and group them into classes. The computer model receives unlabeled data without explicit instructions on what to do with it, and the model can learn to form its own classifications of the training data without external help. Recently, unsupervised methods have been used in many studies for surface defect inspection [213].

Self-Organizing Map
The self-organizing map (SOM), invented by Kohonen [214], is an artificial neural network extensively employed for clustering and visualization in exploratory data analysis [215,216]. SOMs serve the purpose of reducing a complex, high-dimensional input space into a simpler, low-dimensional representation [217], finding applications across various fields [218]. In [219], an inspection system for rolled steel was proposed, using PCA for feature extraction and SOMs for defect classification. The study targeted three defects: exfoliation, oxidation, and waveform defect, achieving an overall accuracy of 87.00%. In another effort by Luo and He [220], an automatic optical inspection system for hot-rolled flat steel was developed to enhance inspection speed, implemented on an FPGA and reaching an accuracy of 92.11%. Additionally, Caleb [116] discussed two adaptive computing techniques based on supervised and unsupervised learning to establish a reliable decision support system for classifying surface defects on hot-rolled steel.
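The way a SOM organizes its units can be illustrated by a single training step on a one-dimensional map: the best-matching unit is found, and it and its map neighbors are pulled toward the input. The fixed learning rate and the hard neighborhood radius are simplifying assumptions; practical SOMs typically use two-dimensional grids and decay both quantities over time.

```python
import math


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to the input."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def som_update(weights, x, learning_rate=0.5, radius=1):
    """One step on a 1-D map: move the BMU and units within `radius` toward x."""
    bmu = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        if abs(i - bmu) <= radius:
            w = [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
        updated.append(list(w))
    return bmu, updated
```

Repeating this step over many inputs makes neighboring units respond to similar inputs, which is what yields the low-dimensional, topology-preserving representation.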

Learning Vector Quantizer
Learning Vector Quantization (LVQ) is an artificial neural network algorithm applicable in both supervised and unsupervised learning scenarios for classification tasks [221]. It is effective for handling variable-length and warped feature sequences and is capable of fine-tuning prototype sequences to achieve optimal class separation. Guifang [75] proposed a method for defect recognition in hot-rolled strips, employing an LVQ neural network fed with 54 features to accomplish surface defect recognition.
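How prototypes are fine-tuned for class separation can be sketched with the basic supervised LVQ1 rule: the nearest prototype is moved toward a training sample when their labels agree and away from it otherwise. The use of LVQ1 here is an assumption for illustration; the variant used in [75] is not specified in this survey.

```python
import math


def lvq1_step(prototypes, prototype_labels, x, label, learning_rate=0.1):
    """Apply one LVQ1 update and return (index of nearest prototype, new prototypes)."""
    nearest = min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i], x))
    # Attract the prototype if it carries the sample's label, repel it otherwise.
    sign = 1.0 if prototype_labels[nearest] == label else -1.0
    updated = [list(p) for p in prototypes]
    updated[nearest] = [
        p + sign * learning_rate * (xi - p) for p, xi in zip(prototypes[nearest], x)
    ]
    return nearest, updated
```

After training, a new sample is classified with the label of its nearest prototype, so the prototypes act as a compact summary of each defect class.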

Deep Autoencoder Network
A deep autoencoder network (DAN) is a type of unsupervised neural network characterized by its three-layer structure, in which the output target of the autoencoder matches the input data. It consists of two main components, the encoder network and the decoder network. The encoder transforms input data from a high-dimensional space into codes in a lower-dimensional space, and the decoder then reconstructs the inputs from the corresponding codes.
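The encode, decode, and compare cycle can be shown with a deliberately tiny linear example that compresses four inputs into a two-value code. The weight matrices are fixed by hand purely for illustration; in an actual autoencoder they are learned by minimizing the reconstruction error, and nonlinear activations are used.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


# Hypothetical hand-set weights: the encoder averages input pairs (4 -> 2),
# the decoder copies each code value back to both positions (2 -> 4).
ENCODER = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
DECODER = [[1.0, 0.0],
           [1.0, 0.0],
           [0.0, 1.0],
           [0.0, 1.0]]


def encode(x):
    """Map the input to its lower-dimensional code."""
    return matvec(ENCODER, x)


def decode(code):
    """Reconstruct an input-sized vector from the code."""
    return matvec(DECODER, code)


def reconstruction_error(x):
    """Mean squared difference between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
```

An input that matches the structure the code can represent, such as `[2.0, 2.0, 4.0, 4.0]`, is reconstructed exactly, whereas `[2.0, 2.0, 4.0, 8.0]` leaves a nonzero error.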

Semi-Supervised Classifier
Semi-supervised models mix supervised and unsupervised learning processes: they can use fairly small labeled datasets, and the training dataset can include both labeled and unlabeled data. They enhance classification performance without requiring every training sample to be labeled. Di et al. [222] introduced a novel approach combining CAE with semi-supervised generative adversarial networks (SGANs) for steel surface defect detection and classification. Initially, CAE-SGAN leverages a stacked CAE trained on a vast quantity of unlabeled data, utilizing intermediate layers to enhance feature extraction for fine-grained features. Subsequently, the encoder network of the CAE is retained as a feature extractor and connected to a softmax layer to establish a new classifier. SGAN is then employed for semi-supervised learning to refine the classification further. Odena et al. [223] proposed a semi-supervised approach based on GANs, which coerces the discriminator network to output class labels. Moreover, He et al. [224] presented a defect classification method based on semi-supervised learning using two networks: a residual network and a categorized generative adversarial network. By employing GANs to generate many unlabeled samples, they achieved a classification accuracy of 99.56%. Additionally, semi-supervised learning was utilized as a feature extraction method in another study [225]. The strengths and limitations of classification methods for steel surface defects are summarized in Table 6.

Comparison of Some Defect Classification Methods
Numerous studies have focused on classifying surface defects in steel. Tables 7-9 present comparisons of various methods in the realms of supervised, unsupervised, and semi-supervised learning, respectively.

Evaluation Metrics of Defect Detection and Classification Methods
Evaluation metrics play a crucial role in assessing the performance of defect detection methods. Commonly used evaluation metrics in steel surface defect detection include accuracy, precision, recall, and F1-score [239-243]. These metrics provide valuable insights into the effectiveness of the proposed methods by quantifying their ability to identify and classify defects on steel surfaces accurately. This paper summarizes some of the evaluation metrics commonly used in steel surface defect detection. Their formulas are given in Equations (13)-(16).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (13)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (14)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (15)$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (16)$$

where TP represents the number of positive samples correctly detected or classified, while TN indicates the number of negative samples correctly identified as negative. FP counts negative samples erroneously detected or classified as positive, and FN counts positive samples incorrectly detected or classified as negative.
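The four metrics can be computed directly from the TP, TN, FP, and FN counts, as in the following sketch. The function name and the convention of returning 0.0 when a denominator is zero are choices made for this example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For example, with 40 true positives, 50 true negatives, 10 false positives, and no false negatives, accuracy is 0.9, precision is 0.8, and recall is 1.0, which shows why accuracy alone can hide a noticeable false-alarm rate.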

Literature Analysis of Detection Methods
Based on the review of steel surface defect detection methods, we can observe that there has been a consistent increase in the number of papers focusing on defect detection methods over the years.
There is a general trend of decrease in statistical methods over time. This could be attributed to the recent popularity of more sophisticated machine learning techniques. Spectral methods fluctuated over the years, with a peak around 2006-2010 before gradually declining in usage. Similarly, the texture segmentation-based methods show a peak around 2006-2010 before declining in usage. As shown in Figure 7, the most notable trend is the significant increase in methods based on learning, particularly in the most recent period, 2021-2024. This suggests a growing interest in and adoption of machine learning techniques for steel surface defect detection. The analysis therefore suggests a shift from traditional statistical and texture segmentation-based methods towards more advanced machine learning-based approaches for steel surface defect detection in recent years. This shift is likely driven by the increasing data availability, advancements in machine learning algorithms, and the desire for more accurate and efficient defect detection systems in industrial settings.

Literature Analysis of Classification Methods
From Figure 8, it can be seen that there has been a clear upward trend in the usage of supervised methods over the years, with a notable increase in the number of papers from 2016 to 2020. Unsupervised methods have been less common than supervised methods, with sporadic utilization over the years.
The number of papers focusing on unsupervised methods is relatively low across all periods. Semi-supervised methods have seen some adoption, particularly from 2016 to 2020, when the number of papers increased notably compared to previous years. Overall, the exploration of both unsupervised and semi-supervised methods increased during that period. The analysis suggests a current focus on supervised methods for steel surface defect classification, with a smaller proportion of papers focusing on unsupervised and semi-supervised methods. This trend indicates a preference for methods that depend on labeled data.

Conclusions and Future Directions
This work provided a comprehensive overview of techniques employed in detecting and classifying defects on steel surfaces based on more than 200 studies. The study examined existing methods, particularly machine learning approaches, to discover the latest advancements and progress in automating steel surface inspection. Various factors, including grease, dirt, impurities, temperature changes, and high humidity, can cause defects on steel surfaces. Additionally, these defect categories have some diversity and similarities, presenting challenges for their accurate classification. It was noted that recent methods in detecting and classifying steel surface defects emphasized supervised learning due to its superior performance, resulting in a relative neglect of statistical and spectral methods. Nevertheless, there are observations from previous research that contribute to drawing justifiable conclusions in the process of detecting and classifying defects. Therefore, we emphasize these observations by presenting key points that we consider highly significant:

•
A standard image dataset must be used to conduct a performance evaluation of detection and classification algorithms and carry out a fair comparative analysis.

•
Although requiring fewer labeled datasets, semi-supervised learning methods have exhibited lower accuracies than supervised learning methods.

•
The significant diversity and similarity of various classes of defects make defect classification difficult.

•
Augmenting data enhances the performance of defect detection and classification models, particularly on unevenly distributed datasets that are not too large.
This paper also discussed evaluation metrics commonly used in steel surface defect detection, including accuracy, precision, recall, and F1-score, which play a crucial role in assessing the performance of defect detection methods by quantifying their ability to detect and accurately classify defects on steel surfaces.There are several encouraging future research studies that could enhance the performance of defect detection and classification methods.Among them are the following: 1.
There is a need for curating large-scale benchmark datasets comprising diverse steel product types and defect categories and standardized evaluation metrics, including accuracy, precision, recall, and F1-score metrics.

2.
Exploring the integration of advanced machine learning models, such as transformerbased models, graph neural networks, and reinforcement learning, to further improve the accuracy and efficiency of defect detection systems.

3.
Ensuring the robustness and generalization of machine learning models across different production environments and steel product types remains a challenge.Future research could investigate methods for enhancing defect detection systems' robustness and generalization capabilities, such as transfer learning and data augmentation strategies.

4.
Much expectation for future findings for steel surface defect detection and classification points to investigating models based on supervised learning, specifically deep learning techniques and semi-supervised learning methods, to overcome the challenge of limited training data.This shift toward deep learning techniques is expected to improve outcomes in detecting steel-strip defects in the near future.5.
One of the problems that can be encountered when designing a steel surface defect detection model is the eventuality of small datasets that often lack diversity.The consequence is the design of biased models that fail to appropriately generalize well to unseen data.This is due to the fact that there is not enough variation in the types, sizes, shapes, and locations of defects present in the dataset.Some of the directions that could be explored to resolve these problems are: • Augmenting the dataset through usual techniques such as rotation, flipping, cropping, and color adjustments might not be sufficient to increase dataset size and diversity.It is then crucial to find novel data augmentation strategies particularly adapted to steel surface defect detection.• Designing effective feature representations that capture the relevant characteristics of defects on steel surfaces is crucial.It is a challenge to find suitable features that generalize well and are robust to noise and variations.

•
Developing models that generalize well from small datasets to unseen data is a significant challenge. Regularization techniques, transfer learning, and domain adaptation methods can help improve generalization performance but require careful adaptation and tuning.

•
Integrating human expertise into the training and validation process, where human feedback is leveraged to refine and validate defect detection models, is worth investigating as a way to improve model performance when datasets are small.

•
The development of benchmark datasets specifically tailored to defect detection on steel surfaces can facilitate the evaluation and comparison of different algorithms and methodologies.
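To make the standardized metrics named in the first direction concrete, the following is a minimal sketch of how accuracy and macro-averaged precision, recall, and F1-score could be computed for a multi-class defect classifier. The function name `classification_metrics`, the choice of macro averaging, and the convention of scoring an undefined ratio as 0 are illustrative assumptions, not a protocol prescribed by the surveyed works.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1-score.

    y_true and y_pred are equal-length sequences of class labels.
    Undefined ratios (zero denominator) are scored as 0.0 (an assumption).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a false positive
            fn[truth] += 1   # true class received a false negative

    per_class = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = (precision, recall, f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(v[0] for v in per_class.values()) / n,
        "macro_recall": sum(v[1] for v in per_class.values()) / n,
        "macro_f1": sum(v[2] for v in per_class.values()) / n,
        "per_class": per_class,
    }
```

Macro averaging weights every defect class equally, which may be preferable when rare defect types matter as much as common ones. A benchmark could equally fix micro or weighted averaging, as long as the choice is stated.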
Supervised learning algorithms are likely to emerge as the mainstream approaches for steel defect detection and classification. Given the challenge of enhancing model accuracy with limited datasets, most future work should focus on designing accurate supervised learning-based models with low computational complexity for steel surface defect detection and classification.
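As a point of reference for the limited-data challenge, the conventional augmentations mentioned above (rotation and flipping) can be sketched as follows. This is only the baseline that the text suggests may be insufficient on its own, not a novel strategy. The image is assumed to be a grayscale 2D list of pixel values, and the helper names `flip_horizontal`, `rotate90`, and `dihedral_variants` are hypothetical.

```python
def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    # Reversing the row order and transposing yields a clockwise rotation.
    return [list(row) for row in zip(*image[::-1])]


def dihedral_variants(image):
    """Return the distinct rotated/flipped copies of an image (at most 8).

    Duplicates are dropped, so a symmetric image yields fewer variants.
    """
    variants = []
    current = [list(row) for row in image]
    for _ in range(4):  # 0, 90, 180, and 270 degrees
        for candidate in (current, flip_horizontal(current)):
            if candidate not in variants:
                variants.append(candidate)
        current = rotate90(current)
    return variants
```

For detection rather than classification, any bounding-box annotations would need to be transformed together with the image. These geometric copies also add no new defect types, sizes, or shapes, which is consistent with the limitation noted in the list above.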

Figure 1. The elements of the image acquisition unit.

Figure 2. Defect detection and classification tasks. (a) Bounding box around detected defect. (b) Class identification of each detected defect.

Figure 3. Categories of steel products.

Figure 7. Statistical distribution of detection methods.

Figure 8. Statistical distribution of classification methods.

Table 1. Strengths and limitations of different statistical detection methods for steel surface defects.

Table 2. Advantages and disadvantages of different spectral detection methods for steel surface defects.
Spatial filter: Its text-based approach is more centralized (the text file segment is separated from the image segmentation).

Table 3. Strengths and limitations of texture segmentation-based methods for steel surface defects.

Table 4. Strengths and weaknesses of different machine learning methods for steel surface defect detection.

Table 5. Comparison of some defect detection methods.

Table 6. Strengths and limitations of classification methods for steel surface defects.

Table 7. Comparison of some defect classification methods (supervised classifiers).

Table 8. Comparison of some defect classification methods (unsupervised classifiers).

Table 9. Comparison of some defect classification methods (semi-supervised classifiers).