Wood Defect Detection Based on Depth Extreme Learning Machine

The deep learning feature extraction method and the extreme learning machine (ELM) classification method are combined to establish a depth extreme learning machine model for wood image defect detection. The convolutional neural network (CNN) algorithm alone tends to provide inaccurate defect locations, incomplete defect contour and boundary information, and inaccurate recognition of defect types. The nonsubsampled shearlet transform (NSST) is used here to preprocess the wood images, which reduces the complexity and computation of the image processing. A CNN is then applied to extract deep features from the wood images. The simple linear iterative clustering algorithm is used to improve the initial model; the obtained image features are used as ELM classification inputs. The ELM has a faster training speed and stronger generalization ability than comparable neural networks, but the random selection of input weights and thresholds degrades its classification accuracy. A genetic algorithm is used here to optimize the initial parameters of the ELM and stabilize the network's classification performance. The depth extreme learning machine can extract high-level abstract information from the data, does not require iterative adjustment of the network weights, has high computational efficiency, and allows the CNN to extract the wood defect contour effectively. The distributed features of the input data are automatically expressed layer by layer through deep learning pre-training. The wood defect recognition accuracy reached 96.72% with a test time of only 187 ms.


Introduction
Today's wood products are manufactured under increasingly stringent requirements for surface processing. In countries with well-developed forest resources, such as Sweden and Finland, the comprehensive use rate of wood is as high as 90%; in sharp contrast, the comprehensive use rate of wood in China is less than 60%, causing a serious waste of resources. With China's rapid economic development, people increasingly pursue a high-quality life, which inevitably increases the demand for wood and wood products; China's consumption of solid wood panels, wood-based panels, paper, and cardboard is among the highest in the world. The existing wood storage capacity and processing level make it difficult to meet this rapidly growing demand. The shortage of wood supply and the low use rate have limited the development of China's wood industry. It is therefore necessary to comprehensively inspect the processing quality of logs and boards to improve the use rate of wood and the quality of wood products.
Nondestructive testing of wood can quickly and accurately assess the physical properties of and growth defects in wood, enabling automated wood inspection. In recent years, the combined application of computer technology with detection and control theory has made great progress in the detection of wood defects. For nondestructive testing of wood surfaces, commonly used traditional methods include laser testing [1,2], ultrasonic testing [3][4][5], and acoustic emission technology [6,7]. Computer-aided techniques are a common approach to surface processing, as they are efficient and have generally high recognition rates [8,9]. Deep learning was first proposed by Hinton in 2006; in 2012, scholars adopted the deep-learning-based AlexNet network to achieve computer vision recognition accuracy of up to 84.7%. Deep learning mitigates the curse of dimensionality through layer-wise initialization and represents a revolutionary development in the field of machine learning [10,11]. More and more scholars are applying deep learning networks to nondestructive wood testing. He et al. [12] used a linear-array CCD camera to obtain wood surface images and proposed a hybrid fully convolutional neural network (Mix-FCN) for the recognition and location of wood defects; however, the network was too deep and required too much computation. Hu [13] and Shi [14] used the Mask R-CNN algorithm for wood defect recognition, but they combined multiple feature extraction methods, which resulted in a very complex model. Current deep learning algorithms still suffer from inaccurate defect localization and incomplete defect contour and boundary information in the wood defect detection process. To solve these problems and effectively meet the wood-testing needs of wood processing enterprises, we carried out the research reported in this article.
The innovations of this article are: (1) simple preprocessing of wood images using the nonsubsampled shearlet transform (NSST), reducing the complexity and computational cost of image processing, with the result used as the input of the convolutional neural network; (2) application of a simple linear iterative clustering (SLIC) algorithm to enhance the convolutional neural network and obtain a super-pixel image with a more complete boundary contour; (3) use of a genetic algorithm to improve the extreme learning machine that classifies the obtained image features. Through these methods, the accuracy of defect detection is improved and the recognition time is shortened, establishing an innovative machine-vision-based wood testing technique.

Wood Defect Original Image Dataset
According to their causes and the processes that produce them, solid wood board defects are divided into biohazard defects, growth defects, and processing defects. Growth defects and biohazard defects are natural defects with characteristic shapes and structures, and they are an important basis for wood grade classification. Generally speaking, solid wood board growth and biohazard defects can be divided into dead knots, live knots, worm holes, decay, etc. The original dataset used in this article's experiments is derived from the wood sampling images of the 948 project of the State Forestry Administration (the introduction of laser profile and color integrated scanning technology for solid wood panels). When scanning to obtain wood images, the scanning frequency of the scanner is 170-5000 Hz, the Z-direction resolution is 0.055-0.200 mm, the X-direction resolution is 0.2755-0.550 mm, and the color pixel resolution can reach 1 mm × 0.5 mm. The dataset includes 5000 defect maps of pine, fir, and ash. Each image has a bit depth of 24 and a specified size of 100 × 100 pixels. Some of the defect images are shown in Figure 1.

Optimized Convolution Neural Network
This paper proposes an optimized algorithm in which NSST preprocesses the images, after which a CNN extracts defect features from the wood images to form a preliminary CNN model. The simple linear iterative clustering (SLIC) super-pixel segmentation algorithm is used to analyze the wood images by super-pixel clustering, which allows defects and local information about defects and cracks to be located efficiently. The obtained information is fed back to the initial model, which enhances the original CNN.

Structure and Characteristics of Convolution Neural Networks
The CNN is an artificial neural network with a multi-layer trainable architecture [15]. It generally consists of an input layer, excitation layer, convolution layers, pooling layers, and a fully connected layer. CNNs have many advantages for image processing applications. (1) Feature extraction and classification can be combined in the same network structure and trained jointly, making the algorithm fully adaptive. (2) For larger images, deep feature information can be extracted more effectively. (3) The network structure is robust to local deformation, rotation, translation, and other changes in the input image. In this study, each pixel in the wood image was convolved and the defect features were extracted by exploiting these CNN characteristics. The CNN network skeleton used in this article is shown in Figure 2.
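As a rough sketch of what a single convolutional stage does to a wood image patch, the NumPy-only example below (an illustration, not the network used in this paper; the kernel and random image are made up) convolves an image with one kernel, applies ReLU, and max-pools the result:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A 100x100 "wood image" patch through one conv + ReLU + pool stage.
img = np.random.rand(100, 100)
edge_kernel = np.array([[1., 0., -1.]] * 3)  # simple vertical-edge detector
feat = max_pool(relu(conv2d(img, edge_kernel)))
print(feat.shape)  # (49, 49)
```

Stacking several such stages, each shrinking the spatial size while deepening the feature channels, is what gives the CNN its hierarchical feature extraction.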

Non-Subsampled Shearlet Transform (NSST)
The NSST can represent signals sparsely and near-optimally, and it has strong direction sensitivity [16][17][18]. Preprocessing wood images with NSST therefore preserves the defect features of the images. It reduces redundancy in the wood image information as well as the complexity and computation of subsequent image processing with the deep learning method.
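A full NSST implementation involves shearlet filter banks and is beyond a short example, but its defining "non-subsampled" property (decomposition into frequency bands without downsampling, so every band keeps the original image size and reconstruction is exact) can be illustrated with a simple undecimated two-band split. This is an illustration of that property only, not the actual NSST:

```python
import numpy as np

def undecimated_two_band(img, sigma=5.0):
    """Split an image into low- and high-frequency bands without downsampling,
    the key property shared with the non-subsampled shearlet transform."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Gaussian low-pass defined in the frequency domain
    lowpass = np.exp(-(fx**2 + fy**2) * (2 * np.pi * sigma) ** 2 / 2)
    low = np.real(np.fft.ifft2(np.fft.fft2(img) * lowpass))
    high = img - low
    return low, high

img = np.random.rand(100, 100)
low, high = undecimated_two_band(img)
# Both bands retain the full 100x100 size, and the bands sum back exactly.
print(np.allclose(low + high, img))  # True
```

Because no subsampling occurs, defect edges keep their exact pixel positions across bands, which is what makes the transform useful as a CNN preprocessing step.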

Simple Linear Iterative Clustering (SLIC)
The CNN represents an image to be processed in matrix form, so the spatial organization among pixels is not considered; this affects the image segmentation and obscures the boundary of the defective region in the wood image. The SLIC algorithm generates relatively compact super-pixel image blocks from a gray or color image. The generated super-pixel image is compact between pixels, and the edge contours of the image are clear. To this effect, SLIC extracts a relatively accurate contour to supplement the feature contour. SLIC also requires relatively few initial parameters: only the number of super-pixels into which the image is to be segmented must be set. The algorithm is simple in principle, has a small calculation range, and runs rapidly. By 2015, its parallel execution speed had reached 250 FPS, making it the fastest super-pixel segmentation method available [19].
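The core of SLIC is k-means clustering restricted to a local window around grid-initialized centers, with a compactness weight balancing intensity and spatial distance. The minimal grayscale sketch below shows that core step only; the connectivity-enforcement post-processing of the full algorithm is omitted, and the parameters are illustrative:

```python
import numpy as np

def slic_gray(img, k=16, m=10.0, iters=5):
    """Minimal SLIC super-pixel clustering for a grayscale image.
    k: approximate number of super-pixels; m: compactness weight."""
    h, w = img.shape
    S = int(np.sqrt(h * w / k))                      # grid interval
    ys, xs = np.arange(S // 2, h, S), np.arange(S // 2, w, S)
    centers = np.array([[img[y, x], y, x] for y in ys for x in xs], dtype=float)
    labels = -np.ones((h, w), dtype=int)
    dists = np.full((h, w), np.inf)
    yy, xx = np.mgrid[0:h, 0:w]
    for _ in range(iters):
        dists.fill(np.inf)
        for ci, (cv, cy, cx) in enumerate(centers):
            # Search only a (2S+1)-sized window around the center.
            y0, y1 = max(0, int(cy) - S), min(h, int(cy) + S + 1)
            x0, x1 = max(0, int(cx) - S), min(w, int(cx) + S + 1)
            dc = (img[y0:y1, x0:x1] - cv) ** 2        # intensity distance
            ds = (yy[y0:y1, x0:x1] - cy) ** 2 + (xx[y0:y1, x0:x1] - cx) ** 2
            d = dc + (m / S) ** 2 * ds                # combined distance
            better = d < dists[y0:y1, x0:x1]
            dists[y0:y1, x0:x1][better] = d[better]
            labels[y0:y1, x0:x1][better] = ci
        for ci in range(len(centers)):                # recompute centers
            mask = labels == ci
            if mask.any():
                centers[ci] = [img[mask].mean(), yy[mask].mean(), xx[mask].mean()]
    return labels

img = np.zeros((60, 60)); img[20:40, 20:40] = 1.0    # bright "knot" on dark board
labels = slic_gray(img)
print(labels.shape, labels.min() >= 0)  # (60, 60) True
```

The restricted search window is why SLIC is fast: each pixel is compared against only a handful of nearby centers rather than all k clusters.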

Feature Extraction
The optimized CNN model proposed in this paper was designed for wood surface feature extraction. Knots were used as example wood defects (Figure 3a) to test feature extraction via the following operations. The input image (Figure 3a) was processed directly by the CNN algorithm to obtain Figure 3b, which after enlargement shows local irregularity and non-smooth edges in the contour [20]. The SLIC algorithm was then used to process the input image (Figure 3a), followed by longitudinal convolution (Figure 3d). The image shown in Figure 3h was obtained after edge removal and fusion processing. The defect contour features of Figure 3h are substantially clearer than those of Figure 3b because segmentation of wood images using the CNN alone, which expresses pixels as a matrix without considering their spatial organization, degrades the final segmentation results. The SLIC algorithm instead extracts the wood defect boundary and contour information from the original image and feeds this information back into the initial segmentation results of the CNN model.
The above process reduces the redundancy of local image information in addition to the complexity and computation of the image processing. The pixel-level CNN model method does not accurately reveal the boundary of the defective region of wood image, but instead indicates only its general position. SLIC can extract a relatively accurate contour to supplement it and optimize the initial CNN model. To this effect, the proposed SLIC algorithm-based method improves the defect feature extraction of wood images over CNN alone.
The input image (Figure 3a) was also processed by the NSST algorithm to obtain the image shown in Figure 3e, then Figure 3f was obtained using the SLIC algorithm. Vertical convolution produced the image shown in Figure 3g, and Figure 3i was obtained after edge removal and fusion processing. Comparing Figure 3i to Figure 3h, the difference in wood defect contour extraction is not obvious, but preprocessing the image with NSST reduces environmental interference and training depth, markedly decreasing the computation and complexity of the image processing.
Based on the above analysis, this paper uses the wood image processing framework shown in Figure 4 to obtain the wood defect feature map.

Extreme Learning Machine (ELM)
Deep learning is commonly used in target recognition and defect detection applications due to its excellent feature extraction capability. However, the CNN is highly time-consuming because of its iterative pre-training and fine-tuning stages, and the hardware requirements for more complex engineering applications are high. The deep CNN structure has a large number of adjustable free parameters, which makes its construction highly flexible; on the other hand, it lacks theoretical guidance and relies heavily on experience, so its generalization performance is uncertain. In this study, we integrated the ELM into a depth extreme learning machine (Figure 5) to improve the training efficiency of the deep convolution network. The proposed method extracts wood defects using the optimized CNN and classifies them with an ELM, exploiting the excellent feature extraction ability of the deep network and the fast training of the ELM simultaneously.
The ELM algorithm differs from traditional feedforward neural network training. Its hidden layer does not need to be trained iteratively: the input weights and hidden-layer node biases are set randomly, and the output weights of the hidden layer are then determined by the algorithm to minimize the training error [21][22][23]. The extreme learning machine rests on proven approximation and interpolation theorems, under which, when the hidden-layer activation function of a single-hidden-layer feedforward neural network is infinitely differentiable, the network's learning ability is independent of the hidden-layer parameters and depends only on the network structure. When the input weights and hidden-layer node biases are randomly assigned within an appropriate network structure, the ELM has universal approximation capability: it can approximate any continuous function despite the random assignment. Provided the hidden-layer activation function is infinitely differentiable, the output weights of the network can be calculated via the least-squares method. A network model that approximates the target function can thus be established, realizing neural network functions such as classification, regression, and fitting.

This paper mainly exploits the classification function of the ELM, selecting a relatively simple single-hidden-layer neural network as the classifier. Traditional neural network algorithms need many iterations and parameters, learn slowly, have poor extensibility, and require intensive manual intervention. The ELM used here requires no iteration, learns relatively quickly, generates its input weights and biases randomly with no subsequent tuning, and needs relatively little manual intervention. On large sample databases, the recognition rate of the ELM is better than that of the support vector machine (SVM). For these reasons, we use ELMs as classifiers to enhance recognition efficiency and performance [24,25].
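The closed-form training just described (random hidden-layer parameters, output weights solved by least squares) can be sketched as follows. This is a toy illustration: the feature dimensions, data, and hidden-layer size are assumptions, not the paper's configuration:

```python
import numpy as np

def train_elm(X, T, n_hidden=100, rng=np.random.default_rng(0)):
    """Single-hidden-layer ELM: random input weights/biases, sigmoid activation,
    output weights solved in closed form with the Moore-Penrose pseudoinverse."""
    W = rng.standard_normal((X.shape[1], n_hidden))   # random input weights
    b = rng.standard_normal(n_hidden)                 # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                      # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta).argmax(axis=1)

# Toy two-class problem standing in for wood-defect feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(3, 1, (50, 8))])
T = np.zeros((100, 2)); T[:50, 0] = 1; T[50:, 1] = 1  # one-hot targets
W, b, beta = train_elm(X, T)
acc = (predict_elm(X, W, b, beta) == T.argmax(axis=1)).mean()
print(acc)  # well-separated classes, so training accuracy should be near 1.0
```

The single pseudoinverse replaces the iterative weight updates of backpropagation, which is the source of the ELM's speed advantage cited above.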
The ELM algorithm introduced above is the main classification method for wood defect feature recognition in this paper. However, in the ELM network structure, the input weights and hidden-layer node thresholds are assigned randomly; ELM structures with the same number of hidden-layer neurons can therefore perform very differently, which destabilizes the network's classification performance. The genetic algorithm (GA) simulates Darwinian evolution to optimize the initial weights and thresholds of the ELM by eliminating less-fit candidates. Figure 6a-c show the variation curves of the GA-ELM population fitness function under the Radbas, Hardlim, and Sigmoid excitation functions, respectively. Smaller fitness values indicate higher accuracy; the Sigmoid excitation function produced the best network performance.
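A compact sketch of the GA-ELM idea follows: a population of candidate input-weight/bias genomes is evolved against a training-error fitness, with elitism, tournament selection, uniform crossover, and Gaussian mutation. The toy data, GA settings, and network size are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two classes standing in for wood-defect feature vectors.
X = np.vstack([rng.normal(0, 1, (40, 4)), rng.normal(2, 1, (40, 4))])
y = np.array([0] * 40 + [1] * 40)
T = np.eye(2)[y]
H_NODES = 20

def fitness(genome):
    """Smaller is better: ELM training error with the genome's weights/biases."""
    W = genome[: 4 * H_NODES].reshape(4, H_NODES)
    b = genome[4 * H_NODES:]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.pinv(H) @ T                      # closed-form output weights
    return np.mean((H @ beta).argmax(1) != y)

POP, GENS, GENOME = 20, 15, 4 * H_NODES + H_NODES
pop = rng.standard_normal((POP, GENOME))
for _ in range(GENS):
    fit = np.array([fitness(g) for g in pop])
    new = [pop[fit.argmin()].copy()]                  # elitism: keep the best
    while len(new) < POP:
        i, j = rng.integers(0, POP, 2), rng.integers(0, POP, 2)
        p1 = pop[i[fit[i].argmin()]]                  # tournament winners
        p2 = pop[j[fit[j].argmin()]]
        mask = rng.random(GENOME) < 0.5               # uniform crossover
        child = np.where(mask, p1, p2)
        child += rng.normal(0, 0.1, GENOME) * (rng.random(GENOME) < 0.1)  # mutation
        new.append(child)
    pop = np.array(new)
best = min(pop, key=fitness)
print(fitness(best))
```

Because the ELM's output weights are always recomputed in closed form, the GA only has to search the input-weight/bias space, which keeps each fitness evaluation cheap.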
The classification accuracy of GA-ELM and ELM under different excitation functions is shown in Table 1. The classification accuracies under the Sigmoid and Radbas excitations were similar, while the Hardlim excitation function lagged behind. The accuracy of both ELM and GA-ELM was highest with the Sigmoid activation function: GA-ELM reached 95.93%, markedly better than the unoptimized ELM. The GA-optimized ELM network also required fewer hidden-layer nodes and showed higher test accuracy. In summary, an improved depth extreme learning machine, referred to from here on as "D-ELM", was constructed in this study by combining the optimized GA-ELM classifier with the optimized CNN feature extraction. Table 2 shows the computer parameters and software platform used by the experimental system, including the CPU model, main frequency, and memory size.

Empirical Method and Result
The specific experimental process is shown in Figure 7. First, we preprocessed 5000 original images via NSST and randomly selected 4000 for training. Second, the neighborhood subgraph of each pixel in each image was taken as the CNN input, yielding 40,000,000 samples as the training set for the network model; the remaining 10,000,000 samples were used as test data to evaluate the algorithm. The features extracted from the test samples were input into the ELM classifier with 100 hidden-layer nodes, and the accuracy and stability of the feature extraction method were analyzed statistically.
Figure 8a shows the relationship between the training loss value and the number of iterations. Although the training loss fluctuates slightly during the iteration process, it shows a downward trend as a whole; once the number of iterations exceeds 3500, the loss settles around 0.2 and the convergence performance is acceptable. Figure 8b shows the accuracy curve. At 1500 iterations, the accuracy of the proposed algorithm reached 90%; accuracy continued to increase with further iterations until reaching a maximum of about 98%.

Discussion
This paper proposes an ELM classifier based on a depth structure. Choosing an appropriate number of hidden nodes in the D-ELM structure provides enhanced stability and generalization ability. To ensure accurate tests and prevent node redundancy, the number of hidden nodes was set to 100; the test accuracy of D-ELM then remained relatively stable over repeated tests, as shown in Figure 10. D-ELM significantly outperformed ELM, with smaller fluctuations in amplitude, robustness to the number of test iterations, and higher network stability. Figure 9 shows the final recognition effect on the test set; the identified wood defects are surrounded with rectangular borders of different colors.


Table 3 shows the results of the algorithm accuracy tests. D-ELM has a higher average accuracy and a lower standard deviation than ELM, so it is both more accurate and more stable; the performance of the classifier is improved as a result. We added an SVM classifier to the experiment to further assess the depth extreme learning machine. Table 4 shows the accuracy and timing of training and testing on all samples, where D-ELM again has the highest accuracy in both training and testing. Although D-ELM has more network layers, its training and test times are shorter than those of the other algorithms we tested, and its accuracy is much higher. The overall performance of D-ELM is better than that of ELM and SVM.



Conclusions
The depth extreme learning machine proposed in this paper has reasonable dimensions, effectively manages heterogeneous data, and runs within an acceptable time. Our results suggest that it is a promising new solution to problems such as obtaining labeled samples, constructing features, and training; it has excellent feature extraction ability and a fast training time. The NSST transform is first used to preprocess the original image (i.e., to reduce its complexity and dimensionality while minimizing the down-sampling required in the CNN), then SLIC is applied to optimize the CNN model training process. This method effectively reduces the redundancy of local image information and extracts relatively accurate supplementary feature contours. The optimized CNN is then used to extract wood image features, which are input to the ELM classifier, and the parameters of the related neural network are optimized. The GA is used to select the initial weights and thresholds of the ELM to improve the prediction accuracy and stability of the network model. Finally, the image data to be tested are input to the trained network model and the final test results are obtained.
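The pipeline above ends with an ELM whose input weights and biases are fixed before training (randomly in standard ELM; the paper selects them with a GA) and whose output weights are solved in closed form via the Moore-Penrose pseudo-inverse. A minimal NumPy sketch, with all data, dimensions, and the random initialization hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 feature vectors (e.g. CNN outputs), 4 defect classes
X = rng.normal(size=(200, 64))
y = rng.integers(0, 4, size=200)
T = np.eye(4)[y]                      # one-hot target matrix

L = 100                               # hidden nodes (the paper's stable setting)
W = rng.normal(size=(64, L))          # random input weights (a GA would tune these)
b = rng.normal(size=L)                # random biases (likewise GA-tunable)

H = np.tanh(X @ W + b)                # hidden-layer output matrix
beta = np.linalg.pinv(H) @ T          # output weights, solved in one step

pred = np.argmax(H @ beta, axis=1)    # class = arg max of network output
train_acc = (pred == y).mean()
```

Because `beta` is computed analytically rather than by iterative weight updates, training is a single matrix solve, which is the source of ELM's speed advantage noted above.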
We also compared the stability of the D-ELM and ELM network models. The standard deviation of D-ELM was only 0.0967, its accuracy improved by about 3% over ELM, and the stability of the D-ELM network was higher and less affected by the number of tests. D-ELM reached an accuracy of 96.72% with a shorter test time than ELM or SVM, at only 187 ms. The D-ELM network model is thus capable of highly accurate wood defect recognition within a very short training and detection time.
