Quality Classification of Dragon Fruits Based on External Performance Using a Convolutional Neural Network

Abstract: Currently, most agricultural products in developing countries are exported to many countries around the world, so these products must be classified according to different standards. In Vietnam, dragon fruit is considered the fruit with the highest export rate. At present, dragon fruit is classified manually, which leads to low-quality classification and high labor costs. This study therefore describes an automatic dragon fruit classification system using non-destructive measurements, based on a convolutional neural network (CNN). The system combines a machine learning model with image processing, using a CNN to identify the external features of dragon fruits; the fruits are then classified and evaluated by group. The system recognizes the dragon fruit, extracts its features, and combines them with the signal obtained from the loadcell to determine the group of each fruit. The training data were collected from the dragon fruit processing system, with a dataset of images obtained from more than 1287 dragon fruits. In this system, processing speed and classification accuracy are the two most important factors. The results show that the classification system achieves high efficiency and is effective with existing dragon fruit types. In Vietnamese factories, the processing speed of the system increases the sorting capacity of export packing facilities to six times that of the manual method, with an accuracy of more than 96%.


Introduction
Dragon fruit, also known as pitaya, is a tropical fruit that is widely grown in about 20 countries and territories in Asia, the Middle East and America, with a large concentration in the Asia-Pacific region, especially in Vietnam, Thailand, Indonesia, the Philippines, Mainland China, and Taiwan, because of its appealing taste and rich nutritional properties [1]. At present, there are two species of dragon fruit: Hylocereus undatus (white flesh) and H. polyrhizus (red flesh). They are widely cultivated in 63/65 provinces/cities of Vietnam; total production for export is more than a million tons [2]. Dragon fruit contains many nutrients beneficial to human health, such as vitamin A, vitamin C, moisture, protein, etc. The quality of dragon fruit is evaluated largely by the shape and external defects of the fruit. In Vietnam, the annual output of dragon fruit is increasing; dragon fruit is mainly exported to other countries, typically China, Thailand, and the US, among others. According to The Asia Foundation (2019), 99% of dragon fruit on the Chinese market was imported from Vietnam. After harvesting, dragon fruit are classified for export according to the standards of each importing country. Dragon fruit is divided into different groups, depending on the weight, volume and external defects, according to the import standards of individual countries. The classification is currently carried out by humans; this process is time-consuming and its accuracy is not high.
In the agricultural sector, the quality evaluation of agricultural products is arguably the field of greatest interest at the time of writing. Because the climate and environmental characteristics of each country are different, the fruits are also different, and there are many studies on the prediction, evaluation, and classification of fruits. The authors of [3] used a convolutional neural network to identify and classify different types of fruit with a high accuracy, of up to 98.1%. The authors of [4] proposed the use of an artificial neural network to classify apples, with an accuracy of 91.5%. The classification of dragon fruit using the backpropagation method achieved an accuracy of 86.67%, while the ripeness of dragon fruit was evaluated using a smaller VGGNet-like network, with an accuracy of 96.67% [5]. A grading and sorting technique for dragon fruit using machine learning algorithms was proposed in [6], in which three models (SVM, ANN and CNN) were used independently to evaluate the outside of the fruit in order to classify it into categories. The authors of [7] proposed an eight-layer CNN model and adjusted the parameters of the model to achieve an accuracy of 95.67% in the sorting of mangoes. In [8], using four machine learning models, Random Forest (RF), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), a high accuracy, of up to 98.1%, was achieved. There have been different approaches to the use of CNN for classification, such as the use of deep learning to sort tomatoes [9], the classification of chili by CNN [10] and the classification of fruit and vegetables at a supermarket self-checkout [11]. The aim of this study is to propose a fully automatic dragon fruit classification system that is suitable for use in factories. The proposed approach combines traditional supervised machine learning (ML) models, such as the support vector machine, with deep convolutional neural networks and with image processing algorithms for the conversion to binary images and the definition of contours. Supervised ML uses previously labeled data to identify unlabeled data.
Currently, dragon fruit accounts for a significant percentage of fruit exports in Vietnam. Dragon fruit are evaluated and sorted according to the standard of the importing country. The workshops for evaluating and sorting dragon fruit rely on human workers, who manually rotate the fruit to assess defective areas qualitatively and use basic scales to weigh the fruit. These methods cause grading errors. According to a local survey, large workshops export 100-200 tons/day, and the sorting process requires a large number of workers, so a dragon fruit classification system is highly necessary in Vietnam. Therefore, this study proposes an automatic dragon fruit sorting system to replace manual sorting, reduce costs and increase grading accuracy. The main purpose of this work is to apply this research in practice. In this study, a CNN model was used to extract the features of the dragon fruit, namely the scaly spikes and the tail. To avoid confusion between the color of the background and the color of the defects, an SVM model was used to segment the dragon fruit, and we propose self-training of the SVM model to improve the system's accuracy. The automatic sorting system offers a throughput of approximately 3 tons/h per module. The results of the experiments indicate a system accuracy of more than 96%.

Structure of Dragon Fruit Grading System
This study was conducted to meet the needs of dragon fruit exports in Vietnam. The hardware was designed in accordance with the structure of the dragon fruit. The weight of the dragon fruit is obtained from the loadcell sensor signal, which is processed and filtered by a Kalman filter.
The grading system consists of two main parts. The first part is the image processing chamber, which is lit from different directions to minimize the shadow of the fruit. The images of dragon fruit are collected to extract features such as length, width and height and to detect defects in the fruit. In this study, image processing and machine learning algorithms were applied to increase the accuracy and speed of processing. The second part is the weight section, along with the dragon fruit sorting tray. After the features of the dragon fruit are extracted, the fruit is passed to the second part for weighing, and the signals are sent to the central processor in order to sort the fruit into the G1, G2, or G3 categories, according to the requirements of each export market. The operating principle of the system is shown in Figure 1. The images of the dragon fruit are taken in the image processing chamber, and the central processor combines the extracted features of the dragon fruit with the signal measured from the loadcell to classify the dragon fruit. After classification, the conveyor transfers the dragon fruit to the correct group.
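The Kalman filtering of the loadcell signal is not specified further in the text. As a rough illustration only, a scalar Kalman filter for a nearly constant weight could look like the sketch below; the function name, the noise variances `process_var` and `meas_var`, and the sample readings are all assumptions made for this example, not values from the system.

```python
def kalman_filter_1d(readings, process_var=1e-4, meas_var=4.0):
    """Scalar Kalman filter for a (nearly) constant weight signal.

    process_var and meas_var are illustrative assumptions, not values
    taken from the described system.
    """
    estimate = readings[0]   # initial state: first raw reading
    error_var = meas_var     # initial uncertainty of the estimate
    filtered = [estimate]
    for z in readings[1:]:
        # Predict: the weight is modeled as constant, so only the
        # uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement via the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Made-up loadcell readings in grams for a fruit resting on the scale.
raw = [452.0, 447.5, 455.1, 449.0, 451.2, 448.3, 453.4, 450.6]
smoothed = kalman_filter_1d(raw)
```

A larger `meas_var` relative to `process_var` smooths more strongly but reacts more slowly when a new fruit arrives on the loadcell.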

Recognizing Dragon Fruit Using Self-Train Method
Feature recognition is a classification task that presents challenges to computer vision using machine learning algorithms. In this section, a self-training scheme for the SVM model is proposed to detect the color threshold of dragon fruit. In this study, we did not use the CNN to identify the fruit directly against the background, because the defect area of the dragon fruit largely determines its classification under the import standards of most countries. After surveying the color of different dragon fruits, it was found that some groups of defects have a color range close to the background color. Therefore, to avoid confusion in the system, we propose to remove the background before extracting the features of the dragon fruit. This process is applied to ensure the accuracy of the defect detection. The color threshold of the dragon fruit images is a parameter that affects the accuracy of the dragon fruit recognition. There is no general color threshold for all the cultivars of dragon fruit, so recognition still depends on testing each dragon fruit with different color threshold values to find the most suitable one. This study proposes a self-learning algorithm for the SVM model to segment dragon fruit in the images by gathering all the color values (of both the dragon fruit and the background). Subsequently, the SVM model is applied to classify the color values into the dragon fruit class and the background class. By comparing the extracted dragon fruit features (length, width, defect) with the size standard, dragon fruit recognition can be made much more accurate.

Self-Learning Algorithm in Dragon Fruit Recognition Problem
Self-learning is a wrapper method for semi-supervised learning, in which a model uses its own predictions in place of the missing target values during training. Self-learning aims to involve the unlabeled set $X_1$ to train a better classifier, primed by the labeled set $X_0$. The fruit's external features (length, width, defects and peel color) are extracted by a series of image processing algorithms.
As shown in Figure 2, the proposed system is divided into the following three stages. In stage 1, the classifying model is first trained with $X_0$ and $Y_0$ in a supervised manner, which allows the colors of the unlabeled images $X_1$ to be separated into background and dragon fruit. In stage 2, the color sets $Y_1$ obtained in stage 1 are used to detect the dragon fruit in three steps: acquiring the dragon fruit threshold, initializing the mask, and cropping the dragon fruit area. The cropped images are then used to extract external features. Finally, in stage 3, the dragon fruit images whose extraction error satisfies the set value are added to the initial dataset used to train the model. Unqualified samples are returned to the unlabeled dataset.
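The three stages can be read as a generic self-training loop. The sketch below is a minimal illustration under stated assumptions: `fit`, `predict` and `is_acceptable` are hypothetical hooks supplied by the caller (standing in for SVM training, color prediction and the feature-error check), and the toy one-dimensional threshold classifier replaces the actual SVM.

```python
def self_train(labeled, unlabeled, fit, predict, is_acceptable, max_rounds=5):
    """Generic self-training loop following the three stages above.

    labeled: list of (x, y) pairs; unlabeled: list of x.
    fit, predict, is_acceptable are caller-supplied (hypothetical) hooks.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    model = fit(labeled)                       # stage 1: supervised training
    for _ in range(max_rounds):
        if not pool:
            break
        accepted, rejected = [], []
        for x in pool:
            y = predict(model, x)              # stage 2: pseudo-label
            (accepted if is_acceptable(x, y) else rejected).append((x, y))
        if not accepted:
            break
        labeled.extend(accepted)               # stage 3: grow the training set
        pool = [x for x, _ in rejected]        # unqualified samples go back
        model = fit(labeled)
    return model, labeled, pool


# Toy stand-ins: a 1-D threshold classifier instead of the SVM.
def fit(data):
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == -1]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def predict(threshold, x):
    return 1 if x > threshold else -1

def is_acceptable(x, y):
    # Stand-in for the feature-error check: keep only clear-cut samples.
    return abs(x - 0.5) > 0.2

model, labeled, pool = self_train(
    [(0.9, 1), (0.1, -1)], [0.8, 0.2, 0.55, 0.95], fit, predict, is_acceptable
)
```

In this toy run the ambiguous sample 0.55 never passes the check and stays in the unlabeled pool, which mirrors the return of unqualified samples to the unlabeled dataset.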

Support Vector Machine Training Model
The support vector machine (SVM) [12] is a common machine learning algorithm whose main use in real-world problems is classification. SVM offers the advantage of high accuracy without needing a large amount of training data [13]. In this section, SVM is used to separate the dragon fruit color from the background color. The input is an RGB image taken in the image processing chamber, and the color values are manually labeled before being included in the training dataset. The SVM finds the hyperplane that divides the two classes by choosing the extreme points (support vectors) of each class. The distance between the points in a class and the plane is calculated by Equation (1). The SVM algorithm finds the boundary plane between the background color and the dragon fruit color such that the distance from the pixel $(x_R, x_G, x_B)$ to the dividing surface is the largest for each class.
Feature recognition is a technology used to recognize and monitor objects in images or videos using computer vision and machine learning. The model uses the original data: the images from the image processing chamber are labeled with two classes, with "1" being the dragon fruit and "−1" being the background color. After the SVM model is trained, each pixel of the image is classified as dragon fruit color or background color. The weights are improved by the self-training process; after this stage, the background is removed, which helps to improve the accuracy of subsequent identifications, since the background is no longer confused with the dragon fruit during image processing.
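To make the pixel classification concrete, the sketch below trains a linear SVM on normalized RGB values with batch sub-gradient descent on the hinge loss, and evaluates the standard signed point-to-hyperplane distance $(w \cdot x + b)/\lVert w \rVert$, which is presumably the quantity Equation (1) refers to. The training method, the hyperparameters and the sample pixel values are assumptions for illustration; the text does not state which solver or kernel was used.

```python
import math

def train_linear_svm(samples, labels, lam=0.01, lr=0.05, epochs=3000):
    """Batch sub-gradient descent on the L2-regularized hinge loss."""
    w, b, n = [0.0, 0.0, 0.0], 0.0, len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1.0:      # inside the margin or misclassified
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def signed_distance(w, b, x):
    """(w.x + b) / ||w||: positive on the fruit side, negative on background."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def normalize(rgb):
    return [c / 255.0 for c in rgb]

# Made-up labeled pixels: "1" = dragon fruit, "-1" = background.
fruit = [(200, 30, 90), (220, 40, 110), (180, 20, 80)]
background = [(40, 40, 45), (60, 60, 70), (30, 35, 40)]
X = [normalize(p) for p in fruit + background]
y = [1] * len(fruit) + [-1] * len(background)
w, b = train_linear_svm(X, y)

d_fruit = signed_distance(w, b, normalize((210, 35, 100)))
d_background = signed_distance(w, b, normalize((50, 50, 55)))
```

The sign of the distance gives the class of a pixel, and its magnitude indicates how far the color lies from the decision boundary.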

Data Augmentation
Due to the specificity of the climate and environment of each country, dataset sources for agricultural products are still very limited. In this study, images of dragon fruit were collected from dragon fruit farms and packaging facilities, and from an image processing chamber with appropriate lighting, to increase the stability and accuracy of the model. The database contains images taken from many different shooting directions. The data comprised 5000 images for training and 1500 images for testing. The test data were pictures taken at random in the image processing chamber to check the accuracy of the model in practice. Given the limited data sources on agricultural products, the creation of a dragon fruit dataset is also a small contribution of this study. With limited data, the model may overfit during training. Data augmentation methods such as image rotation, gamma correction [14] and scaling were therefore used in this study; these increased the amount of data available to the model and helped to improve training accuracy while avoiding overfitting.
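Two of the augmentations mentioned above can be sketched in a few lines. The example below works on a grayscale image stored as nested lists and assumes the convention $out = 255 \cdot (in/255)^{1/\gamma}$ for gamma correction (so $\gamma > 1$ brightens); the actual parameters and library used for augmentation are not given in the text.

```python
def gamma_correct(image, gamma):
    """Apply out = 255 * (in / 255) ** (1 / gamma) to 8-bit pixel values.

    The convention (gamma > 1 brightens) is an assumption for this sketch.
    """
    table = [round(255.0 * (v / 255.0) ** (1.0 / gamma)) for v in range(256)]
    return [[table[v] for v in row] for row in image]

def rotate90(image):
    """Rotate a row-major image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


image = [[0, 64, 255],
         [10, 20, 30]]
brighter = gamma_correct(image, 2.0)
rotated = rotate90(image)
```

Each augmented copy keeps the label of its source image, so the training set grows without additional manual labeling.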
Currently, two main types of Vietnamese dragon fruit are exported: red flesh and white flesh. Although there are other types, their appearance is similar. To be exported to foreign markets, dragon fruit should be classified according to the standards of each country; mainly, however, it is classified according to weight, volume, shape, defects and the number of broken scales (because this affects the quality of dragon fruit during storage and cleaning). To determine the characteristics of the dragon fruit, we use a combination of image processing and CNN. The actual weight of the dragon fruit is determined using the loadcell sensor with noise filtering, and the dragon fruit is put into an image processing chamber to obtain its image in real time. In the image processing chamber, an RGB camera collects images of the dragon fruit, from which the characteristics of the fruit are calculated, extracted and compared, and defects such as deep marks, cracks and white spots are found. The central processing unit determines the group of each dragon fruit, which is then routed to that group by the conveyor system.

Deep Convolutional Neural Network
Along with the development of computer vision, there are many studies on recognition and classification problems, such as heartbeats [15], traffic [16], tomato detection [17] and the identification of wood veneer surface defects [18]. The basic structure of a CNN contains convolution layers, nonlinear activation functions, pooling layers, fully-connected layers and dropout. In this study, we propose a CNN to recognize the scaly spikes and tails of dragon fruit; the network configuration is the standard choice for computer vision problems. The biggest difference between this model and others is how much data must be used to train the model to achieve high accuracy. The collected dragon fruit images are used to train the tail and scaly spike recognition model. The RGB input image has a size of 416 × 416 × 3 (L × W × C, where L is the length, W is the width and C is the number of channels) and is convolved with kernels of size $L_k \times W_k \times C$. The learned kernel weights aggregate the contributions of the previous layer; a bias is then added and the result is passed through a nonlinear activation function. In traditional neural networks, it is common to use the sigmoid function. The rectified linear unit (ReLU) is given by Equation (2):
$$f(x) = \max(0, x) \quad (2)$$
Furthermore, pooling layers are used to reduce the number of neurons, which reduces the computational time of the model and decreases overfitting. Assume that the pooling region is A and that the activation set in this region is S.
Max-pooling [19] is one of the most commonly used methods. It keeps the highest value in each region, as defined by Equation (3):
$$P_{\max} = \max_{s \in S} s \quad (3)$$
Average-pooling calculates the average value of each region, as defined by Equation (4):
$$P_{\mathrm{avg}} = \frac{1}{|N_A|} \sum_{s \in S} s \quad (4)$$
where $|N_A|$ is the number of elements in the region. Many different CNN configurations are used in fruit classification. In another study, a 13-layer deep convolutional neural network was proposed to recognize different fruits [20]; it contains only 6 convolutional layers, because 7 of its layers have no learnable weights. In this study, we propose a CNN network with 8 convolutional layers, 2 of which are fully connected, along with 4 max-pooling layers; the last layer is the softmax layer, which uses the softmax function (also called multinomial logistic regression) to extract features of the dragon fruit. The input layer receives RGB images, and only 3 × 3 kernels are used in order to reduce the number of parameters and increase the computational speed of the system. An Adam optimizer is used in this study, with the learning rate set to 0.001 and decreased by a factor of 10 every 100 epochs.
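The activation, pooling and learning-rate rules described above are simple enough to check by hand. The following sketch implements them on a small nested-list feature map; the 2 × 2 window with stride 2 is an assumption (the text does not give the pooling size), and the function names are chosen for this example only.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def pool2x2(feature_map, mode="max"):
    """Pool non-overlapping 2x2 regions (stride 2) with max or average."""
    out = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            region = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            if mode == "max":
                row.append(max(region))
            else:
                row.append(sum(region) / len(region))
        out.append(row)
    return out

def learning_rate(epoch, base=0.001, drop=10.0, every=100):
    """Step decay: start at `base`, divide by `drop` every `every` epochs."""
    return base / (drop ** (epoch // every))


fm = [[1, -2, 3, 0],
      [4, 5, -6, 7],
      [0, 0, 1, 1],
      [2, -2, 3, -1]]
activated = [[relu(v) for v in row] for row in fm]
max_pooled = pool2x2(activated, "max")
avg_pooled = pool2x2(activated, "avg")
```

Under the same 2 × 2, stride-2 assumption, four max-pooling layers would reduce a 416 × 416 input to 26 × 26, provided the convolutions preserve the spatial size.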

Image Processing
At this stage, the scaly spikes and the tail of the dragon fruit have been identified in the image, and the background has been removed by the SVM model. The purpose of this stage is to determine the size of the dragon fruit and then to predict its actual volume. In addition, this process is required to determine the defects used to classify the dragon fruit. Unlike other fruits, it is very difficult to determine the length, width, and height of dragon fruit accurately. To evaluate the size of the dragon fruit accurately, we remove the tail and the scaly spikes; the outer lines of the dragon fruit are then interpolated to restore the obscured points before the features are extracted.
Image processing to extract the features of dragon fruit includes five steps:
− Finding the contours of the dragon fruit;
− Subtracting the tail and the scaly spikes identified by the CNN from the dragon fruit;
− Interpolating to approximate the outer lines of the dragon fruit;
− Finding the defects of the fruit;
− Calculating to predict the classification group of the fruit.
Noise introduced by the camera increases the error rate in the extracted features. Therefore, image de-noising is an important image processing task. A good noise filter removes the noise while preserving the edges. Total variation conditioning [21] is therefore used in this paper to filter out noise: undesired details are removed while important details such as edges are retained. The indicator of total variation (ζ) is calculated in Equation (5), where u is the input image and m and n are the row and column indices, respectively. The contour is found using the contour-finding algorithm [22]. A contour can be simply defined as a curve joining all the consecutive points along the boundary of an object that have the same color or intensity. The contour of the object is defined as $v(s) = (x(s), y(s))$, where $x(s)$ and $y(s)$ are the x and y coordinates of the points on the contour line. The contours are found based on the local minima of the energy function ($E^{*}_{snake}$) presented in Equation (6), where $E_{int}$ is the internal energy, $E_{ext}$ is the external energy, α is the elasticity coefficient of the curve (α > 0), β is the rigidity coefficient of the curve (β > 0), ∇I is the gradient of the image intensity, and λ > 0.
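For orientation, the textbook forms of the two quantities referred to above are reproduced here. They are the standard discrete (isotropic) total variation and the classical active contour ("snake") energy, written with the symbols defined in the text; they are given as an assumption about what Equations (5) and (6) contain, not as a transcription of them.

```latex
% Assumed standard form of the total variation indicator (cf. Equation (5))
\zeta(u) = \sum_{m,n} \sqrt{\left(u_{m+1,n} - u_{m,n}\right)^{2}
                          + \left(u_{m,n+1} - u_{m,n}\right)^{2}}

% Assumed standard form of the snake energy (cf. Equation (6))
E^{*}_{snake} = \int_{0}^{1} \left[ E_{int}\big(v(s)\big) + E_{ext}\big(v(s)\big) \right] ds,
\qquad
E_{int} = \tfrac{1}{2}\left( \alpha \left| v'(s) \right|^{2} + \beta \left| v''(s) \right|^{2} \right),
\qquad
E_{ext} = -\lambda \left| \nabla I\big(v(s)\big) \right|^{2}
```

In this form, a larger α penalizes stretching of the contour, a larger β penalizes bending, and the external term pulls the contour towards strong intensity gradients, that is, towards the edge of the fruit.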

Feature Extraction
In Vietnam, dragon fruit is currently the most commonly planted fruit. Due to the nature of the plant, it is not grown in a greenhouse. As a result, plant diseases and insect bites (for example from bees and mosquitoes) inevitably create marks, bruises, cracks, etc. It is necessary to detect and remove unqualified dragon fruits from the classification line. In this study, we used image processing to detect defects in dragon fruit because processing speed is a priority in this case. The general steps in image processing for defect detection are thresholding, noise filtering and object edge detection. Because dragon fruit is an unusual fruit with many scaly spikes, the system can easily mistake the spikes for defects; therefore, before defects are detected, the image is converted to a binary image and the scaly spikes and the tail are removed. Then, as the dragon fruit is rotated, the system detects the defects and determines the size of the fruit. The image is cropped to reduce its size and to increase the processing speed of the system; each dragon fruit is cropped into different image frames. Based on the actual survey, cracks in the dragon fruit, which are the most dangerous defect, are well recognized by the system after the image is converted to binary and a threshold is applied. The system therefore immediately detects deep cracks, as well as other defects, and checks them against the limits allowed by the import standards of the countries to which the fruit is exported. To classify the defects, we calculate the total number of pixels of the entire dragon fruit and the number of pixels of the defective part, so that the limits can be customized according to the requirements of each importing country. After the objects are determined, the next task is to determine the ratio of the pixel size to the actual size, given in Equation (7), which depends mainly on the camera correction factor proposed by Park et al. [23].
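Let $S = \{(x_i, y_i) \mid 0 \le i < m\}$ be the set of coordinates of the pixel points of the dragon fruit, where m is the number of extreme points in the convex hull surrounding the dragon fruit and $x_i$, $y_i$ are the x and y coordinates of the i-th point. The length and width of the dragon fruit are estimated based on the four extreme points (left, right, top and bottom) calculated in Equations (8)-(11):
$$P_L = (x_L, y_L), \quad x_L = \min_{0 \le i < m} \{x_i \mid (x_i, y_i) \in S\} \quad (8)$$
$$P_R = (x_R, y_R), \quad x_R = \max_{0 \le i < m} \{x_i \mid (x_i, y_i) \in S\} \quad (9)$$
$$P_T = (x_T, y_T), \quad y_T = \min_{0 \le i < m} \{y_i \mid (x_i, y_i) \in S\} \quad (10)$$
$$P_B = (x_B, y_B), \quad y_B = \max_{0 \le i < m} \{y_i \mid (x_i, y_i) \in S\} \quad (11)$$
where $P_L$ is the extreme point on the left, $P_R$ is the extreme point on the right, $P_T$ is the upper extreme point and $P_B$ is the bottom extreme point.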
The area of defects ($A_{defect}$) and the total area of the dragon fruit ($A_{total}$) are calculated with the support of the OpenCV library. The dragon fruit is sorted according to the ratio of the defective area to the total area of the whole dragon fruit. The defect rate of the fruit is calculated as in Equation (14).
The length and width of the dragon fruit are extracted to classify its shape. The shape of the dragon fruit should be balanced: it must not be too long or too short. In addition, the length, width and height are used to approximate the volume of the dragon fruit, from which the mellowness of the fruit is calculated and predicted; this will be analyzed more closely in a subsequent article. To measure the weight of the dragon fruit, we use a loadcell sensor arranged on the classification conveyor.
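The extreme-point and defect-rate calculations can be sketched as follows. The sketch assumes image coordinates with y growing downward, a fruit whose long axis is vertical in the image, and a single hypothetical scale factor `mm_per_px` standing in for the pixel-to-size ratio of Equation (7); the defect rate is taken as the plain ratio of the two areas.

```python
def extreme_points(points):
    """Return the left, right, top and bottom extreme points of a contour."""
    left = min(points, key=lambda p: p[0])
    right = max(points, key=lambda p: p[0])
    top = min(points, key=lambda p: p[1])      # image y grows downward
    bottom = max(points, key=lambda p: p[1])
    return left, right, top, bottom

def size_and_defect_rate(points, defect_area_px, total_area_px, mm_per_px):
    """Estimate length, width (in mm) and the defect rate of one fruit.

    mm_per_px is a hypothetical calibration factor for this sketch.
    """
    left, right, top, bottom = extreme_points(points)
    width_mm = (right[0] - left[0]) * mm_per_px
    length_mm = (bottom[1] - top[1]) * mm_per_px
    defect_rate = defect_area_px / total_area_px
    return length_mm, width_mm, defect_rate


# Made-up convex-hull points (x, y) in pixels and made-up areas.
hull = [(10, 50), (60, 5), (110, 55), (58, 140)]
length_mm, width_mm, rate = size_and_defect_rate(hull, 300, 12000, 0.5)
```

With OpenCV, the two areas would typically come from pixel counts of the binary fruit mask and the binary defect mask.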

Update Data
During this phase, a set of samples $X_1$ and corresponding dummy labels $Y_1$ is selected to update the data, based on the error evaluation of the extracted features. It is difficult to learn a model and to optimize approximate labels on data without annotations. Therefore, we used self-learning to generate estimated labels, also known as "pseudo-labels", from the predictions with the highest confidence, trusting that they are mostly accurate and close to the actual values. The remaining, less reliable pseudo-labeled samples were kept for future predictions. The condition of Equation (15) is examined to see whether $X_1$ and $Y_1$ can be used for updating the data, where $T_L$ is the threshold error of the length feature, $T_H$ is the threshold error of the width feature, $T_{De}$ is the threshold error of the defect feature, $x_i$ is the i-th unlabeled sample, and $y_i$ is the predicted label of $x_i$. The T values are selected through experiments, with the aim that the SVM model segments the dragon fruit accurately and thereby contributes to the accuracy of the whole classification system.
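One plausible reading of the update condition is that a pseudo-labeled sample is accepted only if all three feature errors are within their thresholds. The sketch below encodes that reading; the conjunction of the three checks, the dictionary layout and the numeric thresholds are assumptions made for illustration and are not taken from Equation (15).

```python
def accept_for_update(errors, thresholds):
    """True if every feature error is within its threshold (assumed rule)."""
    return all(abs(errors[name]) <= thresholds[name] for name in thresholds)

def split_pseudo_labeled(samples, thresholds):
    """Split samples into those added to the training set and those kept back."""
    accepted = [s for s in samples if accept_for_update(s["errors"], thresholds)]
    rejected = [s for s in samples if not accept_for_update(s["errors"], thresholds)]
    return accepted, rejected


# Hypothetical thresholds standing in for T_L, T_H and T_De.
T = {"length": 5.0, "width": 5.0, "defect": 0.02}
samples = [
    {"id": "a", "errors": {"length": 1.2, "width": -3.0, "defect": 0.01}},
    {"id": "b", "errors": {"length": 7.5, "width": 0.4, "defect": 0.00}},
    {"id": "c", "errors": {"length": 0.3, "width": 0.2, "defect": 0.05}},
]
accepted, rejected = split_pseudo_labeled(samples, T)
```

Tighter thresholds admit fewer but cleaner pseudo-labels, which trades the growth of the training set against the risk of reinforcing segmentation errors.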

Classification Based on Extracted Features
Dragon fruit is classified by calculating penalty points according to the standards of weight, external features, and defects. The weight is prioritized; the fruit is then classified according to the remaining standards by calculating penalty points. Depending on the import standard of each country to which the fruit is exported, dragon fruit have different classification schemes, but in general, the standards of weight and defects are considered. In this study, we only consider the standards in markets with a high import rate of Vietnamese dragon fruits, such as the market in China, which accounts for 80% of the total dragon fruit exports from Vietnam (The Asia Foundation 2019) and adopts the standards listed in Table 1. The classification process first evaluates the weight of the dragon fruit and then calculates the defects in the fruit according to the standards of each area. The group classification of the dragon fruit is calculated according to Equation (16), where G is the group into which the dragon fruit is classified; $K_W$ is the classification coefficient according to the weight of the fruit, $K_W \in \{1, 2, 3\}$, with 1 corresponding to $G_1$; $p_d$ is the penalty point of the defects, $p_d \in \{0, 1, 2\}$, with 0 being the penalty point corresponding to Group 1 ($G_1$) and the value increasing with the group number; and p is the penalty point of the other factors, $p \in \{0, 1, 2\}$, which is calculated from the number of broken scaly spikes, with 0 as the penalty point of Group 1.
Table 1. The standards for dragon fruit.
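To illustrate how a weight class and penalty points could be combined into a group, the sketch below uses a purely assumed rule: the weight class is the starting group, each penalty point demotes the fruit by one group, and the result is capped at the last group. This rule is a stand-in chosen for the example and should not be read as Equation (16).

```python
def grade(weight_class, defect_penalty, other_penalty, n_groups=3):
    """Combine K_W, p_d and p into a group index (ASSUMED rule).

    weight_class: K_W in {1, 2, 3}; defect_penalty: p_d in {0, 1, 2};
    other_penalty: p in {0, 1, 2}. Penalties only demote, capped at n_groups.
    """
    if weight_class not in (1, 2, 3):
        raise ValueError("weight_class must be 1, 2 or 3")
    if defect_penalty not in (0, 1, 2) or other_penalty not in (0, 1, 2):
        raise ValueError("penalty points must be 0, 1 or 2")
    return min(n_groups, weight_class + defect_penalty + other_penalty)


best = grade(1, 0, 0)        # heavy, defect-free fruit
demoted = grade(1, 1, 0)     # same weight class, one defect penalty point
worst = grade(2, 2, 1)       # penalties beyond the last group are capped
```

Whatever the exact combination rule, keeping it in one small function makes it straightforward to swap in the thresholds of a different importing country.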

Results and Experiments
In this study, images were collected from 1287 dragon fruits at the planting site and export packing facilities from January to October.The data were large enough to train the model; this training is also one of the contributions of this study.The data were enough for the system to identify the components of the dragon fruit, and to combine with the image processing method to extract features of the fruit.This was followed by the weight measurement, performed by a loadcell sensor.The processing center decided classified the dragon fruit by calculating the score according to the standard of each country.The classification system was built based on the requirements of the dragon-fruit-exporting companies.The system was tested with an accuracy of up to 96.38% for the three groups of dragon fruit, G 1 , G 2 and G 3 , with group G 1 being the most accurate.Because the size of the defect area greatly determines the group of the dragon fruit, we propose an additional SVM model to remove the background to avoid confusion in defects.To improve the accuracy of the SVM model, we propose the self-training method.The pixel values are hand-selected and labeled by agricultural experts so that the samples are labeled correctly for the first training.In sample collection, incomplete data can cause inaccurate prediction results.However, excessive sample data resulted in a lot of manual work, so self-learning was applied to the SVM model with condition-based adjustment (15).The color values demonstrated fewer errors than the error threshold of the length and width; defects were used to update the data, which increased the accuracy of the model.
The classification of dragon fruit is mainly based on the external features of the fruit, such as length, width, shape and external defects. The processing of the automatic sorting system is shown in Figure 3. A worker places the dragon fruit at the input of the image processing chamber; the images are captured by a camera mounted in the chamber under lighting arranged to minimize the shadow of the dragon fruit, and noise is subsequently reduced before image processing. A raw image obtained from the camera is shown in Figure 4. The image is transmitted to the central processor, which performs the calculations and classifies the dragon fruit. The weight of the fruit is obtained from the loadcell sensor after Kalman noise filtering. The two signals, shape and weight, are combined to classify the dragon fruit according to the standards of each importing country. The system is shown in Figure 5. After the CNN identifies the scaly spikes and tail of the dragon fruit, the system removes them; the outer contour of the fruit is then approximated by interpolation, the length and width are extracted, and the fruit is sorted by shape according to its proportions. The locations of the defects are then calculated. The processing is presented in Figure 6.
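The Kalman noise filtering of the loadcell signal mentioned above can be illustrated with a scalar filter. The text does not give the filter model or its parameters, so this sketch assumes a constant-weight model; the function name `kalman_filter_1d`, the noise variances and the initial error variance are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def kalman_filter_1d(
    readings: Sequence[float],
    process_var: float = 1e-4,      # ASSUMPTION: the true weight barely changes
    measurement_var: float = 4.0,   # ASSUMPTION: loadcell noise variance (g^2)
) -> List[float]:
    """Smooth noisy weight readings with a scalar Kalman filter."""
    if not readings:
        raise ValueError("readings must not be empty")
    estimate = float(readings[0])   # start from the first measurement
    error_var = 1.0                 # ASSUMPTION: initial estimate uncertainty
    filtered = []
    for z in readings:
        # Predict: the weight is modelled as constant, so only uncertainty grows.
        error_var += process_var
        # Update: blend the prediction and the new measurement.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical readings in grams from a fruit resting on the loadcell.
smoothed = kalman_filter_1d([498.0, 503.0, 499.5, 501.0, 500.5])
```

A larger `measurement_var` makes the filter trust each new reading less and therefore smooths more strongly, at the cost of a slower response when a new fruit is placed on the sensor.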
Figure 6a presents the result after removing the scaly spikes and tail; Figure 6b displays the extraction of the length and width properties of the dragon fruit; Figure 6c presents the region cavity description and defect calculation. In the machine learning stage, the image is convolved with kernels of size 3 × 3 to obtain a highly accurate model. The filters in the convolution layer help to remove the noise in the image. A CNN structure is proposed to identify the scaly spikes and the tail of the dragon fruit; the identification results are shown in Figure 7. The accuracy of the proposed CNN is 97.38%. All the model code is written in the Python programming language, with the support of available libraries. The results satisfied the requirements of the packaging facilities after experimental testing with 3760 dragon fruits, comprising 1262 G1, 1321 G2 and 1177 G3 fruits. The experimental results are presented in Table 2, which reports the accuracy of the whole classification system; the tests were carried out under the supervision of experts in the classification of dragon fruits according to the current standards for market export. The accuracy of the automatic dragon fruit sorting system is more than 96%, which meets the requirements of the enterprise and is higher than that of the workers' qualitative classification method.

The accuracy of the SVM model was evaluated with a dataset of 3000 color values, comprising 1500 values of background color and 1500 values of fruit color. The self-learning solution was designed to sample the pixel values in the images of the dragon fruit; color values whose errors were below the error threshold for length, width and defects are used to update the data, which increases the accuracy of the model. The accuracy of the model is shown in Table 3. The results of the comparison with other state-of-the-art methods are presented in Table 4, which shows the accuracy of the proposed CNN to be 97.38%. The combined use of SVM to segment the dragon fruit and CNN to extract its features was compared with other methods. The results of the sorting of the three groups of dragon fruit, G1, G2 and G3, are presented in Table 5, which shows that the accuracy of this approach was 96.38%. The dataset for this study, including all the images, was uploaded to zenodo.org (29 October 2021) [24].
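To make the 3 × 3 convolution mentioned in this section concrete, the following sketch applies a single 3 × 3 kernel to a small grayscale image in plain Python. It is an illustration only: in a CNN the kernel weights are learned during training, whereas the averaging kernel used here is a fixed, hand-picked example, and the function name `conv2d_3x3` is hypothetical. As in most deep learning libraries, the kernel is applied without flipping and without padding.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def conv2d_3x3(image: Matrix, kernel: Matrix) -> List[List[float]]:
    """Slide a 3 x 3 kernel over the image (no padding, stride 1)."""
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3 x 3")
    output = []
    for i in range(height - 2):
        row = []
        for j in range(width - 2):
            acc = 0.0
            # Weighted sum over the 3 x 3 neighbourhood anchored at (i, j).
            for di in range(3):
                for dj in range(3):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# A flat image with one noisy pixel in the middle.
noisy = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 90, 10],
    [10, 10, 10, 10],
]
# Illustrative averaging kernel; a trained CNN would learn its own weights.
box = [[1 / 9] * 3 for _ in range(3)]
smoothed_image = conv2d_3x3(noisy, box)
```

With the averaging kernel, the isolated bright pixel is spread over its neighbourhood and its peak is strongly reduced, which is the sense in which convolution filters can suppress noise.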

Conclusions
The automatic sorting of dragon fruit with the support of machine learning and image processing was studied, based on the external features and actual weight of the fruit. The study offers the following main conclusions:
− A self-training method for the SVM model is proposed to increase the accuracy of the image segmentation. The test accuracy before and after applying self-training was 85.9% and 92.9%, respectively.
− An automatic dragon fruit classification system with an accuracy of more than 96% is proposed to solve the problem of evaluating and sorting dragon fruit for workshops, reducing both the labor required of workers and the cost of the process.
− A CNN is used to identify the characteristics of dragon fruit, namely its scaly spikes and tail, with an accuracy of 97.38%.
Furthermore, the research was carried out in consultation with enterprises and in accordance with their actual requirements, so the system can be put into application as soon as the research is completed.

Figure 1. The operating principle of the system.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 4 of 13
cropping the dragon fruit area. Next, the cropped images are used to extract external features. Finally, in stage 3, the dragon fruit images whose extraction error satisfies the set value are added to the initial dataset used to train the model. Unqualified samples are returned to the unlabeled dataset.

Figure 2. The framework of the self-learning algorithm in the dragon fruit recognition problem.

Figure 3. The process of extracting dimensions and defects.

Figure 4. The raw images from the image processing chamber.

Figure 5. The actual system.

Figure 6. Extracting the features of the dragon fruit: (a) removing the scaly spikes and tail; (b) extracting the features; (c) detecting defects of the fruit.

Figure 7. Identifying the scaly spikes and the tail of the dragon fruit using a convolutional neural network.
Table 3 reports the accuracy of (a) the SVM model without self-training and (b) the model after applying self-training. The results from the experimental test show that the self-training model is effective (the accuracy increases by 7 percentage points) because the color of dragon fruit differs across seasons and sites of cultivation.

Table 2. The accuracy of the system.

Table 3. The accuracy of the SVM model.

Table 4. Comparison with deep learning methods.

Table 5. Comparison with other approaches.