A Method of Apple Image Segmentation Based on Color-Texture Fusion Feature and Machine Learning

Abstract: Apples are one of the most important fruits in the world, and China is the largest apple-producing country. Yield estimation, robotic harvesting, and precise spraying are important processes in precision apple planting, and image segmentation is an important step in the machine vision systems that support them. In this paper, an apple fruit segmentation algorithm for use in the orchard was studied. The ability of many color features to separate apple fruit pixels from other pixels was evaluated, and three color features that effectively distinguish apple fruit pixels from other pixels were selected. The GLCM (grey-level co-occurrence matrix) was used to extract texture features, and the best distance and orientation parameters for the GLCM were found. Nine machine learning algorithms were used to develop pixel classifiers. Each classifier was trained with 100 pixels and tested with 100 pixels. The accuracy of the classifier based on Random Forest reached 0.94. One hundred images of an apple orchard were manually labeled with apple fruit pixels and other pixels, and the classifier was used to segment the same images. Regression analysis was performed on the results of manual labeling and classifier segmentation. The average values of Af (segmentation error), FPR (false positive rate) and FNR (false negative rate) were 0.07, 0.13 and 0.15, respectively. These results show that the algorithm can segment apple fruit in orchard images effectively and can provide a reference for precision apple planting management.


Introduction
Apples are one of the most important fruits in the world [1], and China is the world's largest apple producer. Planting and harvesting apples takes a lot of labor. With the continuous expansion of planting area and yield, precision planting and mechanized harvesting of apples are urgently needed [2,3]. Yield estimation, robotic harvesting, and precise spraying are important processes in precision planting [4,5,6,7], and image segmentation is an important step in each of them [8].
In recent years, researchers have developed many methods to segment apple fruit in images. Color cameras, spectral cameras, and thermal cameras have been used to obtain images of apple trees in orchards and to segment apple fruit from the rest of the scene [9,10]. Spectral cameras and thermal cameras capture spectral and heat information [11], with which apple fruit can be recognized easily in the images [12]. However, these cameras produce very large amounts of data, so processing the images takes a long time. The color camera is the most common sensor; it provides color, geometric, and texture information and has been widely used in fruit image segmentation [13]. Some researchers use threshold segmentation to segment fruit that are significantly
In this research, we developed an apple image segmentation algorithm for a robot working in the field. We combined color and texture features with machine learning to build a classifier, and apple fruit were segmented from the rest of the image with this classifier. Such an apple fruit segmentation operation calls for the integration of several image processing approaches. Therefore, the objectives of this research were:
1. To assess and optimize the suitability of color features and texture features for apple fruit image segmentation.
2. To develop an apple fruit pixel classifier based on machine learning to segment images.

Apple Orchard Image Capture
Images of an apple orchard were taken from 20:00 to 22:00 on 23 September 2018 at the Beijing International Urban Agricultural Science and Technology Park, Beijing, China (116°47′57″ E, 39°52′7″ N). The camera was a NIKON D300S, and the original resolution was 4032 × 3016. This study aimed to design an algorithm for a mobile work platform, so the computational resources consumed by the algorithm had to be limited. The image resolution was therefore reduced to 400 × 300. The goal was to segment apple fruit from leaves and sky. In total, 105 images were collected and randomly divided into two groups: one group of 5 images used for algorithm development and another group of 100 images used for algorithm assessment. Some of the original images are shown in Figure 1. On each of the five development images, 500 pixels were randomly selected and manually labeled as apple fruit pixels or other pixels. Figure 2 shows some of the labeled pixels. The random samples contained fewer apple fruit pixels than other pixels, so it was necessary to downsample. One hundred apple fruit pixels and 100 other pixels were randomly selected from the labeled pixels. Of these 200 pixels, 100 were randomly selected as the training set and the remaining 100 as the test set.
[Figure 2 legend: apple fruit pixels; non-apple fruit pixels]
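The downsampling and splitting of the labeled pixels can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `balanced_split`, the label encoding (1 for apple fruit, 0 for other), the random seeds, and the synthetic labels are all assumptions.

```python
import numpy as np

def balanced_split(labels, n_per_class=100, n_train=100, seed=0):
    """Downsample to n_per_class pixels per class, then split into train/test.

    labels: 1-D array with 1 for apple fruit pixels and 0 for other pixels
            (hypothetical encoding). Returns two arrays of indices into labels.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    chosen = []
    for cls in (0, 1):
        idx = np.flatnonzero(labels == cls)
        # Draw without replacement so no labeled pixel is used twice.
        chosen.append(rng.choice(idx, size=n_per_class, replace=False))
    chosen = rng.permutation(np.concatenate(chosen))
    return chosen[:n_train], chosen[n_train:]

# Synthetic stand-in for 5 images x 500 labeled pixels, with apples in the minority.
rng = np.random.default_rng(1)
labels = (rng.random(2500) < 0.2).astype(int)
train_idx, test_idx = balanced_split(labels)
```

Because the 200 balanced pixels are shuffled before the split, the training and test sets are each close to, but not necessarily exactly, half apple fruit pixels.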

General Steps of the Apple Fruit Segmentation Algorithm
The segmentation algorithm was composed of two general steps: (1) color feature selection and texture observer optimization; (2) development of a pixel classifier based on machine learning, followed by segmentation through pixel classification with this classifier. Figure 3 shows a summarized flowchart of the image segmentation strategy of the proposed algorithm. OpenCV-Python (3.4.2), NumPy (1.13.3), scikit-learn (0.20.0), and scikit-image (0.13.0) were used to analyze the data.
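The second step, segmentation by classifying every pixel, can be sketched as below. This is an illustrative outline under stated assumptions: the helper `segment_by_pixel_classification`, the single synthetic feature channel, and the Random Forest settings are hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_by_pixel_classification(features, classifier):
    """Classify every pixel and return a boolean fruit mask.

    features: array of shape (H, W, F) holding F feature values per pixel.
    classifier: any fitted scikit-learn classifier that predicts 1 for fruit.
    """
    h, w, f = features.shape
    flat = features.reshape(h * w, f)      # one row per pixel
    pred = classifier.predict(flat)        # one label per pixel
    return pred.reshape(h, w) == 1

# Synthetic demo: one feature channel that is low for "fruit" and high elsewhere.
rng = np.random.default_rng(0)
X_train = np.concatenate([rng.normal(0.2, 0.05, (100, 1)),
                          rng.normal(0.8, 0.05, (100, 1))])
y_train = np.array([1] * 100 + [0] * 100)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_train, y_train)

image_features = np.full((300, 400, 1), 0.8)
image_features[100:200, 150:250, 0] = 0.2   # a square "fruit" region
mask = segment_by_pixel_classification(image_features, clf)
```

Flattening the image to one row per pixel lets the classifier label a 400 × 300 image in a single `predict` call, which keeps the per-image cost low on a platform with limited computing resources.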

Apple Fruit Color Features Extraction
In this paper, systematic experimentation was used to identify the color features that best fit this apple fruit segmentation algorithm. Several frequently used color spaces were employed: RGB, HSV, XYZ, LAB, HED, YUV, and YIQ. The gray value of each raw channel of each color space was extracted as a color feature, giving 21 color features in total: R, G, B, H, S, V, X, Y, Z, L, A, B.1 (the B channel of the LAB color space), H.1 (the H channel of the HED color space), E, D, Y.1 (the Y channel of the YUV color space), U, V.1 (the V channel of the YUV color space), Y.2 (the Y channel of the YIQ color space), I, and Q. After that, correlation analysis and chi-square tests were used to select effective features and remove redundant ones. The Pearson correlation coefficient (PCC) of each pair of color features was calculated, and a chi-square test was used to analyze the association between each feature and the classification target. When the PCC between two features was higher than or equal to 0.8, the feature with the higher chi-square p-value was deleted. The remaining features were used for classifier development.
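The redundancy rule described above (drop the feature with the higher p-value from every highly correlated pair) can be sketched as follows. This is an assumed reading, not the authors' implementation: the function `drop_redundant_features` is hypothetical, the absolute value of the PCC is used, and the p-values are taken as given (they could be obtained, for example, from scikit-learn's `chi2` function on non-negative features).

```python
import numpy as np

def drop_redundant_features(X, p_values, names, threshold=0.8):
    """Keep one feature from every highly correlated pair.

    X: array of shape (n_samples, n_features).
    p_values: chi-square p-value of each feature against the class label.
    For each pair with |PCC| >= threshold, the feature with the higher
    p-value is dropped. Returns the names of the surviving features.
    """
    pcc = np.corrcoef(X, rowvar=False)
    keep = np.ones(len(names), dtype=bool)
    # Visit features from most to least significant so the better one survives.
    for i in np.argsort(p_values):
        if not keep[i]:
            continue
        for j in range(len(names)):
            if j != i and keep[j] and abs(pcc[i, j]) >= threshold:
                keep[j] = False
    return [n for n, k in zip(names, keep) if k]

# Toy data: "G" is nearly a copy of "R"; "S" is independent of both.
rng = np.random.default_rng(0)
r = rng.random(200)
X = np.column_stack([r, r + rng.normal(0, 0.01, 200), rng.random(200)])
selected = drop_redundant_features(X, p_values=[0.001, 0.02, 0.03],
                                   names=["R", "G", "S"])
```

In the toy data, "G" is dropped because it is almost perfectly correlated with "R" and has the higher p-value, while "S" survives because it is uncorrelated with both.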

Apple Fruit Texture Features Extraction
The grey-level co-occurrence matrix (GLCM) is one of the most popular statistical approaches used in texture discrimination [38]. A unique co-occurrence matrix exists for each spatial relationship, so the calculated textures depend on the distance (D) and the orientation (O) [39]. Therefore, the D and O parameters of the texture observer were optimized by grid search. The search range of D was [0, 19] and that of O was [0°, 360°] (with a resolution of 10°). The GLCM itself is quite complex, so characteristic values computed from it are usually used as texture features. In this paper, the characteristic values used were contrast, dissimilarity, homogeneity, ASM, energy, and correlation. The calculation formulas are shown in Equations (1)-(6). After texture feature extraction, correlation coefficient analysis and the chi-square test were again used to select effective features.
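The six characteristic values are commonly defined as below. These are the standard GLCM definitions, written with a single mean and a single variance under the assumption of a symmetric, normalized matrix, and with σ denoting the variance as in the symbol definitions of the text. It is assumed, not confirmed, that they match Equations (1)-(6) term by term.

```latex
\begin{align}
\text{Contrast} &= \sum_{i,j=0}^{N-1} P_{i,j}\,(i-j)^2 \tag{1}\\
\text{Dissimilarity} &= \sum_{i,j=0}^{N-1} P_{i,j}\,|i-j| \tag{2}\\
\text{Homogeneity} &= \sum_{i,j=0}^{N-1} \frac{P_{i,j}}{1+(i-j)^2} \tag{3}\\
\text{ASM} &= \sum_{i,j=0}^{N-1} P_{i,j}^{2} \tag{4}\\
\text{Energy} &= \sqrt{\text{ASM}} \tag{5}\\
\text{Correlation} &= \sum_{i,j=0}^{N-1} P_{i,j}\,\frac{(i-\mu)(j-\mu)}{\sigma} \tag{6}
\end{align}
```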
where i is the row number; j is the column number; P_{i,j} is the normalized value in cell (i, j); N is the number of rows or columns; µ is the mean value; and σ is the variance.
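To make the role of the distance and orientation parameters concrete, the following NumPy sketch builds a GLCM for one offset and computes the six characteristic values. It is an illustration under assumptions, not the code used in the paper (which relied on scikit-image): the function names, the counter-clockwise angle convention, and the choice of a symmetric matrix are all hypothetical.

```python
import numpy as np

def glcm(gray, distance, angle_deg, levels):
    """Normalized, symmetric grey-level co-occurrence matrix.

    gray: 2-D integer array with values in [0, levels).
    distance, angle_deg: pixel offset defining the spatial relationship.
    """
    theta = np.deg2rad(angle_deg)
    dr = int(round(-distance * np.sin(theta)))   # row offset (image rows grow downward)
    dc = int(round(distance * np.cos(theta)))    # column offset
    rows, cols = gray.shape
    # Index ranges for which both the pixel and its offset neighbour are inside the image.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    a = gray[r0:r1, c0:c1].ravel()
    b = gray[r0 + dr:r1 + dr, c0 + dc:c1 + dc].ravel()
    P = np.zeros((levels, levels), dtype=float)
    np.add.at(P, (a, b), 1)      # count each (reference, neighbour) pair
    P = P + P.T                  # make the matrix symmetric
    return P / P.sum()

def glcm_features(P):
    """Contrast, dissimilarity, homogeneity, ASM, energy and correlation of P."""
    n = P.shape[0]
    i, j = np.indices((n, n))
    mu = np.sum(i * P)                    # equals sum(j * P) because P is symmetric
    var = np.sum(P * (i - mu) ** 2)
    asm = np.sum(P ** 2)
    return {
        "contrast": np.sum(P * (i - j) ** 2),
        "dissimilarity": np.sum(P * np.abs(i - j)),
        "homogeneity": np.sum(P / (1.0 + (i - j) ** 2)),
        "ASM": asm,
        "energy": np.sqrt(asm),
        # A constant patch has zero variance; report perfect correlation in that case.
        "correlation": np.sum(P * (i - mu) * (j - mu)) / var if var > 0 else 1.0,
    }

# Tiny worked example: a 4 x 4 patch with 4 grey levels, distance 1, angle 0 degrees.
patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
features = glcm_features(glcm(patch, distance=1, angle_deg=0, levels=4))
```

A grid search over the texture observer would call `glcm` for every distance in `range(20)` and every angle in `range(0, 360, 10)` and compare the resulting feature values across the labeled pixels.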

Data Normalization and Dimension Reduction
After extraction, the data should be scaled to a reasonable range and converted to dimensionless values, because the features extracted in this paper had different units and were not on a common scale. A normalization formula was used to ensure that each dimension of the data ranged from zero to one [40]. Data normalization was calculated by Equation (7).
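Given the symbol definitions in the text and the stated output range of zero to one, Equation (7) is presumably the standard min-max normalization:

```latex
Y = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{7}
```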
where Y is the normalized value, X is the original value, X_min is the minimum value, and X_max is the maximum value. The selected features contained both useful and irrelevant information for apple fruit pixel identification, and dimensionality reduction was also necessary to reduce the consumption of computing resources [39]. Principal component analysis (PCA) was used to decompose the multivariate dataset into a set of successive orthogonal components that explain a maximum amount of the variance. In scikit-learn, PCA is implemented as a transformer object that learns the components in its fit method and can then be used to project new data onto these components [41]. To retain as much information as possible while reducing the dimension, the number of components was set to two [42].
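The normalization and dimension reduction steps can be sketched with scikit-learn as follows. The data here are synthetic and the four feature scales are assumptions chosen only to show why rescaling matters; `MinMaxScaler` and `PCA` are used as generic stand-ins for the steps described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for 100 training and 100 test pixels with 4 selected features
# (for example three color channels and one texture value) on very different scales.
scales = np.array([255.0, 1.0, 100.0, 0.01])
X_train = rng.random((100, 4)) * scales
X_test = rng.random((100, 4)) * scales

scaler = MinMaxScaler()                 # rescales each feature to [0, 1]
pca = PCA(n_components=2)               # keep two principal components

# Fit on the training pixels only, then reuse the fitted objects on new data.
Z_train = pca.fit_transform(scaler.fit_transform(X_train))
Z_test = pca.transform(scaler.transform(X_test))

explained = pca.explained_variance_ratio_.sum()
```

Fitting the scaler and the PCA on the training pixels only, and then applying them unchanged to the test pixels, avoids leaking test information into the learned components.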

Classifier Development and Pixels Classification
Classification is one of the main components of a segmentation algorithm, so the classifier should be selected carefully [43]. In this research, nine machine learning algorithms were used to develop pixel classifiers. They were Nearest Neighbors, Linear Support Vector Machine (Linear SVM), Radial Basis Function Support Vector Machine (RBF SVM), Gaussian Process, Decision Tree, Random Forest, Neural Net, AdaBoost, Naive Bayes, and Quadratic Discriminant Analysis (QDA) [41].
One hundred samples were used to train these classifiers, and one hundred samples were used to test them. The accuracy on the training set and on the testing set was calculated separately to assess each classifier and to detect over-fitting. The accuracy calculation is shown in Equation (8). The true positive rate (TPR) was also used to evaluate classifier performance; its calculation is shown in Equation (9).
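In terms of the confusion-matrix counts defined in the text, accuracy and TPR have the following standard forms, which are assumed here to correspond to Equations (8) and (9):

```latex
\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \tag{8}\\
\text{TPR} &= \frac{TP}{TP + FN} \tag{9}
\end{align}
```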
where true positive (TP) is the number of pixels correctly detected as apple fruit pixels; true negative (TN) is the number of pixels correctly detected as other pixels; false positive (FP) is the number of other pixels detected as apple fruit pixels; and false negative (FN) is the number of apple fruit pixels detected as other pixels.
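The comparison of training and testing accuracy can be sketched as below for three of the algorithms. Everything specific in this example is an assumption: the synthetic two-dimensional samples, the helper names `accuracy_and_tpr` and `make_pixels`, and the hyperparameters, which the paper does not report.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def accuracy_and_tpr(y_true, y_pred):
    """Accuracy and true positive rate, with label 1 meaning apple fruit."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return (tp + tn) / (tp + tn + fp + fn), tp / (tp + fn)

def make_pixels(rng, n):
    """Synthetic 2-D samples: a tight 'fruit' cluster and a broad 'other' cloud."""
    fruit = rng.normal([0.3, 0.3], 0.08, (n // 2, 2))
    other = rng.normal([0.6, 0.6], 0.20, (n // 2, 2))
    return np.vstack([fruit, other]), np.array([1] * (n // 2) + [0] * (n // 2))

rng = np.random.default_rng(0)
X_train, y_train = make_pixels(rng, 100)
X_test, y_test = make_pixels(rng, 100)

classifiers = {
    "Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(max_depth=5, n_estimators=10, random_state=0),
}
results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    train_acc, _ = accuracy_and_tpr(y_train, clf.predict(X_train))
    test_acc, test_tpr = accuracy_and_tpr(y_test, clf.predict(X_test))
    # A large gap between train_acc and test_acc is a sign of over-fitting.
    results[name] = {"train_acc": train_acc, "test_acc": test_acc, "test_tpr": test_tpr}
```

Reporting the training and testing accuracy side by side, as in `results`, is what makes over-fitting visible: a classifier that memorizes 100 training pixels scores high on them but noticeably lower on the 100 held-out pixels.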

Apple Fruit Segmentation Result Test
To test the developed algorithm, the pixels in 100 images were manually labeled as either apple fruit pixels or other pixels. The manually labeled pixels were taken as the ground truth, and the output of the algorithm was taken as the prediction. The segmentation performance of the algorithm was evaluated by comparing the total number of manually labeled apple fruit pixels with the total number of apple fruit pixels predicted by the algorithm. Finally, the experimental results were evaluated by the segmentation error Af, the false positive rate FPR, and the false negative rate FNR. These three criteria were calculated by Equations (10)-(12).
where A1 represents the real area of the fruit target; A2 is the fruit area acquired after segmentation; Ā1 is the complementary set of A1; FPR is the percentage of background pixels that are mistakenly segmented as fruit pixels by the algorithm; and FNR is the percentage of fruit pixels that are mistakenly segmented as background pixels by the algorithm. The smaller the values of Af, FPR, and FNR, the better the segmentation and the higher the accuracy.
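One plausible reading of these three criteria, expressed on boolean masks, is sketched below. The exact formulas are assumptions inferred from the verbal definitions: Af is taken as the relative difference between the labeled and segmented fruit areas, FPR as the share of background pixels labeled as fruit, and FNR as the share of fruit pixels labeled as background. The function name `segmentation_metrics` is hypothetical.

```python
import numpy as np

def segmentation_metrics(truth, pred):
    """Segmentation error Af, FPR and FNR for two boolean fruit masks.

    truth: manually labeled mask (A1); pred: mask produced by the algorithm (A2).
    The formulas are an assumed reading of the definitions in the text.
    """
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    a1 = truth.sum()                       # real fruit area, in pixels
    a2 = pred.sum()                        # segmented fruit area, in pixels
    af = abs(int(a1) - int(a2)) / a1       # relative difference between the two areas
    fpr = np.sum(pred & ~truth) / np.sum(~truth)   # background labeled as fruit
    fnr = np.sum(~pred & truth) / a1               # fruit labeled as background
    return af, fpr, fnr

# Worked example on a 4 x 4 image with 4 true fruit pixels.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True      # finds all fruit pixels plus two background pixels
af, fpr, fnr = segmentation_metrics(truth, pred)
```

In the worked example the prediction covers 6 pixels against 4 labeled ones, so Af is 2/4, FPR is 2/12, and FNR is 0.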

Color Features Selection Result
The chi-square test p-values of the color features are shown in Table 1. The p-values of some color channels were lower than 0.05, which meant that these channels could significantly distinguish apple fruit pixels from non-apple fruit pixels. These channels were R, G, S, V, X, Y, L, A, B.1, H.1, Y.1, U, Y.2, I, and Q. The p-values of the other color channels were higher than 0.05, which meant that these channels could not significantly distinguish apple fruit pixels from non-apple fruit pixels. These channels were B, H, Z, E, D, and V.1. Based on the correlation matrix heat map (Figure 4), features with a correlation coefficient greater than 0.8 were placed in one group, giving 6 groups in total. The first group was R, G, B, V, X, Y, L, A, B.1, H.1, Y.1, U, Y.2, I, and Q. The second group was the H channel, the third group was the S channel, the fourth group was the E channel, the fifth group was the D channel, and the sixth group was the Y.1 channel. According to the chi-square test, the p-values of the H, E, and D channels were higher than 0.05. They were unable to distinguish apple fruit pixels from other pixels, so they were not used in the segmentation algorithm. There were 15 channels in the first group, but they were significantly correlated and contained much of the same information, so it was not necessary to use all of them in the algorithm. The channel with the lowest p-value, the B.1 channel (8.57 × 10^−9), was selected from this group. Finally, the B.1, S, and Y.1 channels were selected as the color features to build the segmentation algorithm. The boxplot of the selected color features is shown in Figure 5. In all three channels, the mean value of the apple fruit pixels was lower than that of the non-apple fruit pixels. The quartiles of the two kinds of samples did not overlap, so most apple fruit pixels and non-apple fruit pixels had different values in these channels.
However, the lower edge of the non-apple fruit pixel samples overlapped the upper edge of the apple fruit pixel samples, which indicates that a small number of samples from the two classes had the same values in these channels. This might be because the colors of apple fruit and leaves were relatively similar, so the difference in color between the two types of samples was not large. At the same time, in all three channels the dispersion of the apple fruit pixel samples was relatively small and that of the non-apple fruit pixel samples was relatively large. This was because the apple fruit pixel samples formed a single class, whereas the non-apple fruit pixel samples were composed of two different types, sky and leaves, and were therefore more dispersed. In all three channels some samples could not be effectively distinguished, so the two types of samples could not be separated by color features alone, and color needed to be combined with other features.

Texture Features Selection Result
Chi-square test results for the texture features extracted from GLCMs of different distances and orientations are shown in Figure 6. The minimum p-value of contrast appeared at (3,12). The minimum p-value of dissimilarity appeared at 7 positions, the first of which was (0,3). The minimum p-value of homogeneity appeared at 2 positions, the first of which was (19,3). The minimum p-value of ASM appeared at 14 positions, the first of which was (2,3). The minimum p-value of energy appeared at 14 positions, the first of which was (2,3). The minimum p-value of correlation appeared at 36 positions, the first of which was (0,0). The minimum p-values of ASM and energy were lower than 0.05, so these two features could significantly distinguish the target. The results of the texture feature correlation analysis are shown in Figure 7, and the results of the texture feature chi-square test are shown in Table 2. The correlation between ASM and energy was more than 0.8, which suggested that they contained much of the same information, so it was necessary to remove one of them. The correlation coefficients between the other features were no more than 0.8, indicating that those texture features contained little repeated information. Energy had a lower p-value than ASM, so ASM was removed and energy was selected to build the segmentation algorithm. Overall, texture features performed poorly in distinguishing apple fruit pixels. In the optimization of distance and orientation, the larger the distance, the worse the performance. Orientation had no significant effect on performance: the minimum p-value of many texture features appeared at many orientations, and in some cases the p-values were the same for all orientations.
At the same time, the lowest p-value mostly appeared at positions with a small distance, and in one case even at (0,0). This suggested that the larger the GLCM distance, the worse the effect. It showed that the influence of adjacent pixels was very weak and that they could even interfere with the classification. This might be because the surface of the apple fruit is smooth and has no distinctive texture. Therefore, it was difficult for the GLCM to extract features that effectively distinguish apple fruit pixels.

Apple Fruit Pixels Classification Result
The sample distribution after PCA dimension reduction is shown in Figure 8. The apple fruit pixel samples were relatively concentrated in the principal component space, because they were all derived from apple fruit. The non-apple fruit pixel samples were dispersed in the principal component space, because of their more complex origin, which included sky and leaves. In addition, leaves showed two kinds of surfaces, front and back. The two types of samples were interlaced in the principal component space, which would cause classification errors. Because the two types of samples could not be perfectly linearly separated in the principal component space, more complex classifiers were needed.
The results of developing the different apple fruit pixel classifiers are shown in Figure 9. The Nearest Neighbors algorithm classified apple fruit pixels and non-apple fruit pixels well, and its classification boundaries were clear. The classifier obtained by the Linear SVM algorithm performed a linear separation of the two types of samples in the solution space. However, because the two types of samples were not completely linearly separable in the solution space, the effect of this classifier was poor. The classifier based on the RBF SVM algorithm used the RBF kernel function, so it performed well on linearly inseparable samples. The classifier obtained by the Gaussian Process could also handle the nonlinearly separable problem well. It was sensitive to the density of sample points: in regions of the solution space with few samples, it showed no classification bias. The Decision Tree algorithm used multiple straight lines in the solution space, making up for the nonlinearity of the samples by multiple linear splits, and it had a very distinct classification boundary. The Random Forest algorithm improved the accuracy by integrating multiple decision trees. However, it could not classify with curves either and also classified by multiple lines. The Neural Net algorithm could only carry out linear classification because of the small number of neurons in the hidden layer. The AdaBoost algorithm also classified by composite lines. Naive Bayes was a very good curve classifier and achieved a very good classification effect. The classification results of the classifiers on the training set and test set are shown in Table 3. The classifiers based on Nearest Neighbors, Decision Tree, and Random Forest achieved high classification accuracy: 0.94, 0.95, and 0.94, respectively.
However, the classification accuracy of the classifiers based on the Decision Tree and Nearest Neighbors algorithms on the testing set was significantly lower than on the training set, so these classifiers were overfitting. The accuracy of the Random Forest classifier was 0.94 on both the training set and the testing set. It was not overfitting, and its classification accuracy was relatively high. Meanwhile, the TPR of the Random Forest algorithm was 0.90 on the test set. This indicated that the algorithm classified some apple fruit pixels as other pixels, because some apple fruit pixels and leaf pixels were similar in color and texture. The TPR of the Random Forest algorithm on the test set was nevertheless higher than that of the other algorithms. Therefore, the Random Forest classifier was selected to classify the pixels in this study.

Apple Fruit Image Segmentation Result
We compared the segmentation results of the proposed method with those of three other segmentation algorithms: Otsu thresholding based on R-B with boundary object removal [1], K-means cluster segmentation based on R-B, and adaptive threshold segmentation based on R-B. The results are shown in Table 4. For the segmentation method designed in this work, the average values of Af, FPR and FNR were 0.07, 0.13 and 0.15, respectively. For Otsu thresholding based on R-B with boundary object removal, they were 0.26, 0.09 and 0.34, respectively. For K-means cluster segmentation based on R-B, they were 0.29, 0.28 and 0.18, respectively. For adaptive threshold segmentation based on R-B, they were 0.35, 0.39 and 0.14, respectively. The algorithm proposed in this paper had the lowest Af, and its FPR and FNR were both low. Figure 10 shows that the method based on Otsu thresholding and boundary object removal produced good results, but some apple pixels were classified as others; therefore, the FPR of its segmentation results was low, but the FNR was high. The adaptive threshold segmentation method based on R-B had poor segmentation results. Its error came mainly from other pixels being classified as apple pixels, so its FPR was higher, while its FNR was lower. The K-means cluster segmentation method based on R-B also misclassified some other pixels as apple pixels. The segmentation algorithm presented in this paper performed well on every evaluation index.
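For context, the core of the first baseline, Otsu thresholding of the R-B color difference, can be sketched in NumPy as follows. This is a generic illustration and not the implementation from [1]: boundary object removal is not shown, and the function names, the 256-bin histogram, and the synthetic image are assumptions.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of a 1-D sample."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                         # pixels in this bin or below
    w1 = w0[-1] - w0                             # pixels in the bins above
    s0 = np.cumsum(hist * centers)
    m0 = s0 / np.maximum(w0, 1)                  # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)       # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    # Use the upper edge of the best bin so the whole bin stays in the lower class.
    return edges[np.argmax(between) + 1]

def segment_r_minus_b(rgb):
    """Boolean mask of pixels whose R-B difference exceeds the Otsu threshold."""
    rgb = rgb.astype(float)                      # avoid uint8 wrap-around
    diff = rgb[..., 0] - rgb[..., 2]
    return diff > otsu_threshold(diff.ravel())

# Synthetic image: a reddish square on a bluish background.
rng = np.random.default_rng(0)
img = np.zeros((60, 80, 3), dtype=np.uint8)
img[..., 2] = rng.integers(120, 160, (60, 80))           # background: strong blue
img[20:40, 30:50, 0] = rng.integers(180, 220, (20, 20))  # square: strong red
img[20:40, 30:50, 2] = rng.integers(20, 60, (20, 20))    # square: weak blue
mask = segment_r_minus_b(img)
```

A single global threshold on one color difference works when fruit and background are well separated in R-B, as in this synthetic image, which is consistent with the observation that such baselines degrade when leaves and fruit have similar colors.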

Discussion
As can be seen from Figure 10, the Otsu method based on R-B with boundary object removal could effectively separate part of the apple from the background. However, this algorithm classified some apple pixels as others when the light was dim. It also removed some small apple regions, so part of the apple was not segmented correctly. Both the K-means algorithm and the adaptive threshold algorithm made errors by segmenting some leaf pixels as apple pixels. This was mainly because the color of the underside of some leaves was similar to that of apples in a dark environment, which made it difficult to separate them by color alone. The algorithm proposed in this paper combined color and texture features, which could complement each other. Texture features played an important role in separating leaves and apples. Therefore, the algorithm proposed in this paper could segment apples better.
Image segmentation methods based on deep learning have achieved good results in fruit recognition in the orchard environment [44,45], because deep learning algorithms are highly adaptable. However, these algorithms are very complex. Many labeled pictures are needed during training, and for semantic segmentation the images must be labeled pixel by pixel, which requires a lot of human labor. The structure of a deep learning network is also complex, so it needs to run on a platform with rich computing resources. Because of limits on space and energy consumption, mobile platforms such as orchard robots are equipped with very limited computing resources, so current deep learning algorithms are not well suited for deployment on the robot. The algorithm studied in this paper was designed to meet the needs of orchard robot work while keeping the computational burden as low as possible. It is an algorithm suitable for orchard robots.

Conclusions
In this paper, apple images were gathered in orchards with a camera. Color and texture features were used to build pixel classifiers, and nine classifiers based on different machine learning algorithms were built to separate apple fruit pixels from other pixels. From these, the apple segmentation method was obtained. Analysis of the experimental results showed that: (1) color features could effectively distinguish apple fruit pixels from others, while texture features performed poorly at this task; (2) the classification algorithm based on Random Forest could effectively classify the apple fruit pixels, with an accuracy of 0.94; (3) image segmentation can be done through pixel classification, and the average values of Af, FPR and FNR were 0.07, 0.13 and 0.15, respectively; (4) the image segmentation model established by pixel classification could effectively segment apple fruit from photos.