This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
Automatic classification of fruits via computer vision is still a complicated task due to the various properties of numerous types of fruits. We propose a novel classification method based on a multiclass kernel support vector machine (kSVM) with the desirable goal of accurate and fast classification of fruits. First, fruit images were acquired by a digital camera, and then the background of each image was removed by a split-and-merge algorithm; second, the color histogram, texture, and shape features of each fruit image were extracted to compose a feature space; third, principal component analysis (PCA) was used to reduce the dimensionality of the feature space; finally, three kinds of multiclass SVMs were constructed,
Recognizing different kinds of vegetables and fruits is a difficult task in supermarkets, since the cashier must identify the category of a particular fruit to determine its price. The use of barcodes has mostly ended this problem for packaged products, but since most consumers want to pick their own produce, it cannot be prepackaged and must therefore be weighed. One solution is to issue a code for every fruit, but memorizing these codes is problematic and leads to pricing errors. Another solution is to give the cashier an inventory with pictures and codes; however, flipping through the booklet is time consuming [
Some alternatives have been proposed to address this problem. VeggieVision was the first supermarket produce recognition system, consisting of an integrated scale and imaging system with a user-friendly interface [
The aforementioned techniques may have one or several of the following shortcomings: (1) they need extra sensors such as a gas sensor, an invisible-light sensor, or a weight sensor; (2) the classifier is not suitable for all fruits,
Support Vector Machines (SVMs) are state-of-the-art classification methods based on machine learning theory [
In this paper, we chose an image recognition method that needs only a digital camera. To improve recognition performance, we proposed a hybrid feature extraction technique which combines color features, Unser's texture features, and shape features, followed by principal component analysis (PCA) to reduce the number of features. Three different multiclass SVMs were used for multiclass classification. We expect this method to solve the fruit classification problem.
The rest of the paper is organized as follows: Section 2 discusses the methods used in this paper. Section 2.1 presents the split-and-merge algorithm for fruit extraction; Section 2.2 gives the descriptors of fruits with respect to the color, shape, and texture components, and introduces PCA as a methodology to reduce the number of features used by the classifiers; Section 2.3 introduces the kernel SVM and then gives three schemes for multiclass SVMs: the Winner-Takes-All SVM (WTA-SVM), the Max-Wins-Voting SVM (MWV-SVM), and the Directed Acyclic Graph SVM (DAG-SVM). Section 3 reports the use of 1,653 images of 18 different types of fruits to test our method, and lastly Section 4 is devoted to conclusions.
First, we use image segmentation techniques to remove the background area, since our research focuses only on the fruits. We choose a split-and-merge algorithm, which is based on a quadtree partition of the image. The method starts at the root of the tree, which represents the whole image. If a square is found to be inhomogeneous, it is split into four son squares (the splitting process), and so forth. Conversely, if four son squares are homogeneous, they can be merged into connected components (the merging process); each leaf node of the tree then corresponds to a segmented region. This process continues recursively until no further splits or merges are possible.
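To make the procedure concrete, the splitting phase can be sketched as follows. This is an illustrative Python sketch rather than the Matlab implementation used in our experiments; the function name, the range-based homogeneity test, and the simplification that each homogeneous block directly becomes a region (omitting the full merging of adjacent blocks) are our own assumptions.

```python
import numpy as np

def split_and_merge(img, thresh=10.0, min_size=2):
    """Quadtree split sketch: recursively split inhomogeneous blocks;
    homogeneous blocks become regions (the merging of adjacent
    homogeneous blocks is simplified away here)."""
    regions = []

    def homogeneous(block):
        # Simple homogeneity criterion: intensity range within threshold.
        return block.max() - block.min() <= thresh

    def split(r0, c0, h, w):
        block = img[r0:r0 + h, c0:c0 + w]
        if homogeneous(block) or h <= min_size or w <= min_size:
            # Record the block as (row, col, height, width, mean value).
            regions.append((r0, c0, h, w, float(block.mean())))
            return
        h2, w2 = h // 2, w // 2
        split(r0, c0, h2, w2)                     # top-left son square
        split(r0, c0 + w2, h2, w - w2)            # top-right
        split(r0 + h2, c0, h - h2, w2)            # bottom-left
        split(r0 + h2, c0 + w2, h - h2, w - w2)   # bottom-right

    split(0, 0, img.shape[0], img.shape[1])
    return regions
```

Starting from the root (the whole image), each inhomogeneous node spawns four children, which matches the quadtree description above.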
We propose a hybrid classification system based on the color, texture, and shape features of fruits. Here, we assume the fruit images have been extracted by the split-and-merge segmentation algorithm [
At present, the color histogram is employed to represent the distribution of colors in an image [
For monochromatic images, the set of possible color values is small enough that each color may be placed in its own range; the histogram is then simply the count of pixels having each possible value. For color images in RGB space, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values.
The histogram provides a compact summarization of the distribution of data in an image. The color histogram of an image is relatively invariant to translation and rotation about the viewing axis. By comparing the histogram signatures of two images and matching the color content of one image with the other, the color histogram is well suited to the problem of recognizing an object of unknown position and rotation within a scene.
The gray-level co-occurrence matrix and the local binary pattern are good texture descriptors; however, they are excessively time consuming. In this paper, we chose the Unser feature vector. Unser proved that the sum and difference of two random variables with the same variance are decorrelated and define the principal axes of their associated joint probability function. Therefore, we use the sum
The non-normalized sum and difference associated with a relative displacement (
The sum and difference histograms over the domain
Next, seven indexes can be defined on the basis of the sum and difference histogram. Those indexes and their corresponding formulas are listed in
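A sketch of the sum and difference histograms and a few of the derived indexes, for the horizontal displacement (0, 1); this Python code is illustrative, and its function name and the choice of displacement are our own assumptions rather than the original implementation:

```python
import numpy as np

def unser_features(gray, levels=256):
    """Sum/difference histograms for displacement (0, 1), plus four of
    the Unser indexes (mean, contrast, homogeneity, energy)."""
    a = gray[:, :-1].astype(int)   # pixel at (r, c)
    b = gray[:, 1:].astype(int)    # neighbor at (r, c + 1)
    s, dif = a + b, a - b
    # Sum values lie in 0..2L-2; differences in -(L-1)..L-1 are shifted.
    hs = np.bincount(s.ravel(), minlength=2 * levels - 1)
    hd = np.bincount((dif + levels - 1).ravel(), minlength=2 * levels - 1)
    hs = hs / hs.sum()
    hd = hd / hd.sum()
    i = np.arange(2 * levels - 1)                 # sum levels
    j = np.arange(2 * levels - 1) - (levels - 1)  # difference levels
    mean = 0.5 * (i * hs).sum()
    contrast = (j ** 2 * hd).sum()
    homogeneity = (hd / (1 + j ** 2)).sum()
    energy = (hs ** 2).sum() * (hd ** 2).sum()
    return mean, contrast, homogeneity, energy
```

For a perfectly uniform patch the contrast vanishes and the homogeneity reaches one, which matches the intuition behind these indexes.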
In this paper we propose eight measures based on mathematical morphology, which are listed in
In total, 79 features (64 color features + seven texture features + eight shape features) are extracted from a given image. Excessive features increase computation time and storage requirements, sometimes make the classification process more complicated, and can even degrade the performance of the classifier. A strategy is therefore necessary to reduce the number of features used in classification.
Principal component analysis (PCA) is an efficient tool to reduce the dimensionality of a data set consisting of a large number of interrelated variables while retaining the most significant variations [
It should be noted that the input vectors should be normalized to have zero mean and unity variance before performing PCA, which is shown in
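A minimal sketch of this normalization-then-PCA step (illustrative Python, not the original Matlab code; the 95% variance threshold follows the setting reported in the experiments):

```python
import numpy as np

def pca_reduce(X, var_kept=0.95):
    """Normalize each feature to zero mean and unity variance, then keep
    the smallest number of principal components whose cumulative
    explained variance reaches var_kept."""
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    cov = np.cov(Xn, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]          # reorder to descending variance
    vals, vecs = vals[order], vecs[:, order]
    cum = np.cumsum(vals) / vals.sum()
    k = int(np.searchsorted(cum, var_kept) + 1)
    return Xn @ vecs[:, :k], k
```

Applied to the 79-dimensional feature vectors, this yields the reduced representation fed to the classifiers.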
Traditional linear SVMs cannot separate practical data with complicated distributions. In order to generalize them to nonlinear hyperplanes, the kernel trick is applied to SVMs [
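For reference, the four kernels listed later can be written out directly; the Python functions below are illustrative sketches, and the parameter defaults are arbitrary assumptions:

```python
import numpy as np

def hpol(x, y, d=2):
    """Homogeneous polynomial kernel: k(x, y) = (x . y)^d."""
    return np.dot(x, y) ** d

def ipol(x, y, d=2, c=1.0):
    """Inhomogeneous polynomial kernel: k(x, y) = (x . y + c)^d."""
    return (np.dot(x, y) + c) ** d

def grb(x, y, sigma=1.0):
    """Gaussian radial basis kernel: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    diff = x - y
    return np.exp(-np.dot(diff, diff) / (2 * sigma ** 2))

def htan(x, y, kappa=1.0, c=0.0):
    """Hyperbolic tangent kernel: k(x, y) = tanh(kappa * x . y + c)."""
    return np.tanh(kappa * np.dot(x, y) + c)
```

Each kernel implicitly maps the input into a higher-dimensional space where a linear separating hyperplane may exist.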
SVMs were originally designed for binary classification. Several methods have been proposed for multiclass SVMs, and the dominant approach is to reduce the single multiclass problem into multiple binary classification problems [
Assume there are totally
The mathematical model is described as follows. Given a
If
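The winner-takes-all decision rule itself can be sketched as follows, assuming the K one-versus-rest decision functions have already been trained (the Python code and its names are illustrative, not the original implementation):

```python
import numpy as np

def wta_predict(x, decision_fns):
    """Winner-takes-all: each of the K one-versus-rest SVMs outputs a
    decision value f_k(x); the predicted class is argmax_k f_k(x)."""
    scores = [f(x) for f in decision_fns]
    return int(np.argmax(scores))
```

Note that all K decision functions must be evaluated for every test sample, which is why this scheme is the slowest of the three in the timing results below.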
For the one-versus-one approach, classification is done by a max-wins voting (MWV) strategy [
The mathematical model is described as follows. The
If
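A sketch of the max-wins voting rule, assuming the K(K−1)/2 pairwise decision functions have already been trained; the dictionary-of-classifiers representation is our own illustrative choice, not the original implementation:

```python
import numpy as np

def mwv_predict(x, pairwise, n_classes):
    """Max-wins voting: each pairwise classifier f_ij returns a positive
    value for class i and a non-positive value for class j; the class
    that collects the most votes wins."""
    votes = np.zeros(n_classes, dtype=int)
    for (i, j), f in pairwise.items():
        if f(x) > 0:
            votes[i] += 1
        else:
            votes[j] += 1
    return int(np.argmax(votes))
```

Every pairwise classifier casts exactly one vote, so K(K−1)/2 evaluations are needed per sample.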
A Directed Acyclic Graph (DAG) is a graph whose edges have an orientation and which contains no cycles. The DAG-SVM constructs the individual SVMs in the same way as the MWV-SVM; however, the output of each individual SVM is interpreted differently. When
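The DAG evaluation can be sketched with the standard list-based formulation, in which each pairwise test eliminates one candidate class, so that only K − 1 classifiers are evaluated per sample (illustrative Python; the representation of the pairwise classifiers is an assumption):

```python
def dag_predict(x, pairwise, n_classes):
    """DAG evaluation: keep a candidate interval [lo, hi] of classes;
    the classifier for the pair (lo, hi) eliminates one endpoint per
    step, so only n_classes - 1 classifiers are evaluated."""
    lo, hi = 0, n_classes - 1
    while lo < hi:
        f = pairwise[(lo, hi)]
        if f(x) > 0:      # classifier (lo, hi) votes for class lo
            hi -= 1       # eliminate class hi
        else:             # classifier votes for class hi
            lo += 1       # eliminate class lo
    return lo
```

This K − 1 evaluation count, compared with K(K−1)/2 for max-wins voting, explains the speed advantage of the DAG-SVM reported in the timing results.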
The experiments were carried out on an IBM platform with a 3 GHz Pentium 4 processor and 2 GB of memory running the Microsoft Windows XP operating system. The algorithm was developed in-house on the Matlab 2011b (The MathWorks) platform. The programs can be run or tested on any computer platform where Matlab is available.
Below are the steps explaining the flowchart of the proposed fruit recognition system shown in
The numbers in the figure are obtained from the experiments below:
The fruit dataset was obtained after six months of on-site data collection via digital camera and online collection using images.google.com as the main search engine. The split-and-merge algorithm was used to remove the background areas; the images were then cropped to leave the fruit in the center, and finally downsampled to 256 × 256 pixels.
The data set comprises 18 different categories: Granny Smith Apples (64), Rome Apples (83), Yellow Bananas (132), Green Plantains (61), Tangerines (112), Hass Avocados (105), Watermelons (72), Cantaloupes (129), Gold Pineapples (89), Passion Fruits (72), Bosc Pears (88), Anjou Pears (140), Green Grapes (74), Red Grapes (45), Black Grapes (122), Blackberries (97), Blueberries (95), and Strawberries (73). In total, there are 1,653 images. Table 4 depicts the samples of different types of fruits in the data set.
Typically, a statistical model that deals with the inherent data variability is inferred from the database (
Cross validation methods are usually employed to assess the statistical relevance of the classifiers. Cross validation comes in four types: random subsampling,
The
Another challenge was to determine the number of folds. If
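The k-fold partitioning itself can be sketched as follows (illustrative Python; the function name and the use of a seeded random permutation are our own assumptions):

```python
import numpy as np

def k_fold_indices(n, k=5, seed=0):
    """Split n sample indices into k disjoint folds; each fold serves
    once as the test set while the remaining folds form the training set."""
    idx = np.random.RandomState(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

With k = 5, each sample is tested exactly once and used for training in the other four folds, matching the 5-fold protocol used in our experiments.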
The curve of cumulative sum of variance versus number of reduced vectors via PCA is shown in
The loading plot on principal components 1 and 2 is shown in
We tested the three multiclass SVMs (WTA-SVM, MWV-SVM, and DAG-SVM) on the reduced feature vectors using 5-fold cross validation. Meanwhile, we chose the linear (LIN) kernel,
As for classification speed, the WTA-SVM is the slowest, for it uses one
The confusion matrix of the GRB kernel MWV-SVM is shown in
By looking at the bottom line of
Nevertheless, a few types of fruits were not recognized as successfully. The SVM for the 11th class (Bosc Pears) performs the worst. In the test dataset there are 18 pictures of Bosc Pears; however, three of them are misclassified as the 6th class (Avocado) and another three as the 10th class (Passion Fruits), so the rest are recognized correctly, giving a 66.7% (12/18) success rate. The 6th class (Avocado) has twenty-one samples in the test dataset, but four are misclassified as the 10th class (Passion Fruits) and two as the 11th class (Bosc Pears); with the remaining fifteen samples recognized correctly, this gives a 71.43% (15/21) success rate. The 10th class (Passion Fruits) has fourteen samples in the test dataset; two are misclassified as the 6th class (Avocado) and another two as the 11th class (Bosc Pears), so with the rest recognized correctly it achieves a 71.43% (10/14) success rate. In other words, the 6th, 10th, and 11th classes are not distinct in the view of the SVM, which motivates our future work to resolve this misclassification.
This work proposed a novel classification method based on a multiclass kSVM. The experimental results demonstrated that the MWV-SVM with the GRB kernel achieves the best classification accuracy of 88.2%. The combination of color histogram, Unser's texture, and shape features is more effective than any single kind of feature for the classification of fruits. Using PCA to reduce the features, we tested three different multiclass SVMs (WTA-SVM, MWV-SVM, and DAG-SVM) with the linear kernel,
The authors give special thanks to Roberto Lam from The City College of New York, CUNY, for reviewing the whole paper.
Comparison of Otsu's method with split-and-merge segmentation.
Rainbow image.
Illustration of the morphology measures.
Using normalization before PCA.
The DAG for finding the best class out of six classes.
The flowchart of the proposed fruit recognition system.
A 5-fold cross validation.
Feature selection via PCA (threshold is set as 95%).
The biplot (red dots represent the samples, blue lines represent the 79 original features, and the x-axis and y-axis represent the two principal components).
Confusion matrix of the GRB kernel MWV-SVM with overall classification accuracy of 88.2%.
Sum and difference histogram based measures (h_s and h_d denote the normalized sum and difference histograms; the formulas follow Unser's standard definitions).

Measure          Formula
Mean (μ)         μ = (1/2) Σ_i i · h_s(i)
Contrast         Σ_j j² · h_d(j)
Homogeneity      Σ_j h_d(j) / (1 + j²)
Energy           Σ_i h_s(i)² · Σ_j h_d(j)²
Variance         (1/2) [ Σ_i (i − 2μ)² · h_s(i) + Σ_j j² · h_d(j) ]
Correlation      (1/2) [ Σ_i (i − 2μ)² · h_s(i) − Σ_j j² · h_d(j) ]
Entropy          −Σ_i h_s(i) · log h_s(i) − Σ_j h_d(j) · log h_d(j)
The morphology-based measures.

Measure          Meaning
Area             The actual number of pixels inside the object
Perimeter        The distance around the boundary of the object
Euler            The Euler number of the object
Convex           The number of pixels of the convex hull
Solidity         The proportion of area to convex hull
MinorLength      The length of the minor axis of the ellipse
MajorLength      The length of the major axis of the ellipse
Eccentricity     The eccentricity of the ellipse
Four common kernels.

Name                             Formula                                 Parameter(s)
Homogeneous Polynomial (HPOL)    k(x, y) = (x · y)^d                     degree d
Inhomogeneous Polynomial         k(x, y) = (x · y + c)^d                 degree d, constant c
Gaussian Radial Basis (GRB)      k(x, y) = exp(−‖x − y‖² / (2σ²))        width σ
Hyperbolic Tangent               k(x, y) = tanh(κ · x · y + c)           slope κ, intercept c
Samples of different types of fruits in the data set.

The cumulative variances of the PCA-transformed new features.
Dimensions  1  2  3  4  5  6  7  8  9  10 

Classification accuracy of the SVMs.

            LIN      HPOL     GRB
WTA-SVM     48.1%    61.7%    55.4%
MWV-SVM     53.5%    75.6%    88.2%
DAG-SVM     53.5%    70.1%    84.0%
Computation time of the SVMs (s).

            LIN      HPOL     GRB
WTA-SVM     8.439    9.248    11.522
MWV-SVM     1.680    1.732    1.917
DAG-SVM     0.489    0.403    0.563