Automatic Detection System of Olive Trees Using Improved K-Means Algorithm

Abstract: Olive cultivation has spread over the past few years across Mediterranean countries, with Spain being the world's largest olive producer among them. Because olives are a major part of the economy for such countries, keeping records of their tree count and crop yield is of high significance. Manual counting of trees over such large areas is humanly infeasible. To address this problem, we propose an automatic method for the detection and enumeration of olive trees. The algorithm is a multi-step classification system comprising pre-processing, image segmentation, feature extraction, and classification. RGB satellite images were acquired from the Spanish territory and pre-processed to suppress additive noise. The region of interest was then segmented from the pre-processed images using K-Means segmentation, from which statistical features were extracted and classified. Promising results were achieved for all classifiers, namely Naive Bayesian, Support Vector Machines (SVMs), Random Forest, and Multi-Layer Perceptrons (MLPs), at various division ratios of data samples. In a comparison of all the classification algorithms, Random Forest outperformed the rest with an overall accuracy of 97.5% at the division ratio of 70 to 30 for training to testing.

Author Contributions: Conceptualization, M.W., T.-W.U., and A.K.; methodology, M.W., T.-W.U., and A.K.; software, M.W. and U.K.; validation, T.-W.U. and A.K.; formal analysis, T.-W.U. and A.K.; investigation, M.W. and U.K.; resources, T.-W.U. and A.K.; data curation, M.W. and U.K.; writing—original draft preparation, M.W. and A.K.; writing—review and editing, M.W., T.-W.U., and A.K.; visualization, M.W., T.-W.U., and A.K.; project administration and funding acquisition, T.-W.U.


Introduction
Olive fruit possesses high agricultural significance, being a major part of the economy for countries such as Spain, Italy, Greece, and Turkey. Today, Spain is the world's leading olive producer, producing 5,276,899 metric tons of olives on over 2.4 million hectares of dedicated land. Over the last 25 years, consumption of olive oil has increased by 73% and is anticipated to exceed production by 13% this year [1].
The production and distribution of economically significant crops need to be recorded and maintained for both agriculturists and economists. Manual collection of data over large areas is humanly infeasible, time-consuming, and prone to human error. Advancements in the field of image processing and the availability of very high resolution (VHR) imagery have led to automatic detection and counting methods [2], which aim to achieve the above-mentioned goal.
Automatic detection of olive trees has remained a challenging topic for researchers. Various basic image pre-processing techniques, including image segmentation [3], blob detection [4][5][6], and template matching [7], have been devised for accurate remote detection. Similarly, complex techniques such as artificial-intelligence-enabled [8] and classification-based systems have also been proposed [9,10] to achieve accurate and confident detection results.
Previous methods showed promising results while leaving room for the application of accurate, heuristic segmentation techniques and for the development of an accurate yet computationally efficient multi-staged olive tree classification system. In addition, these algorithms have been tested in less demanding environments, characterized by fewer ground classes captured over fewer sample images. Our proposed system aims to achieve accurate detection and classification of olive trees in highly diverse environments captured over a large set of aerial images, addressing the limitations of previous work. It contributes to the domain knowledge by:
• utilizing the heuristic-based, improved K-Means clustering algorithm for better segmentation results;
• developing a computationally efficient and robust multi-step classification model for accurate detection and identification of olive trees; and
• training and testing the proposed system over a large set of diverse images with varying ground information.
The rest of the paper is organized as follows. Section 2 presents the existing literature on olive tree detection, followed by the methodology in Section 3. Section 4 covers the experimental setup, followed by the results in Section 5. Section 6 concludes the paper along with a discussion of future work.

Early 1990s and 2000s
Starting in 1990, Karantzalos and Argialas proposed a blob-detection-based method to detect olive trees in satellite imagery acquired from QuickBird and IKONOS [5]. In 2000, the Joint Research Centre (JRC) developed a tool called OLICOUNT to count olive trees in grey-scale input images [11]. The tool utilized a combination of techniques such as thresholding, region growing, and morphological operations. Later advancements resulted in OLICOUNT v2 with 16-bit image support [12].

Late 2000s
Gonzales et al. in 2007 developed a probabilistic model to count olive trees [7] over imagery acquired by the QuickBird satellite. A probability was calculated for each candidate tree based on whether it formed part of a reticle and on geometrical features such as size, shape, and the angles formed among trees. The technique achieved a detection accuracy of 98%. In 2009, the Arbor Crown Enumerator (ACE), an algorithm proposed by Ionis et al., detected olive trees in multi-spectral imagery [6]. The algorithm utilized red-band thresholding along with an NDVI-based detection method, resulting in an overall estimation error of 1.3%. In the same year, a classification model was proposed by Yakoub et al. to detect olive trees in an agricultural area of Al Jouf, Saudi Arabia, captured by IKONOS-2 [9]. The method utilized a Gaussian Process Classifier (GPC) to classify morphological features of ground data, achieving an overall accuracy of 96%.

2010 to Present
In 2010, Garcia et al. proposed multiple methods to detect olive trees from satellite images acquired from the SIGPAC viewer of the Ministry of Environment and Rural and Marine Affairs, Spain (http://sigpac.mapa.es/fega/visor/) [3]. Testing samples were formed from the SIGPAC satellite viewer. In one method, they detected olive trees by extracting them as segments formed by K-Means clustering. The results showed an overall omission rate of zero in six samples and a commission rate of one in six. Another method utilized fuzzy logic, generating a fuzzy number to detect the olive trees using a k-neighbor approach [8]. Promising results were obtained, with an omission rate of one in six and a commission rate of zero. The results were generated using k values of 1 and 2.
In 2011, an object-based classification method was proposed by Peters et al. to detect olive trees from multi-spectral images covering a region of France [10]. The method comprised a four-step model: image segmentation, feature extraction, classification, and result mapping. Synergy models were developed at each stage of the technique by combining features from various sensors, giving an overall accuracy of 84.3%.
In 2017, Chemin et al. [4] proposed a method to monitor the massive loss of olive trees caused by the deadly pathogen Xylella fastidiosa in the region of Apulia, Italy [13]. Multi-spectral images were pre-processed to NDVI, followed by segmentation based on Niblack's thresholding method and Sauvola binarization [14]. Segments falling within defined size and area parameters were considered olive trees, resulting in an overall mean error of 13%. In 2018, Khan et al. [15] proposed a computationally efficient method to detect olive trees over the territory of Spain. They employed basic image processing techniques such as unsharp masking and threshold-based segmentation to detect and count olive trees. Segmented trees were added to the tree count if they fell within the possible size range. The algorithm showed an overall accuracy of 96%.
Related work over the past years shows a variety of techniques and methods proposed to detect olive trees, ranging from simple image segmentation and blob detection to complex classification methods. All previous techniques in the literature showed high accuracy, but with a few limitations. Simple and efficient threshold-based segmentation combined with blob-detection-based methods gave decently accurate results but was highly prone to false positives. Sequential application of the above-mentioned techniques over multi-spectral images enhanced detection accuracy at an increased computational cost, and these techniques also exhibited omission and commission errors. As for classification-based systems, publicly available datasets with enough images covering diverse cases were not part of the testing images, leaving room for improvement in the classification results. Our proposed technique focuses on overcoming these shortcomings and is validated over a dataset that is diverse in terms of both the number of images and the ground cover classes.

Proposed Scheme of Automatic Detection of Olive Trees
In this paper, a method for detecting olive trees in plantation areas using classification is proposed. The aim was to design and develop an olive tree detection algorithm that is accurate in prediction, able to handle large volumes of image data, scalable to multi-spectral imagery, and robust in producing accurate results in varying land/tree scenarios. The multi-step algorithm utilizes a combination of techniques: pre-processing, segmentation, feature extraction, and classification. The workflow diagram of our proposed system is shown in Figure 1.

Image Pre-Processing
Image pre-processing is the initial step of our algorithm, in which the colored images undergo noise removal and the correction of other irregularities obscuring the desired information. During image formation, errors may be introduced by low luminance, motion blur, and mechanical noise from optical devices. Input images are pre-processed to remove such errors by smoothing the noise, followed by edge enhancement for better results in later stages [16].

Image Enhancement Using Laplacian of Gaussian (LoG) Filtering
The Laplacian filter is used to highlight regions showing abrupt changes in intensity, resulting in enhanced edges [17]. Because the Laplacian filter is sensitive to noise, a Gaussian filter is first used as a smoothing operator to suppress noise within the image [18]. The 2D Gaussian is given in Equation (1), where σ is the standard deviation:

G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²)). (1)

The convolution of the Gaussian filter with the input image im(x, y) is given in Equation (2) as

L(x, y) = G(x, y) ∗ im(x, y), (2)

where ∗ represents the convolution operator and L(x, y) is the Gaussian scale-space representation of the input image im(x, y). The Laplacian is given in Equation (3) as

∇²L = ∂²L/∂x² + ∂²L/∂y², (3)

where the partial derivatives of the filtered image L are taken along both the x and y axes. Gaussian smoothing followed by the Laplacian can be combined into a single operator known as the Laplacian of Gaussian (LoG) [19], shown in Equation (4) as

LoG(x, y) = −(1 / (πσ⁴)) (1 − (x² + y²) / (2σ²)) exp(−(x² + y²) / (2σ²)), (4)

where σ represents the standard deviation and x and y are the spatial coordinates of the image. Applying the Gaussian filter before the Laplacian attenuates noise, improving the edge enhancement performed by the Laplacian operator.

Image Segmentation Using K-Means Clustering
The region of interest (ROI) is defined as the subset of pixels within an image on which further operations are performed. To extract the foreground information of olive trees as the ROI, an image segmentation technique is used. Various segmentation techniques exist; here, K-Means clustering is performed.

K-Means Clustering
K-Means clustering is a type of unsupervised learning that divides unlabeled data into non-overlapping groups [20]. The algorithm iteratively assigns each data point to one of K groups based on the least distance to the centroids in feature space. The process is repeated until the centroids reach their final, stable positions. Given K, the number of clusters, and the respective centroids, data points representing olive trees are clustered and extracted. The flow diagram of the process is shown in Figure 2.
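A minimal one-dimensional sketch of this iterative assignment on pixel intensities (the intensity values and initial centroids below are illustrative, not from the paper's data):

```python
import numpy as np

def kmeans_segment(pixels, centroids, n_iter=50):
    """Minimal K-Means on 1-D pixel intensities.

    pixels:    flat array of intensity values
    centroids: initial centroid intensities (length K)
    Returns (labels, centroids) after iterative reassignment.
    """
    centroids = np.asarray(centroids, dtype=float)
    for _ in range(n_iter):
        # Assign each pixel to its nearest centroid.
        dists = np.abs(pixels[:, None] - centroids[None, :])
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned pixels.
        new = np.array([pixels[labels == k].mean() if np.any(labels == k)
                        else centroids[k] for k in range(len(centroids))])
        if np.allclose(new, centroids):   # centroids stable -> converged
            break
        centroids = new
    return labels, centroids

# Two well-separated intensity groups cluster cleanly.
pixels = np.array([10., 12., 11., 200., 205., 198.])
labels, centers = kmeans_segment(pixels, centroids=[0., 255.])
```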

Centroid Selection
The centroid is a key data point around which clustering is performed. It is similar to any other data point represented by a feature vector and may be selected randomly or through a mechanism. In the proposed methodology, the K centroids are selected through a mechanism [21] for better clustering, as the selection affects both speed and performance. This is given mathematically in Equation (5), where m is the maximum intensity value of the image determined from the histogram, k is the number of clusters, and Ci is the ith cluster centroid with i = 1, 2, 3, …, k.
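The exact initialization formula of [21] is not reproduced here; as an assumption, the sketch below spreads the k centroids evenly over the intensity range up to the histogram maximum m, which captures the deterministic, histogram-driven spirit of the scheme.

```python
import numpy as np

def init_centroids(image, k):
    """Deterministic centroid initialization (a sketch; the exact
    formula of the cited scheme may differ). Centroids are spread
    evenly up to m, the maximum occupied intensity in the histogram."""
    hist = np.bincount(image.ravel(), minlength=256)
    m = np.nonzero(hist)[0].max()        # maximum intensity present
    return np.array([i * m / k for i in range(1, k + 1)], dtype=float)

img = np.array([[0, 60, 120], [180, 240, 240]], dtype=np.uint8)
cents = init_centroids(img, k=4)         # evenly spaced up to m = 240
```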

Feature Extraction
The segments extracted as a result of clustering may include both olive trees and other ground components with similar intensity levels. Olive trees viewed from above resemble blob-like structures exhibiting distinct characteristics of size and color. Features capturing these characteristics are extracted from each individual segment and grouped into a single statistical feature vector, calculated for each extracted foreground segment. Table 1 lists the features along with the combined feature vector.
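A sketch of forming a statistical feature vector for one segment; the feature set used here (area, mean, standard deviation, minimum, maximum) is illustrative, while the paper's exact features are those listed in Table 1.

```python
import numpy as np

def segment_features(image, mask):
    """Build a statistical feature vector for one segmented region.

    image: grayscale image; mask: boolean mask of the segment.
    """
    values = image[mask].astype(float)
    return np.array([mask.sum(),     # area in pixels
                     values.mean(),  # mean intensity
                     values.std(),   # intensity spread
                     values.min(),   # darkest pixel
                     values.max()])  # brightest pixel

img = np.array([[10, 10, 50], [10, 10, 50]], dtype=np.uint8)
mask = img == 10                     # a hypothetical "tree" segment
fv = segment_features(img, mask)
```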

Classification
Classification is one of the most widely used techniques in machine learning, predicting the output class label y from input features x. The features extracted in the previous section are used to train and test multiple supervised learning algorithms: Naive Bayes, Support Vector Machines, Multi-Layer Perceptron, and Random Forest.

Naive Bayes Classifier
Naive Bayes (NB) is a supervised learning technique based on Bayes' theorem [22]. It rests on the naive assumption of independence among the features, relating the conditional and marginal probabilities of random events. For an input x = (x1, x2, x3, …, xd), a d-dimensional feature vector with no output class label, the algorithm predicts the class based on Bayes' theorem. Let C be a class variable with class labels Cj, j = 1, 2, 3, …, k. P(Cj) is the prior probability of class Cj, P(x | Cj) is the likelihood of the object belonging to class Cj, and P(x) is the prior probability of the predictor. The posterior probability of a class Cj given the predictor x, P(Cj | x), is shown in Equation (6) as

P(Cj | x) = P(x | Cj) P(Cj) / P(x). (6)

The class Cj with the highest posterior probability among all classes is assigned to the input x. The assumed independence of the features from one another is given in Equation (7):

P(x | Cj) = P(x1 | Cj) × P(x2 | Cj) × … × P(xd | Cj). (7)

The Naive Bayes classifier is based on the above equations, and its naive assumption leads to simpler calculation and faster data processing. The two equations can be combined to summarize the algorithm, as shown in Equation (8):

ŷ = argmax over j of P(Cj) × P(x1 | Cj) × … × P(xd | Cj), (8)

where P(x) is omitted as it is the same for all classes.
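A minimal example with scikit-learn's `GaussianNB`, which fits a per-class Gaussian to each feature under exactly this independence assumption; the feature values are hypothetical stand-ins for the segment statistics of Table 1.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy feature vectors (e.g., [area, mean intensity]) for two classes;
# the values are illustrative, not taken from the paper's dataset.
X = np.array([[30, 80], [32, 78], [29, 82],    # olive tree segments
              [5, 200], [6, 210], [4, 205]])   # non-tree segments
y = np.array([1, 1, 1, 0, 0, 0])

# fit() estimates class priors P(Cj) and per-feature likelihoods P(xi|Cj);
# predict() then applies the argmax rule of Equation (8).
clf = GaussianNB().fit(X, y)
pred = clf.predict([[31, 79], [5, 208]])
```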

Support Vector Machines
Support Vector Machines (SVMs) are a set of supervised learning methods proposed by Vapnik in 1995 that minimize the classification error while maximizing the geometric margin between the classes [23]. The classifier works by finding the hyperplane that best separates the data points into the required classes. Once the hyperplane is determined, testing samples are predicted according to the side of the plane on which they fall. Mathematically, the hyperplane is given by Equation (9) as

w · x + b = 0, (9)

where x is an N-dimensional input vector, w is a weight vector described as w = (w1, w2, w3, …, wn), and b is the bias of the model, a scalar related to the perpendicular distance from the hyperplane to the origin.
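A minimal linear SVM example with scikit-learn's `SVC`, recovering the hyperplane parameters w and b; the data points are toy values, not the paper's features.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable toy clusters standing in for segment features.
X = np.array([[1.0, 1.0], [1.5, 1.2], [1.2, 0.8],
              [4.0, 4.0], [4.2, 3.8], [3.8, 4.1]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)   # finds the maximum-margin hyperplane
w, b = clf.coef_[0], clf.intercept_[0] # parameters of w . x + b = 0
pred = clf.predict([[1.1, 1.0], [4.1, 4.0]])
```

The sign of w · x + b decides the side of the hyperplane, and hence the predicted class, for each test sample.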

Random Forest
Random Forest is a classification algorithm consisting of a collection of tree-structured classifiers {h(x, Θk), k = 1, 2, …}, where the {Θk} are independent, identically distributed random vectors [24]. Each tree casts a vote for the input x. Random Forest is an ensemble technique that groups classifiers such as decision trees and classifies instances by the summation of their individual votes. It is popular among classification algorithms due to its high performance.
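A minimal example with scikit-learn's `RandomForestClassifier`, where each of the trees votes and the majority vote gives the prediction; the feature values and forest size are illustrative choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical [area, mean intensity] features for the two classes.
X = np.array([[30, 80], [32, 78], [29, 82], [31, 81],
              [5, 200], [6, 210], [4, 205], [5, 199]])
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])   # 1 = olive tree, 0 = non-tree

# 25 randomized trees are trained; predict() aggregates their votes.
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)
pred = forest.predict([[30, 79], [5, 204]])
```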

Multi-Layer Perceptron (MLP)
Artificial Neural Networks (ANNs) are non-parametric, flexible models comprising several layers of computing elements called nodes. Each node receives an input signal through external inputs and processes it locally through a transfer function, which outputs the transformed signal to other nodes. In an MLP, all nodes and layers are arranged in a feed-forward manner [25]. Any input vector fed into the network is propagated from the first (input) layer, through the hidden layers, to the last (output) layer. A three-layer MLP is a commonly used ANN structure for binary classification problems such as olive tree detection. An example of an MLP with one hidden layer and one output node is shown in Figure 3. The hyperplane is represented by a dashed line in Figure 4.
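A sketch with scikit-learn's `MLPClassifier`, mirroring the three-layer feed-forward structure described above; the hidden-layer width, iteration budget, and data are assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Two well-separated toy clusters standing in for segment features.
X = np.array([[0.1, 0.1], [0.2, 0.15], [0.15, 0.2],
              [0.9, 0.9], [0.85, 0.95], [0.95, 0.85]])
y = np.array([0, 0, 0, 1, 1, 1])

# One hidden layer of 8 nodes between the input layer and the output layer.
mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                    random_state=0).fit(X, y)
pred = mlp.predict([[0.12, 0.12], [0.9, 0.88]])
```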
Statistical features extracted from the segments are fed into the classifiers, resulting in a binary classification map indicating the classified olive trees along with non-olive objects. Classification accuracy is calculated, and correctly classified olive trees are recorded, giving the total olive tree count.

Materials and Methods
This section describes the dataset used to evaluate our proposed method and discusses the parameters by which the method's performance was measured.

Dataset
To evaluate the performance of our proposed algorithm, images were acquired from the SIGPAC viewer of the Ministry of Environment and Rural and Marine Affairs (http://sigpac.mapa.es/fega/visor/). The interface spans the communities of the Spanish territory captured in the form of satellite images. Among those communities is Castilla-La Mancha, covering the province of Toledo, which is notable for its high concentration of olive production [28].
Around 110 images in the visible spectrum, with a spatial resolution of 1 m and a uniform sample size of 300×300 pixels, were taken from the satellite images. Parameters defining the center of the area in Universal Transverse Mercator (UTM) coordinates corresponded to Huso 30 with X = 411,943.23 and Y = 4,406,332.6 [3]. Images taken from the viewer included multiple land covers: houses, roads, shrubs and bushes, rocks, and olive trees. For each image, the land covers were marked, providing the ground truth information necessary for the classification stage.

Performance Evaluation Metrics
Information about how well a classification system has performed in predicting the testing samples against their ground truth values was recorded in a confusion matrix, or error matrix, as shown in Figure 5. True Positive (TP) and True Negative (TN) are the correctly classified test samples of the positive class and negative class, respectively. The positive class in our system represents olive trees, whereas the negative class represents non-olive objects. False Positive (FP) denotes a negative-class sample mis-classified as positive, and False Negative (FN) a positive-class sample mis-classified as negative. Using the information from the confusion matrix, various performance evaluation metrics were calculated, which are briefly discussed below.

Overall Accuracy (OA)
Overall accuracy is the ratio of the number of correctly predicted items to the total number of items to predict. In our binary classification problem, it is the ratio of correctly predicted olive and non-olive samples to the total number of samples. Mathematically, it is given in Equation (10) as

OA = (TP + TN) / (TP + TN + FP + FN). (10)

Commission Error (CE)
Commission error, also known as the False Positive rate, is the mis-classification of an object as the true class when it actually belongs to the false one. The rate at which a non-olive sample is classified as an olive one is given in Equation (11) as

CE = FP / (FP + TN). (11)

Omission Error (OE)

Omission error, also known as the False Negative rate, is the mis-classification of an object as the false class when it actually belongs to the true one. It is the rate at which an olive tree sample is classified as a non-olive one. Omission error is given in Equation (12) as

OE = FN / (FN + TP). (12)

Estimation Error (EE)
Estimation error is the error in the estimated number of samples within a given region relative to the actual number of samples in that region, i.e., the discrepancy between the estimated number of trees and the actual number of trees. Mathematically, it is given in Equation (13) as

EE = |E − A| / A, (13)

where E is the estimated tree count and A is the actual tree count.
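As a sketch, the metrics above can all be derived from confusion-matrix counts; treating the number of samples labelled "olive tree" (TP + FP) as the estimated count is an assumption made here for illustration, and the counts themselves are hypothetical.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics from confusion-matrix counts,
    following the rate definitions used in the text."""
    oa = (tp + tn) / (tp + tn + fp + fn)   # Overall Accuracy
    ce = fp / (fp + tn)                    # Commission Error (FP rate)
    oe = fn / (fn + tp)                    # Omission Error (FN rate)
    estimated = tp + fp                    # samples labelled "olive tree"
    actual = tp + fn                       # actual olive trees
    ee = abs(estimated - actual) / actual  # Estimation Error
    return oa, ce, oe, ee

# Hypothetical counts: 90 trees found, 5 missed, 3 false alarms.
oa, ce, oe, ee = evaluation_metrics(tp=90, tn=102, fp=3, fn=5)
```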

Finding the Optimal Value of K
The value of K in K-Means clustering specifies the number of groups to be formed, which can be determined by the clustering itself. Our methodology utilizes the elbow method [29] to find the optimal value of K. It works by measuring the intra-cluster distances between cluster points and their centroids, given by the Sum of Squared Error (SSE) in Equation (14):

SSE = Σ over i = 1..K of Σ over x ∈ Ci of dist(x, Ci)², (14)

where dist is the Euclidean distance between the cluster members x and cluster centroid Ci. Moving from smaller to larger values of K, the SSE decreases, giving less variation in the intra-cluster distances. The point after which the decrease in SSE flattens gives the optimal value of K. Our method uses K = 4.
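A sketch of the elbow computation on 1-D intensities: the SSE of Equation (14) is evaluated for increasing K, and the value where the curve flattens is chosen. The data and the deterministic initialization are illustrative assumptions.

```python
import numpy as np

def sse_for_k(pixels, k, n_iter=50):
    """Run a simple 1-D K-Means and return the Sum of Squared Error."""
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(n_iter):
        # Assign pixels to nearest centroid, then recompute centroids.
        labels = np.abs(pixels[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = pixels[labels == j].mean()
    return float(((pixels - centroids[labels]) ** 2).sum())

# Three exact intensity groups: SSE drops sharply until K = 3, then flattens.
pixels = np.concatenate([np.full(20, 10.), np.full(20, 120.), np.full(20, 250.)])
sse = [sse_for_k(pixels, k) for k in (1, 2, 3, 4)]
```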

Experimental Results
Step-by-step results are described, elaborated, and compared with existing techniques below. The proposed methods were tested on a desktop computer with an Intel Core i7-7700HQ processor at 2.80 GHz.

Image Pre-Processing Results
As mentioned earlier, LoG was utilized for pre-processing, where input colored images were processed to reduce any noise or errors obscuring the required information. Images were smoothed to normalize noise, followed by sharpening through the Laplacian filter. Some examples of pre-processed images are shown in Figure 6.

Figure 6. Image pre-processing results. The first column represents the input image and its zoomed-in version, whereas the second column shows the corresponding results after pre-processing.

Image Segmentation Results
The ROIs in the images are the olive trees, which were extracted from the background information using K-Means clustering, as discussed in Section 3.3. The average optimal number of clusters was obtained as 4, determined from the abrupt change in SSE for varying values of K. The segments formed as a result of K-Means clustering with K set to 4 are shown in Figure 6.
Segmentation results were validated by measuring the segmentation accuracy of the proposed K-Means clustering method. Accuracy was determined by calculating the overlap percentage of the resultant image with the marked ground truth. It was calculated through Jaccard analysis, measuring the ratio of the intersection of segmentation result A and marked ground data B to the union of A and B. Mathematically, it is given in Equation (15) as

J(A, B) = |A ∩ B| / |A ∪ B|. (15)
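A minimal sketch of this overlap measure on binary masks (the tiny masks below are illustrative):

```python
import numpy as np

def jaccard_index(segmented, ground_truth):
    """Jaccard index of a binary segmentation result A against the
    marked ground truth B: |A intersect B| / |A union B|."""
    a, b = segmented.astype(bool), ground_truth.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

# Segmentation overlaps ground truth on 3 pixels, with 1 extra pixel
# in each mask: |A intersect B| = 3, |A union B| = 5.
seg = np.array([[1, 1, 0], [1, 1, 0]])
gt  = np.array([[1, 1, 1], [1, 0, 0]])
score = jaccard_index(seg, gt)
```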

Classification Results
Statistical features extracted from the segments of images including both olive and non-olive tree components were fed into the classifiers SVM, Naive Bayesian, Random Forest, and MLP.
Data samples denoted by D1, D2, and D3 represent the training to testing ratios of 50 to 50, 60 to 40, and 70 to 30, respectively, as shown in Figure 7. It was observed that increasing the training ratio from 50% to 70% improved the classification results of all classifiers. Naive Bayesian, SVM, Multi-Layer Perceptron, and Random Forest showed overall accuracies of 79.5%, 90.1%, 92.9%, and 96.6%, respectively, at the 50 to 50 division ratio.

Figure 7. Image segmentation results for K set as 4. The first row represents the first cluster; the second row is the second cluster; and the third and fourth rows correspond to the third and fourth clusters, respectively. The scale bar represents a scaling ratio of 1 cm to 400 m.
Increasing the training samples to 60% of the total data samples, the overall accuracies for the classifiers increased to 79.6%, 91.1%, 93.5%, and 97.3%, respectively. The ratio of 70% for training samples resulted in the highest overall accuracy, with Naive Bayesian at 80.1%, SVM at 92.3%, MLP at 94%, and Random Forest performing best at 97.52%. Confusion matrices for classifiers trained and tested at the D3 proportion are shown in Table 2, whereas a comparison of classifiers for the D3 division (70% training and 30% testing) is shown in Table 3.

Table 2. Confusion matrices of the olive tree classification by four classifiers for the D3 division (table constructed as shown in Figure 5).


Discussion
This section discusses the overall results achieved by the proposed method, draws a comparative analysis of computational time between the two versions of K-Means segmentation, and compares the proposed method with existing techniques.

Comparative Analysis with Simple K-Mean
The standard K-Means technique initializes the centroids by random selection among the data points. Clustering is performed around the selected centers as they move towards their final positions during the iterative process. The initial selection of centroids plays an important role in the overall performance of the clustering algorithm, affecting both clustering speed and the quality of the results. A comparative analysis between the two versions of K-Means clustering is shown in Figure 9, which plots the segmentation accuracy for Units 1-6, where each unit consists of almost 1000 foreground olive trees. It can be seen that for Unit 6, both K-Means variants performed equally well.
However, for images with low contrast between background and foreground, such as Units 1-3 and 5, centroid-selected K-Means outperformed the standard version. Centroid-selected K-Means was also able to detect young olive trees, outperforming the simple approach by a wide margin for Unit 4.
Centroid selection speeds up the convergence of the initial centroids to their final positions, reducing the number of iterations to half that of standard K-Means. A reduced number of iterations results in reduced computational complexity.
The average computational time for an image using the improved K-Means comes out to 78 ms, whereas the standard K-Means averages 131 ms. A comparison of the computational time (CT) between the two variants of K-Means is shown in Figure 10.

Comparative Analysis with Benchmark Schemes
Our proposed methodology and its results were compared with existing techniques for olive tree detection and enumeration. The comparison drew on the dataset used, the number of images, the spectral information processed, and the performance evaluation metrics. A tabular form of the analysis is given in Table 4.
Regarding the spectral information of the imagery utilized, Ionis et al.'s ACE method [6], combining red-band and NDVI-based thresholding followed by blob detection, showed an estimation error of 1.24%. Chemin et al.'s [4] thresholding-based method for binarization of multi-spectral images, followed by localization of centers to detect possible olive trees, showed an estimation error of almost 13%. The work of Peters et al. [10], based on developing synergy models combining sensor data followed by a four-step classification algorithm, showed an overall accuracy of 84.3%. The above-mentioned techniques achieved high accuracies; however, all of them utilized multi-spectral information, leading to added computational cost.
Some techniques utilize only the color-band spectral information or less. J. Gonzales' reticular matching technique identified olive trees within a particular area [7] using grayscale images. This probabilistic approach, combining the probability of a tree being in a reticle with its probability of being an olive tree, showed promising results with an overall accuracy of 98%. In [9], Yakoub et al. proposed classifying morphological features of ground objects from satellite imagery using a GPC, detecting about 96% of the olive trees. Juan et al.'s K-Means clustering method [3] over SIGPAC viewer imagery identified almost all olive trees, with a commission rate of zero and an omission rate of one in six of the test cases.
Juan et al. utilized the same imagery with an application of fuzzy logic [8] and showed results quite similar to the K-Means technique. The above-mentioned techniques produced accurate results with less processing information; however, they lacked diversity in terms of both the number of images and the ground classes. Karantzalos et al.'s [5] method of olive tree detection consisted of pre-processing input images acquired from the QuickBird satellite, followed by detecting the local maxima of the Laplacian as olive trees. Their study provided no statistical data with which to measure the performance of the algorithm.
Our five-step classification model was tested over varying ratios of dataset distribution and achieved high accuracy, demonstrating the robustness of our algorithm. The proposed method addressed the shortcomings of existing techniques by accurately identifying olive trees, leading to a reliable tree count. This five-step classification approach, validated over images with diverse ground data in the color spectrum and achieving an overall accuracy of 97.5%, contributes significantly to the existing literature.

Conclusions
In this paper, we propose an automated method for the detection and enumeration of olive trees comprising a multi-step, classification-based implementation model. The model takes an RGB image acquired from the SIGPAC viewer as input. The input image goes through a pre-processing stage that removes additive noise, followed by segmentation using the improved K-Means clustering algorithm. Statistical features are extracted over the connected pixels in each segment and classified using SVM, MLP, Random Forest, and Naive Bayesian classifiers. Among the classifiers, Random Forest outperformed the rest with an overall accuracy of 97.5% at a 70 to 30 ratio of training to testing. With this accuracy, and with a dataset diverse and large enough for proper training of the classifiers, our technique outperformed previous methods. It overcame the limitations of false tree counts and of computationally expensive systems, yielding a computationally efficient, accurate, and robust olive tree detection algorithm. As future work, we will incorporate more features to improve the feature extraction method as well as the classification, and we will explore the detection of olive trees using deep learning.