Feature Extraction for Cocoa Bean Digital Image Classification Prediction for Smart Farming Application

Abstract: The implementation of Industry 4.0 emphasizes capability and competitiveness in agricultural applications, which form an essential part of a country's economy by procuring raw materials and resources. Human workers currently employ the traditional method for the assessment and classification of cocoa beans, which requires a significant amount of time. Advanced agricultural development and procedural operations differ significantly from those of several decades earlier, principally because of technological developments, including sensors, devices, appliances, and information technology. Artificial intelligence, as one of the foremost techniques that revitalized the implementation of Industry 4.0, has extraordinary potential and prospective applications. This study demonstrated a methodology for textural feature analysis of digital images of cocoa beans. The co-occurrence matrix features of the gray level co-occurrence matrix (GLCM) were compared with the convolutional neural network (CNN) method for feature extraction. In addition, we applied several classifiers for conclusive assessment and classification to obtain an accuracy performance analysis. Our results showed that using GLCM texture feature extraction can contribute more reliable results for the final classification than using CNN feature extraction. Our method was implemented through on-site preprocessing on a low-performance computational device. It also helped to foster the use of modern Internet of Things (IoT) technologies among farmers and to increase the security of the food supply chain as a whole.


Introduction
Information technology has indeed brought significant shifts to human life. It is undeniable that technology currently plays an essential role in development processes over time. We are entering the Industrial Revolution 4.0 era, in which Internet of Things (IoT) technologies are highly influential in everyday life. Even in agriculture, such technologies [1,2] play many important roles. Feature extraction is an artificial intelligence (AI) method that selects or consolidates numerous variables into features, which can effectively reduce the amount of data processed while still representing the fundamental dataset. The primary feature extraction method for texture analysis, developed around the 1970s, employs the co-occurrence matrix features introduced by [3].
Utilization of the feature extraction [4] method, often combining several feature extraction procedures, has been common practice for image classification since the development of machine learning schemes. This study investigated a systematic approach to detect classification types of cocoa beans based on industry standards for processing cocoa beans. This approach utilizes digital images of cocoa beans to expedite their assessment and classification, which currently requires manual assessment and classification by humans (Figure 1a). We designed a smart farming framework concept scheme (Figure 1b) to explain our requirements and the use of our research methodology. Located in a remote area, this framework utilizes a camera as a sensor node to capture cocoa bean images to be analyzed. Next, for local processing, the classification process can be accomplished by employing a low-power computational device integrated with and connected to a low-bandwidth long-range communication network. The data are later transmitted to cloud storage to be further analyzed in real time by the processing factory. The expected outcome of this proposed approach is a decrease in the processing time for assessment and classification analysis, which results in a reduction in time for the further processing of cocoa beans. Our proposal consists of utilizing the co-occurrence matrix features of the GLCM and a CNN for feature extraction.

Development of a New Methodology for the Textural Feature Analysis of Digital Images of Cocoa Beans
Image feature extraction is one of the steps of extracting data from objects in an image to distinguish them from other objects. The extracted features are then used as parameter values to distinguish between objects as a part of the classification stages.
The GLCM is an example of the basic and most traditional method of feature extraction and employs first-or second-order statistical characteristics. Toward the evolution of machine learning, CNN also utilizes the feature extraction process.
There are several steps for classifying cocoa beans: the first is image capturing, that is, obtaining images of cocoa beans and storing them in electronic file form. The second is preprocessing, which involves eliminating unnecessary parts of an image. Next is the feature extraction; within this study, we adopted the GLCM and CNN methods. From numerous steps of the classification process, the final result is the classification of the cocoa beans.
The classification of cocoa beans is used as the basis for processing cocoa beans to achieve the best results in the finished products. The final result of the classification process of cocoa beans can later be stored in the cloud, as cloud storage can store or archive data easily, so as to prevent data from being damaged or lost when stored on local storage media, such as hard disks or flash drives.

Materials
The digital images of the cocoa beans for the study were procured from South Sulawesi, Indonesia. The sampling method was based on [35,36]. These cocoa bean samples were classified as follows: (1) whole beans: cocoa beans with a whole seed coat covering all of the seed parts and not showing any fracture (Figure 2a); (2) broken beans: cocoa beans with a missing portion that is half (1/2) or less of the whole cocoa bean (Figure 2b); (3) bean fractions: cocoa bean fractions that are less than half (1/2) of the whole cocoa bean (Figure 2c); (4) skin-damaged beans: cocoa beans with a missing bean shell that is half or less in size of the whole cocoa bean (Figure 2d); (5) fermented beans: cocoa beans that are the final product of the curing process and that are washed or unwashed and dried (Figure 2e); (6) unfermented beans: cocoa beans in which half or more of the sliced surface shows grayish chips (e.g., slate or solid grayish blue in color and texture) while the surface is dirty white (Figure 2f); (7) moldy beans: cocoa beans with mold on the inside, visible to the eye when the beans are split open (Figure 2g). The cocoa bean sample classification utilized the Indonesian Standardization Institution's base classification as a regulatory reference for exporting Indonesian cocoa beans [37].
These cocoa bean digital images were collected at the factory, and the final goal was to help reduce the classification process at the factory site. The cocoa bean image acquisition was achieved using a compact digital camera, as depicted in Figure 3.
The digital image dataset of the cocoa beans contained seven classes of cocoa bean classifications. Among all of the 7428 images, 1187 were listed as whole beans, 1046 as broken beans, 426 as bean fractions, 822 as skin-damaged beans, 916 as fermented beans, 1776 as unfermented beans, and 1255 as moldy beans (see Figure 4).
We split the dataset into 75% for training, 10% for testing, and 15% for validation to conduct the research and analysis. The small visual appearance of the cocoa beans and the rather large number of relevant classes made it necessary to use rather large shares of the material for training.
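As a hedged illustration of this 75/10/15 split (the function name and seed are ours, not from the study), an index-based split over the 7428 images can be sketched as:

```python
import numpy as np

def split_dataset(n_samples, train=0.75, test=0.10, seed=42):
    """Shuffle sample indices and split them 75/10/15 (train/test/validation)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(n_samples * train)
    n_test = int(n_samples * test)
    # The remaining indices (about 15%) form the validation set.
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]

train_idx, test_idx, val_idx = split_dataset(7428)  # 7428 images in the dataset
```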

Methodology
The flowchart of the proposed methodology, as shown in Figure 5, consists of several steps: the first part is data retrieval; the second is data preprocessing; the next is feature extraction; and the last is the classification and summary of the results.

Experimental Environment
All experiments for this study were performed on a Dual-Core Intel Core i5 laptop @ 2.7 GHz with 8 GB of 1867 MHz DDR3 RAM. The tools utilized to implement the experimental study for the proposed approach were MATLAB R2019a, the Python programming language, and the Waikato Environment for Knowledge Analysis (WEKA) software [38].


Image Data Preprocessing
The preprocessing of digital images is the starting point for ensuring smoothness and success in the subsequent digital image processing steps. This method includes enhancing the image quality (contrast, brightness, etc.), noise reduction, image adjustment or reconstruction, image transformation, and determination of the part of the image to be evaluated. Recent research has shown that an image classification method might correctly label images based only on background content, i.e., even with the foreground objects removed [39,40].
Experimental evidence shows that visual hints can be learned independently and then combined to achieve higher performance, which validates the advantages of the proposed framework. The preprocessing of cocoa bean digital images is performed by applying simple object detection (segmentation and feature extraction), measurement, and filtering. First, all of the objects are obtained, and then the results are filtered to separate targets of distinct sizes.
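A minimal sketch of this segment-and-filter-by-size step, assuming a thresholded grayscale image and illustrative parameter values (`thresh` and `min_area` are our placeholders, not the study's settings):

```python
import numpy as np
from scipy import ndimage

def filter_objects(gray, thresh=0.5, min_area=50):
    """Simple object detection: threshold the image, label connected
    components, and keep only components of at least min_area pixels."""
    mask = gray > thresh
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    big = 1 + np.flatnonzero(np.asarray(sizes) >= min_area)  # labels to keep
    return np.isin(labels, big)
```

On a synthetic image, a large foreground object survives the filter while a small speck is removed, mirroring the size-based separation of targets described above.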
Feature Extraction: Convolutional Neural Network (CNN)
We used cocoa bean digital image data; in the first step, the cocoa bean images were labeled according to [35-37], saved to our local storage, and loaded into an array based on each class. This full dataset array was split into 75% for training, 10% for testing, and 15% for validation, and then a CNN was built using the architecture layers in Table 1. In our CNN architecture for feature extraction, we used several CNN layers. The first is the convolutional layer, which generates a feature map used to predict the class probabilities for a particular feature by applying a filter that examines the complete image several pixels at a time.
The second layer is the pooling layer (down-sampling); the down-sampling of the pooling layer process consists of decreasing the image size while preserving its essential characteristics. This layer reduces the number of parameters and computations in the network, while the process in this layer enhances the performance of the network and avoids over-learning.
The next layer is the flattening layer, in which the outputs generated by the previous layers are transformed into a single vector as an input for the next layer. Lastly, the final layer is the fully connected layer. In this layer, the process scales down the amount of information and maintains the essential information (the convolutional and pooling layers are usually repeated several times). We trained our model by applying dataset augmentation using the parameters in Table 2 and epochs of 5, 10, 15, 20, and 25, and we extracted features using an intermediate model, so that the fully connected layer became the "output" layer of the CNN. We saved these feature extraction data to become the input for the classification process. Details of the CNN feature extraction results can be seen in Figures 6-10 and Table 3.
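The roles of the convolutional, pooling, and flattening layers can be illustrated with a minimal NumPy sketch; this is not the study's actual network (whose layers are listed in Table 1), and the kernel here is a hypothetical example:

```python
import numpy as np

def conv2d(img, kernel):
    """Convolutional layer: slide a filter over the image, a few pixels
    at a time, to build a feature map."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Pooling layer: down-sample while preserving the strongest responses."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def extract_features(img, kernel):
    """Flatten the pooled map into a single feature vector, mimicking the
    intermediate-model output used as classifier input."""
    return max_pool(np.maximum(conv2d(img, kernel), 0.0)).ravel()
```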
(a) (b)          Image data augmentation is a technique that implements artificial enlarging of the size of a training dataset by creating a modified version of that dataset's image. This process can improve the model's performance and ability to generalize. A detailed description of the parameters for our image data augmentation process is shown in Table 2.
An epoch is a series of steps for learning in a neural network: one epoch is completed when the entire dataset has gone through the neural network training process once and returns to the beginning for the next round [41]. For our method, we used epochs of 5, 10, 15, 20, and 25 as the experimental parameters for the CNN feature extraction. Figures 6-10 show the accuracy and loss for the feature extraction method applied to the cocoa bean dataset. Training loss is the summation of the errors made on the training dataset, and it also indicates the model's performance behavior after each iteration. Meanwhile, validation loss is the error obtained after running the validation dataset through the previously trained network. As shown in Table 3, as the epoch number is increased, the accuracy of the evaluation model increases, while the loss value decreases.

Feature Extraction: Gray Level Co-Occurrence Matrix (GLCM) Textural Features
The GLCM method is essential in the statistical approach to extracting texture features. The mathematical calculation uses a gray-level distribution (histogram), measuring the degree of contrast, granularity, and roughness of an area from the relationships between neighboring pixels in the image. In the first step of the GLCM calculation, the image is converted to grayscale: a weighted sum of the red, green, and blue (RGB) channels is taken following the BT.601 standard [42], with the coefficients R = 0.30, G = 0.59, and B = 0.11, to achieve a reasonable gray approximation. The second step is calculating the co-occurrence matrix up to the maximum value of the grayscale pixels. After calculating the co-occurrence matrix, a symmetrical matrix is created by adding a transposed copy of the co-occurrence matrix to the original. The next step is normalizing the symmetrical GLCM by dividing each element by the sum of all elements. The final step is computing the textural features from the normalized GLCM matrix [43].
The textural features extracted from each of the GLCM matrices used in this research are as follows:

Contrast = \sum_{i} \sum_{j} (i - j)^2 \, p(i, j)

Dissimilarity = \sum_{i} \sum_{j} |i - j| \, p(i, j)

Inverse Difference Moment = \sum_{i} \sum_{j} \frac{p(i, j)}{1 + (i - j)^2}

Angular Second Moment (ASM) = \sum_{i} \sum_{j} p(i, j)^2

Energy = \sqrt{ASM}

Correlation = \sum_{i} \sum_{j} \frac{(i - \mu_x)(j - \mu_y) \, p(i, j)}{\sigma_x \sigma_y}

Sum of Squares: Variance = \sum_{i} \sum_{j} (i - \mu)^2 \, p(i, j)

Notation: (i, j) are GLCM coordinates, each ranging from 0 to N_g − 1; p(i, j) is the (i, j)th entry in the normalized gray-tone spatial-dependence matrix; N_g is the number of distinct gray levels in the quantized image; (x, y) denotes pictorial information represented as a function of two variables; μ_x and μ_y are the first-order statistical moments (means) of the quantized image; σ_x and σ_y are the second-order statistical moments (standard deviations) of the quantized image.
A generated GLCM feature matrix can represent a picture with fewer parameters [3,13] using the GLCM textural feature properties. When applying this method, as represented in Figure 11, we used the image datasets and applied GLCM texture extraction to them. For the computation of the co-occurrence matrix implemented in this study, the distance was 1 and the angles were 0°, −180°, −90°, 90°, and 180°. Table 4 provides an example of the extracted features with distance = 1 and horizontal angle = 0°. Figure 11 shows the GLCM feature extraction process (RGB: red, green, and blue).
Seven features, i.e., contrast, dissimilarity, inverse difference moment, angular second moment, energy, correlation, and sum of squares variance, were extracted, as depicted in Table 4. Seven classes of cocoa bean classifications based on SNI 2328:2008 [37], i.e., whole beans, broken beans, bean fractions, skin-damaged beans, fermented beans, unfermented beans, and moldy beans, were extracted from the cocoa bean digital images.
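The GLCM pipeline described above (BT.601 grayscale conversion, co-occurrence counting at a given offset, symmetrization, normalization, and feature computation) can be sketched as follows; the quantization to 8 gray levels and the function name are illustrative choices, not the study's exact settings:

```python
import numpy as np

def glcm_features(rgb, levels=8, dx=1, dy=0):
    """Compute a few GLCM textural features for one offset (distance 1, angle 0)."""
    # Step 1: BT.601 weighted grayscale (R = 0.30, G = 0.59, B = 0.11).
    gray = 0.30 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]
    q = np.clip((gray / 256.0 * levels).astype(int), 0, levels - 1)
    # Step 2: co-occurrence counts for pixel pairs at the chosen offset.
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    glcm = glcm + glcm.T          # Step 3: symmetrical matrix
    p = glcm / glcm.sum()         # Step 4: normalized GLCM
    # Step 5: textural features from the normalized matrix.
    i, j = np.indices(p.shape)
    asm = np.sum(p ** 2)
    return {"contrast": np.sum(p * (i - j) ** 2),
            "dissimilarity": np.sum(p * np.abs(i - j)),
            "asm": asm,
            "energy": np.sqrt(asm)}
```

On a perfectly uniform image, all pixel pairs fall into one gray-level bin, so contrast and dissimilarity are zero and energy is one, which is a quick sanity check of the implementation.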

Classification Prediction
For the next step of our method, based on Figure 2, we implemented the classification process using our dataset. As previously mentioned, we split our dataset into 75% for training, 10% for testing, and 15% for validation. We used the testing (10%) and validation (15%) data and then implemented the classification process to a model that had been previously trained.
This implementation was carried out using: 1. The feature extraction model using CNN; 2. The feature extraction model using GLCM.
In addition to these two models, we also carried out a classification process using WEKA to compare the classification results. WEKA is one of the open source tools for training and testing deep learning data without programming code. We ran an experiment using the following steps: loading cocoa bean digital image data that were previously preprocessed using GLCM textural features in the attribute-relation file format (ARFF) (Figure 12), and then running the experiment based on the flow shown in Figure 13 to retrieve the classification results.
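Feeding the extracted feature vectors to a classifier can be sketched with a linear-kernel SVM from scikit-learn; the feature values below are synthetic stand-ins for the GLCM/CNN features, not data from the study:

```python
import numpy as np
from sklearn.svm import SVC

def train_and_score(train_X, train_y, test_X, test_y):
    """Fit a linear-kernel SVM on feature vectors and return test accuracy."""
    clf = SVC(kernel="linear")
    clf.fit(train_X, train_y)
    return clf.score(test_X, test_y)

# Synthetic stand-in: two well-separated classes of 7-dimensional features
# (seven values per image, as with the GLCM features).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.05, (40, 7)),
               rng.normal(0.8, 0.05, (40, 7))])
y = np.array([0] * 40 + [1] * 40)
acc = train_and_score(X[:60], y[:60], X[60:], y[60:])
```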

In the data visualization shown in Figure 12, the seven extracted GLCM feature values are displayed in graphical form, and the distribution of values for each feature is in the 0-1 range.

Results
To analyze the performance of the two feature extraction methods, we applied two types of classifiers, SVM (linear kernel) and extreme gradient boosting (XGBoost), and compared the results to those obtained using the WEKA tools (SVM and AdaBoost), as shown in Table 5.
From Table 5, we can summarize our research results as follows. A large number of epochs has an immense effect on the accuracy of the identification of the image being tested: the larger the number of epochs, the more accurate the identification of the image. As presented in Table 5, our results show that using the GLCM textural features method for feature extraction can achieve better and more promising classification accuracy in comparison to CNN feature extraction. As our study concentrated on the proposed feature extraction methodology, we conclude that our results are acceptable.

Discussion
Our proposed method, when utilizing CNN feature extraction, achieved accuracies of 59.14% and 56.99% with the SVM and XGBoost classifiers, respectively, and accuracies of 61.04% and 65.08%, respectively, when using GLCM textural features.
The highest rate achieved by [22] with the SVM classifier was a discrimination rate of 100% for the prediction and training sets. The authors in [25] proposed six CNN layers for dataset 1 and fine-tuned the pre-trained model, resulting in accuracies of 99.49% and 99.75% for the first model on dataset 1, and 85.43% and 96.75% on the second dataset. In [26], the authors used SVM and the genetic algorithm with shape, texture, and color features, thereby achieving classification rates of 96-98% for apples, 95-97% for grapes, and 96-97% for bananas. The four-class classification conducted by [23] achieved classification rates of 84% for whole beans, 52% for broken beans, 20% for bean fractions, and 20% for skin-damaged beans, due to the non-uniform shape of the cocoa beans. The authors in [27] used expensive sensor equipment, such as liquid and gas chromatography, and machine learning methods for classification to investigate the fermentation rate of cocoa beans, resulting in the following misclassification rates: bootstrap forest, 9.40%; ANN, 12.80%; boosted tree, 13.60%.
It appears that compared to other farming products, the classification of cocoa beans is still a challenging problem; however, our results provide a more homogeneous classification rate for all classes.

Conclusions
We demonstrated our method by implementing textural feature extraction and classification of cocoa bean digital images, consisting of seven classes of beans. By utilizing the co-occurrence matrix features of the GLCM, we extracted seven features.
In our method, the CNN feature design repeatedly applies and collects all features found from several runs, with a weighted sum of the RGB channels and data augmentation with epochs of 5, 10, 15, 20, and 25. It can be used to represent the dataset features, and increasing the number of epochs can increase the final recognition rates.
From our results, we can conclude that our method, utilizing the GLCM texture features method as one part of the feature extraction process for cocoa bean digital images, can achieve promising results.