Article

A Real-Time Nut-Type Classifier Application Using Transfer Learning

Department of Software Engineering, Manisa Celal Bayar University, Manisa 45400, Turkey
Appl. Sci. 2023, 13(21), 11644; https://doi.org/10.3390/app132111644
Submission received: 24 September 2023 / Revised: 23 October 2023 / Accepted: 24 October 2023 / Published: 24 October 2023
(This article belongs to the Special Issue Applications of Deep Learning and Artificial Intelligence Methods)

Abstract

Smart environments rely on artificial intelligence (AI) today and will likely continue to do so in the foreseeable future. Shopping has recently been identified as an environment in need of digitization, especially for the payment processes of both packaged and unpackaged products. For unpackaged nuts in particular, machine learning models can be applied to a newly collected dataset to identify the nut type. Furthermore, transfer learning (TL) has been identified as a promising method to reduce the time and effort required to obtain learning models for different classification problems. There are common TL architectures that can be used to transfer learned knowledge between different problem domains. In this study, TL architectures including ResNet, EfficientNet, Inception, and MobileNet were used to obtain a practical nut-type identifier application that addresses the challenges of implementing a classifier for unpackaged products. In addition to the TL models, we trained a convolutional neural network (CNN) model on a dataset of 1250 images of 5 different nut types, prepared from online and manually captured images. The models were evaluated according to a set of parameters including validation loss, validation accuracy, and F1-score. According to the evaluation results, the TL models show promising performance with 96% validation accuracy.

1. Introduction

Artificial intelligence (AI) has helped to construct smart environments and changed daily habits through recent widespread usage. In [1], it is stated that we are at the beginning of a technological revolution based on the paradigms of Industry 4.0 and Society 5.0. Thus, it is important to exploit the recent innovations in digital technologies led by AI. Civilization should opt into this progress by utilizing smart solutions in different aspects of daily life as early as possible to reap their benefits in the long run [2]. Recently, the current potential of AI has been noticed by many countries and industrial companies. As a result of this awareness, products are being created in many different fields, including health, economy, agriculture, defense industry, and sports, using a set of tools consisting of AI, machine learning (ML), and image processing [3]. These studies are conducted to facilitate human life, increase its quality, and accelerate business processes.
There are also AI applications that are utilized in the daily routines of society, where an application can easily be used by an ordinary end-user. Shopping is one of the most important parts of the daily routine, and the shopping process has significantly changed in recent years. The shopping that used to be undertaken in smaller and traditional stores in the past has now been replaced by market chains and shopping centers. This transition has led to a new era of smart shopping. Kiosk hardware, internet of things (IoT) devices, near-field communication (NFC) technology, etc., are contributing to the transition to these new shopping habits. AI is also contributing to this new way of shopping by participating in the development of consumer-oriented smart shopping applications. For example, thanks to the walk-out technology used in Amazon's revolutionary Amazon Go markets [4], shopping can be undertaken without queuing and human interaction. In such applications, the data of the AI solution are processed by one of various methods, such as image processing, data mining, IoT, NFC, radio frequency identification (RFID), etc. Various image-processing technologies, such as barcode and character recognition, have already been used for packaged products in market chains and shopping centers. As a result, vendors have started to set up smart payment terminals to facilitate the payment process of the customers [5]. It can be said that the recent advances in IoT technologies make it possible to conveniently sell packaged products. On the other hand, unpackaged products, such as nuts, fruits, vegetables, etc., are still widely consumed in low- and middle-income countries due to their lower prices, availability, and affordability [6,7]. Therefore, there is a need for appropriate applications to digitize the purchase of unpackaged products.
However, it is difficult to recognize such products using advanced AI techniques due to environment-specific challenges and the lack of barcode, RFID, and NFC components.
There are several attempts in the literature to alleviate the purchase of unpackaged products. Ref. [8] conducts a study of the classification of 15 different fruit and vegetable classes for usage in supermarkets. The authors focus on the ambient lighting conditions of the environment and the color variations in different fruits and vegetables. They propose to use a deep convolutional neural network (DCNN) to reduce the classification errors and find that the experiments with the trained models give a prediction accuracy of up to 99%. Ref. [9] proposes to use deep learning (DL) to classify six different fruits in a supply chain and merchandising context. In their study, the authors propose to utilize fusion-based feature extraction from fruit images and common DL architectures for model training. Consequently, the authors reported that they achieved a success ratio of 97% to recognize different fruits. Ref. [10] develops a mobile application that analyzes the nutritional value of a food item captured via a camera to help people to live a healthy life. The authors state that their framework can achieve 85% accuracy for food recognition. Ref. [11] aims to recognize Korean foods offered in restaurants and stores. The authors propose a custom learning model based on CNN and state that the model achieves 91.3% accuracy for food prediction. It can be argued that the classification of ready-to-eat food or unpackaged products of larger sizes becomes more straightforward using basic image processing techniques, as applied in [12,13]. On the other hand, smaller products such as nuts are somewhat more difficult to classify. In addition, there might be some customer-related issues when paying for unpackaged products. For example, a customer may inadvertently select the wrong product to pay for (or maliciously quote a lower price for another product) during the payment process at existing retail payment terminals [14]. 
Therefore, a smart solution is needed to digitize the process of purchasing unpackaged products.
Nuts are a product used in many industries, as well as being widely consumed as unpackaged products sold in nut stores and shopping centers [15,16,17]. The digitalization of society has also generated AI solutions for classifying different nut types for daily routines. There have been attempts in the literature to classify different nuts or determine the species and maturity level of a specific kind of nut by utilizing image-processing techniques and AI models. Ref. [18] classifies the cashew species into five categories based on adulteration. The authors propose to use DCNN and utilize a transfer learning (TL) architecture for model training. Eventually, they state that an accuracy of 98% is achieved in classifying first-class cashews. Ref. [19] classifies different species of pine nuts by utilizing spectroscopy and image analysis. The authors state that an accuracy of 95% is achieved when NIR spectroscopy is used. In [20], the authors classify 11 different types of nuts, achieving an accuracy of 97% using ResNet-50 and 98% using DenseNet-201. The study can be criticized for only presenting theoretical results rather than a practical implementation.
Even though the studies in the literature produce satisfactory results with high accuracy values for different unpackaged products marketed in the studied shops, most of them are not validated through real-time application by end-users. In a real-world solution, a practical application should continuously capture images from a video stream and classify the content according to the learning model. Moreover, the classification effort is intensive in the existing studies, which try to provide proper image processing techniques and convenient learning models. On the other hand, the TL can be consulted to reduce both efforts, owing to its capability to exploit existing knowledge. More specifically, there are well-known TL architectures whose models are pre-trained on datasets of millions of image samples over proven learning architectures. Hence, considering the aforementioned motivations, in this study, a nut-type classifier based on the TL is proposed to provide a practical smart application for nut sellers. The main contributions presented in the study are stated as follows:
  • A dataset of 1250 images of 5 different nut types is constructed by combining nut images from online sources with manually captured pictures.
  • The TL is applied to the nut-type classification problem, and four different TL architectures are consulted to train classifier models on the dataset. Moreover, a custom CNN model is trained to compare the success of TL models.
  • A practical application developed using the Python language is presented to detect and recognize nuts in real time.
The rest of the paper is organized as follows: In Section 2, the materials and methods used in the study, and the components of the performance evaluation are presented. Subsequently, in Section 3, the evaluation results are presented and discussed. Finally, in Section 4, the paper concludes by discussing the future directions.

2. Materials and Methods

In this study, a dataset containing five different types of nuts was prepared, four different transfer learning architectures plus a custom CNN model were utilized for training, and a Python application was developed for practical real-time classification.

2.1. Data Acquisition

The data used in this study consist of both manually captured images and images obtained from the internet. Five different types of nuts were considered: pistachio, peanut, pumpkin seed, sunflower seed, and roasted chickpea. Since the dataset contained manually captured images, the resulting learning models were better suited to a practical real-time software application. Samples of the dataset are illustrated in Figure 1. The number of samples for each nut type and the distribution of the samples between those manually captured and those obtained from the internet are given in Table 1.
We planned to acquire a balanced dataset, since balance is important for successful model training. Therefore, the number of samples for each nut type was rounded up to 250 by adding images from the internet to those manually captured in a nut shop. As a consequence, a total of 1250 images were obtained for the dataset, of which about 85% were manually acquired.
Data augmentation was also applied to the constructed image dataset. For each of the images, 3 major operations were applied before the model training stage: rescaling, rotation, and flipping. Thus, a total of 3 × 1250 = 3750 images were used in the learning process of each model.
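The augmentation pipeline described above can be sketched with Keras's `ImageDataGenerator`. This is an illustrative sketch only: the rotation range, batch size, and directory layout are assumptions, as the paper does not list these exact values.

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# The three augmentation operations described above.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,      # rescaling: normalize pixel values to [0, 1]
    rotation_range=30,      # rotation: random rotations (30 degrees is an assumption)
    horizontal_flip=True,   # flipping: random horizontal flips
)

def make_train_generator(data_dir, image_size=(300, 300)):
    """Stream augmented batches from a folder with one sub-folder per nut type."""
    return datagen.flow_from_directory(
        data_dir,
        target_size=image_size,   # 300 x 300 matches the custom CNN input size
        batch_size=32,
        class_mode="sparse",      # matches sparse categorical cross-entropy
    )
```

Each epoch then sees randomly transformed variants of the original images rather than a fixed enlarged set, which is the usual Keras realization of the augmentation described.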

2.2. Transfer Learning

Data collection is an effortful process and incorporates many challenges [21]. For this reason, high-performance models trained on existing datasets are needed so that their cumulative knowledge is not discarded when building a new application. The TL was utilized to ensure the success of models on datasets consisting of fewer instances [22]. In other words, there are known TL architectures that contain pre-trained models with beneficial knowledge that can be reused for a new problem. Figure 2 shows the working principle of the TL approach used. As shown in the figure, the ImageNet dataset, containing more than 14 million images in about 20,000 class labels, was used to pre-train the models. Based on the pre-trained models, we tried to solve a new problem via knowledge transfer for the classification of nuts. More specifically, in this study, a nut dataset consisting of 1250 images was used to predict 5 different types of nuts using the previous weights of pre-trained models based on the Inception, MobileNet, EfficientNet, and ResNet TL architectures.
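This working principle can be sketched with the Keras applications API: an ImageNet-pretrained backbone is frozen and only a new 5-node output layer is trained. The pooling head shown here is a common default and an assumption, as the paper does not publish the exact head architecture; the loss and optimizer follow the training setup described in Section 2.3.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5  # pistachio, peanut, pumpkin seed, sunflower seed, roasted chickpea

def build_transfer_model(backbone_fn, input_shape, weights="imagenet"):
    """Wrap a pre-trained backbone with a new task-specific classification head."""
    base = backbone_fn(include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the transferred ImageNet weights frozen
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # new 5-node output layer
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# One model per TL architecture evaluated in the study, e.g.:
# model = build_transfer_model(tf.keras.applications.InceptionV3, (299, 299, 3))
# model = build_transfer_model(tf.keras.applications.ResNet50, (224, 224, 3))
```

The same builder is reusable across all four architectures, which is what makes the TL comparison in Section 3 practical to set up.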

2.2.1. Inception

Inception V3 is a widely used deep neural network consisting of 42 layers and is available pre-trained in TensorFlow. Its network architecture is suitable for problems similar to the one presented in this study. Inception V3 contains many convolution and max-pooling layers. Training the Inception V3 model from scratch on computers with limited computational capability is difficult and takes a long time. For this reason, TensorFlow allows the last layer of the Inception model to be re-trained for the identified problem using the TL. No changes were made to the parameters of the previous layers. The number of nodes in the last layer corresponds to the number of categories in the dataset. For example, the ImageNet dataset has 1000 categories, so there are 1000 output nodes in the last layer [23]. Since there are 5 categories in the nut dataset prepared in this study, there are 5 output nodes in the last layer. The main difference between traditional models and Inception V3 is that the computational overhead is reduced by using pooling operations after the convolution operations [24].

2.2.2. MobileNet

MobileNet is an architecture that is heavily used for classification and image recognition tasks in mobile applications and embedded systems. The MobileNet architecture provides efficient processing through separable convolutions, yielding a structure that is fast, small, and suitable for mobile devices [25]. MobileNet-V3-Large, as used in this study, has a smaller number of weights than the previous versions and is pre-trained with the ImageNet dataset [26]. The model is based on depthwise separable convolution, a type of factorized convolution. Counting depthwise and pointwise convolutions as separate layers, the architecture has 28 layers. Thanks to depthwise convolution, the number of parameters, the computational cost, and the model complexity are reduced. Thus, the model is efficient on devices with limited computational capability [27]. The application-specific run-time environment of this study overlaps with the usage areas of MobileNet. Hence, we evaluated the performance of MobileNet-V3 through the experiments.
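The parameter reduction from depthwise separable convolution can be verified directly in Keras by comparing a standard convolution with its separable counterpart on the same input; the 32 × 32 × 64 input shape below is an arbitrary illustrative choice.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_params(layer, input_shape=(32, 32, 64)):
    """Build the given layer on a fixed input shape and return its parameter count."""
    inp = tf.keras.Input(shape=input_shape)
    model = tf.keras.Model(inp, layer(inp))
    return model.count_params()

# A standard 3x3 convolution mixes all channels at once; a depthwise separable
# convolution factorizes it into a per-channel 3x3 filter followed by a 1x1
# pointwise convolution, which is the core of the MobileNet efficiency gain.
standard = conv_params(layers.Conv2D(128, 3, padding="same"))
separable = conv_params(layers.SeparableConv2D(128, 3, padding="same"))

assert separable < standard / 5  # well under one fifth of the parameters
```

For this configuration the standard convolution has 3·3·64·128 + 128 = 73,856 parameters, while the separable version has only 3·3·64 + 64·128 + 128 = 8,896.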

2.2.3. EfficientNet

EfficientNet is another TL architecture used in this study; the specific version used is EfficientNet-V2-Bk-21k. EfficientNet is a CNN architecture combined with a scaling method. Its developers studied how the depth, width, and input resolution of the network affect its performance and, in light of this finding, identified a compound scaling method that scales all three dimensions of the network uniformly [28]. Therefore, we applied and evaluated the EfficientNet architecture for the TL-based nut-type recognition proposed in this study.

2.2.4. ResNet

CNNs have ushered in a new era in solving the problem of object recognition and classification, and it has become popular to build deeper networks to classify more complex problems or improve accuracy. ResNet models are built on CNNs with residual (skip) connections [29]. There are several variants of ResNet, such as ResNet-18, ResNet-50, and ResNet-101; the number attached to the term ResNet denotes the number of layers included in the network [30]. The more layers the underlying neural network has, the greater its depth. In this study, the 50-layer ResNet-50 architecture was used, specifically the ResNet-50-X1 version, to apply TL. We include ResNet-50 in our study because it is a commonly used, high-performance TL architecture.

2.3. Performance Evaluation

The evaluation environment for the performance investigation included a set of development tools and a desktop application developed in this study using the Python programming language. An object-oriented application development approach was used during the development process. Five different learning models, including four TL models and a custom CNN model, were embedded into the application. The learning models evaluated in the study were developed using Jupyter Notebook and the Spyder IDE via the Anaconda distribution. The TensorFlow, Pandas, NumPy, and Keras libraries were utilized to train the models. After training the models, the performance metrics were calculated and plotted using the Matplotlib library. The OpenCV library was used to capture real-time image data from the camera hardware. The desktop application embedded the models with the optimal hyperparameters among those trained in Jupyter Notebook. The interaction between the development tools included in the evaluation environment is shown in Figure 3.
The application was able to recognize and classify only one type of nut at a time; the classification of a mixture of different nuts was not considered in this study. An example of the desktop application execution can be found in Figure 4. The application window displays the stream captured via the camera in real time. The OpenCV code developed in the application detected the nuts, and the learning models trained in the study recognized the type of nuts. The confidence of the prediction was also displayed at the top of the video stream. For example, as shown in the figure, the application detected that the nuts were roasted chickpeas with 95.5% confidence. The application was tested on a computer with a 64-bit Windows 10 operating system and a 3.7 GHz CPU. For the detection of nuts, an embedded camera with the ability to record video in real time (1080p) was used. The application is ready to be deployed on a kiosk machine, terminal hardware, or other types of smart devices that support the Python run-time specifications. The source code can also be customized to use the application in a nut shop.
The performance of the different learning models was evaluated with respect to a set of parameters including loss, accuracy, validation loss, validation accuracy, and F1-score. For this purpose, two different strategies were established to compute the performance metrics. Firstly, all data were used for both training and testing to calculate the loss and accuracy. Secondly, 50 samples from each class were randomly selected to form a test set, and the validation loss and validation accuracy were calculated for models trained with the remaining 200 samples per class. In addition, the F1-score values for the test data were calculated for the different learning models trained. The loss values are calculated according to the cross-entropy loss [31], as given in Equation (1):
$L_{CE} = -\sum_{i=1}^{n} t_i \log(p_i)$
for $n$ classes, where $t_i$ is the truth label, and $p_i$ is the predicted (softmax) probability of belonging to the $i$th class.
The accuracy parameters were calculated with respect to true-positive (TP), true-negative (TN), false-positive (FP) and false-negative (FN) predictions. The concrete expression is given in [31] and formulated by Equation (2) as follows:
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
The F1-score parameter is calculated with respect to the precision value and recall value, as introduced in [31] and stated in Equation (3):
$F1\text{-}score = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
where the precision value is calculated as $\frac{TP}{TP + FP}$, and the recall value is calculated as $\frac{TP}{TP + FN}$, according to [31].
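Equations (1)-(3) translate directly into a few lines of Python; the following sketch implements them as stated, with the small numbers in the comments chosen only for illustration.

```python
import numpy as np

def cross_entropy(t, p, eps=1e-12):
    """Equation (1): L_CE = -sum_i t_i * log(p_i), for one-hot t and probabilities p."""
    return -float(np.sum(t * np.log(np.clip(p, eps, 1.0))))

def accuracy(tp, tn, fp, fn):
    """Equation (2): correct predictions over all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Equation (3), via precision = TP/(TP+FP) and recall = TP/(TP+FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: 45 TP, 45 TN, 5 FP, 5 FN gives accuracy 0.9 and F1-score 0.9.
assert accuracy(45, 45, 5, 5) == 0.9
```

Note that for a one-hot truth vector, Equation (1) reduces to the negative log of the probability assigned to the correct class, which is exactly what the sparse categorical cross-entropy loss used during training computes.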
The models were trained by considering the hyperparameters given in Table 2. The input size of the images changed depending on the different TL architectures, as indicated in the table. For the custom CNN model trained in this study, the input size of images was determined as 300 × 300. We applied an early stop strategy for epochs to save time during the training process. For this purpose, the patience values were determined based on the validation loss parameter. When the validation loss values converged to a certain value in a series of consecutive iterations determined by the patience value, the learning process was stopped. In other words, the number of epochs was fixed at 50, which means that each model was trained for up to 50 epochs but could be terminated if the patience value was reached before that point. The sparse categorical cross entropy was used as a loss function for all learning models. The Adam optimizer was also used for the learning process of each model.
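The training configuration described above (Adam optimizer, sparse categorical cross-entropy, an upper bound of 50 epochs, and early stopping on validation loss) maps onto a standard Keras callback. The patience value of 7 is an illustrative assumption; Table 2 lists the values actually used per model.

```python
import tensorflow as tf

# Stop training when the validation loss stops improving for `patience`
# consecutive epochs, keeping the weights from the best epoch seen.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=7,                 # assumption; see Table 2 for the values used
    restore_best_weights=True,
)

def train(model, train_data, val_data):
    """Train a compiled model for up to 50 epochs with early stopping."""
    return model.fit(
        train_data,
        validation_data=val_data,
        epochs=50,              # upper bound; early stopping may end training sooner
        callbacks=[early_stop],
    )
```

With `restore_best_weights=True`, the model reported at the end corresponds to the epoch with the lowest validation loss, matching how the best epoch is identified in Table 3.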

3. Results and Discussion

The evaluation results of each learning model investigated in this study are shown in Table 3. The rows of the table correspond to the different models evaluated, and the columns indicate the evaluation criteria used. In addition, the epoch value at which each model achieves its best results is indicated in the rightmost column. The effect of early stopping on model training is also shown by reporting the last epoch at which each model ran before terminating. When the table is examined, it is seen that the custom CNN model produces a higher loss, higher validation loss values, a lower accuracy, a lower validation accuracy, and lower F1-score values than the TL models. Hence, the TL can be regarded as a successful approach for model training on the newly created dataset of nut images. Among the TL approaches, ResNet-50 and EfficientNet-B3 outperform the others, according to the evaluation criteria. In particular, compared to EfficientNet-B3, ResNet-50 is able to produce the same validation accuracy in fewer epochs. Thus, it can be advocated that ResNet-50 requires a shorter execution time for model training. A common aspect of the evaluation results is that all of the models except Inception-V3 finish training before 50 epochs owing to the early stop strategy. Another common aspect is the decreasing trends in loss and validation loss and the increasing trends in accuracy and validation accuracy. The F1-score values for the TL models can also be regarded as acceptable. An F1-score is a measure of overall model performance for the learning models under investigation. It takes values in the range of [0, 1], where higher values reflect the learning capability of the model training and correspond to a better performance of the model.
More specifically, the F1-score can be regarded as the ability of a model to both capture correct samples (recall) and the accuracy of the samples the model captures (precision). The values above 0.8 can be accepted as successful in the existing literature.
The evaluation of the custom CNN model is shown in Figure 5. The figure shows the change in loss, accuracy, validation loss, and validation accuracy with respect to the increasing number of epochs. It is seen from the figure that the validation loss converges to a specific value when the number of epochs increases to 30. Meanwhile, the patience number is reached between epoch values 23 and 30. Therefore, the learning is stopped at epoch 30. The best value for the validation loss is obtained as 1.14 at epoch 23. The values for loss, accuracy, and validation accuracy correspond to 0.92, 0.62, and 0.54 at epoch 23, respectively. The validation accuracy value 0.54 can be interpreted to mean that the custom CNN model produces approximately the same number of correct recognitions and incorrect recognitions for the nut-type classification. Hence, its performance can be said to be far from the successful benchmarks, where accuracy of 90% or more has been achieved in similar studies in the literature. Therefore, it can be claimed that it is an effortful process to find suitable hyperparameters, datasets, etc., in order to achieve satisfactory performance results when custom CNN models are applied.
The evaluation of the four different TL models with respect to the number of epochs is shown in Figure 6. According to the evaluation results, it is clearly seen that all TL approaches produce lower loss values and lower validation loss values than the custom CNN model. They also produce higher accuracy values and higher validation accuracy values. In particular, Inception-V3 stops learning at epoch 50 and produces the best validation loss value of 0.38 at epoch 45. The values of loss, accuracy, and validation accuracy at epoch 45 are 0.08, 0.99, and 0.84, respectively. Thus, the superiority of Inception-V3 over the custom CNN model is clear. The model obtained using MobileNet-V3 stops learning at epoch 40 owing to the early stop strategy applied. The lowest validation loss value is obtained as 0.27 at epoch 31, where loss, accuracy, and validation accuracy values correspond to 0.04, 0.99, and 0.90, respectively. When the validation accuracy is considered, it can be said that MobileNet-V3 is more successful at detecting the test data than not only the custom CNN model but also the Inception-V3 model. Another model trained using EfficientNet-B3 stops learning at epoch 39 and produces the lowest validation loss of 0.12 at epoch 30. The values for loss, accuracy, and validation accuracy are equal to 0.02, 0.99, and 0.96, respectively. Therefore, it can be said that EfficientNet-B3 shows a better performance than the custom CNN, Inception-V3, and MobileNet-V3 for detecting different types of nuts. Similarly, ResNet-50 stops learning at epoch 32 and produces the best validation loss value of 0.10 when the epoch number is equal to 23. The loss, accuracy, and validation accuracy correspond to 0.01, 0.99, and 0.96, respectively, when the number of epochs is 23. The ResNet-50 model produces the same accuracy and validation accuracy as the EfficientNet-B3.
However, it has a lower validation loss value and converges in fewer epochs than EfficientNet-B3. Therefore, ResNet-50 can be regarded as the most successful model among the TL models evaluated in this study.
The best validation loss and the best validation accuracy parameters are also compared across the different TL models in Figure 7. These evaluation metrics are important, as they show the performance of different models in detecting nut types in the test data. In addition, the corresponding F1-score values are calculated and shown in the figure. From the figure, it is clear that the custom CNN model gives very ineffective results compared to the TL models. Even though the performance of the TL models is close, EfficientNet-B3 and ResNet-50 stand out because they produce lower validation loss values, higher validation accuracy values, and higher F1-score values. Moreover, ResNet-50 can be regarded as the best among the TL models, as it produces the lowest validation loss value and one of the highest validation accuracy values. The F1-score calculated for ResNet-50 is again the highest among the models. The F1-score values also validate the performance superiority of the TL models over the custom CNN model.
The evaluation results show that a practical application for classifying nuts can be implemented using appropriate AI models. Learning-based solutions have become prevalent in almost every area of daily life, and new requirements emerge from different problem domains day by day. CNNs and DL are often consulted to develop trustworthy applications. However, finding a successful solution can be time consuming and effortful due to the complex dynamics of such approaches. In particular, searching for suitable hyperparameters from scratch may not be the best option when training a new learning model on another dataset from a similar problem domain. Therefore, adaptive and intelligent strategies should be used to respond to the requirements. The TL can be proposed as a possible solution for training new learning models because it saves considerable time and effort. The basic idea of TL is to reuse the weights of previous successful models and create new models based on the pre-trained ones. In this study, the feasibility of different TL approaches is investigated using a practical application for classifying unpackaged nuts. A newly created dataset is used throughout the evaluations, and a custom CNN model is also trained for comparison. The results clearly show that the TL is a promising candidate for new problems that need to be solved using successful learning models. More specifically, there are well-known TL architectures pre-trained on huge amounts of data; consequently, the TL-based nut-type classification outperforms the solution generated via the custom CNN model. Moreover, the evaluations show that the accuracy obtained by applying TL models approaches its practical upper limits. In other words, selecting a proper TL model is adequate to develop a successful application. Using TL is much less time consuming than seeking out a custom CNN model with optimal hyperparameters. Hence, this study can motivate further research aiming to employ AI in different problem domains.

4. Conclusions

This study focuses on digitizing nut shops by providing a smart application for the classification of unpackaged nuts. For this purpose, known TL architectures, including Inception, MobileNet, EfficientNet, and ResNet, are used, and their performances are investigated to provide a convenient learning model on a newly created dataset. Moreover, a custom CNN model is evaluated on the same dataset. The dataset includes both manually captured images and images retrieved from the internet. Eventually, the abilities of different models to recognize and classify unpackaged nuts are evaluated and discussed. According to the evaluation results, it is clearly seen that TL is an appropriate approach to provide a successful classifier application. The result of the evaluation is also validated through a desktop application developed using the Python programming language. The validation accuracy of the TL models in the application is above 90% for almost all the tests and reaches 96% when ResNet-50 is utilized. As a consequence, the promising results show that TL can be conveniently used to generate AI-based solutions for different problem domains, such as the nut-type classification investigated in this study.
The number of nut types included in the novel dataset is planned to be expanded in future work. In this way, we intend to investigate the effect of the number of class labels in the dataset. We will also produce a payment terminal as kiosk hardware, including mechanical components, to deploy the proposed model in the real world. Moreover, another application for mobile phones or wearable devices will be targeted in the future. To this end, lightweight TL architectures will be investigated.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://drive.google.com/drive/folders/1UeM38G7ev7fEJt71Qs13nfnDQy3MTJt2, accessed on 23 October 2023. The source codes of the application developed using the Python language and the learning models trained are publicly available via this link: https://github.com/hydonmez/Nuts-Classification, accessed on 23 October 2023.

Acknowledgments

This study is supported by the Scientific and Technological Research Council of Türkiye (TÜBİTAK) with the project ID 1919B012300710.

Conflicts of Interest

The author declares no conflict of interest.

Figure 1. Sample images from the dataset.
Figure 2. The working principle of the TL architecture used in this study.
Figure 3. The interaction of development tools used in the evaluation environment.
Figure 4. The running of the application with a test sample.
Figure 5. The change in the evaluation metrics with respect to the number of epochs when the custom CNN model is used.
Figure 6. The changes in the evaluation metrics with respect to the number of epochs when different TL models are used.
Figure 7. Best validation accuracy and best validation loss values obtained with respect to different learning models.
Table 1. The number of samples for each type of nut in the dataset.

| Nut Type | Images Captured Manually in a Nut Shop | Images Obtained from the Internet | Total Number of Images |
|---|---|---|---|
| Pistachio | 193 (77.2%) | 57 (22.8%) | 250 (20%) |
| Peanut | 222 (88.8%) | 28 (11.2%) | 250 (20%) |
| Pumpkin seed | 229 (91.2%) | 21 (8.8%) | 250 (20%) |
| Sunflower seed | 201 (80.4%) | 49 (19.6%) | 250 (20%) |
| Roasted chickpea | 217 (86.8%) | 33 (13.2%) | 250 (20%) |
| Total | 1062 (84.96%) | 188 (15.04%) | 1250 (100%) |
Table 2. Hyperparameters of different models for training.

| Model | Input Size | Patience Value | Number of Epochs | Optimizer Used | Loss Function Used |
|---|---|---|---|---|---|
| CNN | 300 × 300 | 7 | 50 | Adam | Sparse categorical cross-entropy |
| Inception-V3 | 229 × 229 | 10 | 50 | Adam | Sparse categorical cross-entropy |
| MobileNet-V3 | 224 × 224 | 10 | 50 | Adam | Sparse categorical cross-entropy |
| EfficientNet-V2 | 300 × 300 | 10 | 50 | Adam | Sparse categorical cross-entropy |
| ResNet-50 | 300 × 300 | 10 | 50 | Adam | Sparse categorical cross-entropy |
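The patience values in Table 2 correspond to early stopping: training halts once the monitored validation loss has not improved for the given number of epochs, and the best weights are kept. A minimal Keras sketch of this training setup follows; the tiny dense model and random arrays are placeholders standing in for the actual image models and dataset.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the image dataset.
x_train = np.random.rand(64, 8).astype("float32")
y_train = np.random.randint(0, 5, size=64)

# Placeholder model standing in for the CNN / TL classifiers.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Stop when validation loss stops improving for `patience` epochs,
# mirroring the patience values in Table 2.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)

history = model.fit(x_train, y_train, validation_split=0.25,
                    epochs=50, callbacks=[early_stop], verbose=0)
```

With `restore_best_weights=True`, the model reported at the end corresponds to the best epoch rather than the last one, which is why Table 3 distinguishes the two.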
Table 3. Evaluation results of different learning models.

| Model | Loss | Accuracy | Validation Loss | Validation Accuracy | F1-Score | Best Epoch | Last Epoch |
|---|---|---|---|---|---|---|---|
| CNN | 0.92 | 0.62 | 1.14 | 0.54 | 0.398 | 23 | 30 |
| Inception-V3 | 0.08 | 0.99 | 0.38 | 0.84 | 0.781 | 45 | 50 |
| MobileNet-V3 | 0.04 | 0.99 | 0.27 | 0.90 | 0.823 | 31 | 40 |
| EfficientNet-B3 | 0.02 | 0.99 | 0.12 | 0.96 | 0.843 | 30 | 39 |
| ResNet-50 | 0.01 | 1.00 | 0.10 | 0.96 | 0.852 | 23 | 32 |
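The F1-scores in Table 3 can be understood as the macro average of the per-class F1-scores over the five nut types. The following is a generic, self-contained implementation of that metric, not the paper's evaluation code:

```python
import numpy as np

def macro_f1(y_true, y_pred, num_classes):
    """Average the per-class F1-scores, treating each class equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        scores.append(f1)
    return float(np.mean(scores))
```

Macro averaging is a natural fit here because the dataset is perfectly balanced (250 images per class, Table 1), so no class dominates the score.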

Özçevik, Y. A Real-Time Nut-Type Classifier Application Using Transfer Learning. Appl. Sci. 2023, 13, 11644. https://doi.org/10.3390/app132111644
