Article

Artificial Vision-Based Dual CNN Classification of Banana Ripeness and Quality Attributes Using RGB Images

by
Omar Martínez-Mora
1,2,*,
Oscar Capuñay-Uceda
3,
Luis Caucha-Morales
2,
Raúl Sánchez-Ancajima
2,
Iván Ramírez-Morales
1,
Sandra Córdova-Márquez
1 and
Fabián Cuenca-Mayorga
1
1
“Química & Alimentos” Research Group, Facultad de Ciencias Químicas y de la Salud, Universidad Técnica de Machala (UTMACH), Machala 070150, Ecuador
2
Departamento Académico de Matemática Estadística e Informática, Facultad de Ciencias Económicas, Universidad Nacional de Tumbes, Tumbes 24001, Peru
3
“GICDIAC—Grupo de Investigación en Ciencia de Datos, Inteligencia Artificial y Ciberseguridad”, Escuela Profesional de Ingeniería de Sistemas, Universidad Nacional Pedro Ruíz Gallo, Lambayeque 14013, Peru
*
Author to whom correspondence should be addressed.
Processes 2025, 13(7), 1982; https://doi.org/10.3390/pr13071982
Submission received: 17 October 2024 / Revised: 4 June 2025 / Accepted: 13 June 2025 / Published: 23 June 2025
(This article belongs to the Special Issue Innovative Strategies and Applications in Sustainable Food Processing)

Abstract

The accurate classification of banana ripeness is essential for optimising agricultural practices and enhancing food industry processes. This study investigates the classification of banana ripeness using Machine Learning (ML) and Deep Learning (DL) techniques. The dataset consisted of 1565 high-resolution images of bananas captured over a 20-day ripening period using a Canon EOS 90D camera under controlled lighting and background conditions, classified into ‘unripe’, ‘ripe’, and ‘overripe’ categories. The training set consisted of 1398 images (89.33%) and the validation set of 167 images (10.67%), allowing for robust model evaluation. Various models, including Decision Tree, Random Forest, KNN, SVM, CNN, and VGG, were trained and evaluated for ripeness classification. The DL models outperformed the traditional ML algorithms: the CNN and VGG achieved accuracy rates of 90.42% and 89.22%, respectively, surpassing Decision Trees (71.86%), Random Forests (85.63%), KNN (86.83%), and SVMs (89.22%). The study highlights the importance of dataset quality, model selection, and preprocessing techniques in achieving accurate ripeness classification. Practical applications of these results include optimised harvesting practices, enhanced post-harvest handling, improved consumer experience, streamlined supply chain logistics, and automation in sorting systems. These findings confirm the feasibility of deep learning for the automated classification of ripening stages, with significant implications for stakeholders across the banana industry, from farmers to consumers, including reduced post-harvest losses and more efficient supply chains.

1. Introduction

The classification of banana ripeness is essential in the agricultural and food industries due to its significant influence on production, distribution, and consumer satisfaction [1]. The accurate classification of banana ripeness enables farmers to determine the ideal harvesting time [2]. Harvesting bananas at the appropriate ripeness stage ensures superior flavour, texture, and nutritional content, enhancing consumer acceptance and marketability. One of the main challenges in banana ripeness classification lies in the subtle visual transitions between stages, compounded by lighting variability and the presence of spots or defects that may confound algorithms [3]. The precise identification of banana ripeness enables effective post-harvest handling and storage planning. Overripe bananas are more prone to bruising, damage, and spoilage during transportation and storage [4]. Accurate ripeness classification minimises post-harvest losses, boosting profitability for farmers [5]. In the food industry, bananas must be sorted and distributed according to their ripeness to meet consumer preferences and market demands. Categorising bananas into distinct ripeness stages allows distributors and retailers to ensure that consumers have access to bananas at their preferred ripeness level, thereby enhancing consumer satisfaction and marketability. Banana ripeness not only impacts taste but also texture and appearance.
Consumers have varying preferences regarding the ripeness of bananas they prefer to consume [1]. Efficient ripeness classification facilitates better inventory management and supply chain logistics, enabling more accurate demand forecasting and ensuring a consistent supply of bananas at various ripeness stages across the distribution chain [6]. Additionally, consistently delivering high-quality bananas that meet consumer expectations for ripeness enhances the reputation of the brand and may raise the trust and loyalty of consumers and producers alike.
The classification of banana ripeness using Machine Learning (ML) models, i.e., algorithms trained to detect patterns and make predictions from data, represents a significant advancement in agriculture and the agri-food industry, offering gains in accuracy and automation over traditional classification approaches from the harvest to the distribution of this fruit [7]. Recent research has demonstrated notable progress in this field, applying various ML and computer vision techniques to accurately classify bananas by ripeness level; mobile applications based on computer vision have proven effective in this context. For instance, Convolutional Neural Networks (CNNs), deep learning architectures particularly effective in image analysis due to their capacity to learn spatial hierarchies of features, have achieved remarkable precision in banana ripeness classification, reaching up to 98.25% accuracy [8]. Furthermore, dual classification systems combining the RGB colour model (RGB, meaning ‘Red’, ‘Green’, and ‘Blue’, is the standard colour model for digital images) with hyperspectral imaging (HSI, which captures data across a wide range of wavelengths, enabling a detailed analysis of material properties beyond what is visible to the human eye), thus allowing a model to use both visible and non-visible spectral data, have yielded impressive results with an accuracy of 98.4%, emphasising the value of integrating multiple imaging approaches for precise classification [9]. The use of pretrained models and transfer learning has also emerged as an effective strategy in this domain, allowing rapid and accurate classification with up to 98.65% accuracy for various fruits, including bananas [10].
Additionally, parameter optimisation in CNN models has been shown to improve accuracy in the fine-grained classification of banana ripening stages, achieving levels as high as 91.20%, underlining the importance of tuning models to the specific characteristics of the dataset [11]. In the field of non-invasive characterisation, hyperspectral imaging coupled with ML techniques such as Extreme Gradient Boosting (XGBoost) has proven effective in distinguishing naturally ripened bananas from artificially treated ones, achieving an accuracy of 91.16% [12]. Although hyperspectral imaging offers advantages, it was not chosen for this study due to its high cost, complex calibration requirements, and limited accessibility for real-time applications in resource-constrained settings. Moreover, approaches based on image processing and ML have been employed to identify banana ripeness stages; utilising pretrained deep neural networks such as Inception V3, these methods have successfully classified fruit ripening stages, further enhancing the efficiency and accuracy of banana quality assessment [13]. The integration of these advanced techniques not only enhances our understanding of banana ripening but also offers practical solutions for producers, distributors, and consumers alike. The aim of this research is to delve deeper into the physical, chemical, and functional attributes of bananas through the lens of artificial vision techniques, providing comprehensive insights into the multifaceted nature of bananas and their implications for industries such as post-harvesting, packaging, and processing.

2. Materials and Methods

The methodology implemented in this study is organised into five phases, designed to comprehensively address the banana ripeness classification process using advanced ML and image processing techniques.

2.1. Physicochemical Analyses

The laboratory analyses consisted of the determination of the physicochemical characteristics of bananas during biochemical ripening changes. Banana samples were stored at room temperature (26 ± 2 °C) in a controlled environment. For each sample, physicochemical analysis was performed in triplicate at intervals of two days. Acidity was determined by AOAC method 942.15 [14], with results expressed as a percentage of malic acid. Soluble solids were measured using the refractometry method AOAC 932.12 [15], with results expressed in degrees Brix (°Brix). The reported values are the means of three assays. The ripeness index is the ratio of soluble solids to acidity, as indicated in the following equation:
Ripeness index = °Brix/acidity
The banana ripeness index was measured to identify the following stages: unripe (‘un’), ripe (‘ri’), and overripe (‘or’).
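As a check on the arithmetic, the index computation can be sketched in Python. Consistent with the values reported later in this study (2 °Brix at 0.10% acidity gives 20; 27 °Brix at 0.50% acidity gives 54), the index is soluble solids divided by acidity. The stage cut-offs below are illustrative assumptions, not thresholds taken from the study.

```python
def ripeness_index(brix, acidity):
    """Ripeness index: soluble solids (degrees Brix) divided by acidity (%).

    This form matches the reported values: 2 / 0.10 = 20 at day 0
    and 27 / 0.50 = 54 at day 20 in the training set.
    """
    return brix / acidity

def classify_stage(index, ripe_cut=30.0, overripe_cut=48.0):
    """Map an index value to 'un' / 'ri' / 'or'.

    The cut-off values are illustrative assumptions only.
    """
    if index < ripe_cut:
        return "un"
    if index < overripe_cut:
        return "ri"
    return "or"
```

A day-0 training sample (2 °Brix, 0.10% acidity) then maps to ‘un’, while a day-20 sample (27 °Brix, 0.50% acidity) maps to ‘or’ under these assumed cut-offs.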

2.2. Image Acquisition

The image acquisition stage involved collecting a set of relevant images differentiated among the different states of bananas (‘un’, ‘ri’, and ‘or’). An adequate volume of data, segregated into training and validation sets, is essential to ensure the generalisation of the model [16]. A Canon EOS 90D camera (Canon Inc., Tokyo, Japan) with a resolution of 32.5 megapixels was used to capture the images. Photographs were taken at intervals of four hours, resulting in six images per day. The camera was positioned 1 m from the subject against a white background, and a tripod was used to maintain a vertical distance of 40 cm between the subject and the lens. Figure 1 depicts the camera setup used to capture images of bananas at different stages.

2.3. Image Retrieving and Processing

In the image processing stage, a total of 1565 images were obtained. The analysis of high-resolution images imposed significant requirements on the processing and memory capacity of the hardware systems used. The use of colour images adds an additional layer of complexity as it was necessary to store and process the 3 dimensions corresponding to the colour channels (‘red’, ‘green’, and ‘blue’) [17]. To mitigate these challenges, a preprocessing strategy was implemented, consisting of reducing the size of the images to dimensions smaller than 250 × 250 pixels. This action alleviated computational load and converted the images into three-dimensional arrays, facilitating the subsequent separation of features and labels in the dataset [18]. This dimensionality reduction approach is a critical step in image processing, enabling more efficient manipulation without significantly compromising the quality of the information they contain. Furthermore, the conversion of images into 3D arrays and the distinction between features and labels are essential for the application of Machine Learning techniques, providing a suitable format for model training [19]. These procedures constitute an essential foundation for data preprocessing in the field of image analysis through Machine Learning, optimising the use of computational resources and improving the effectiveness of data processing. The Python 3.9 programming language was used for data processing. Specifically, data manipulation and analysis using the NumPy v1.23.5 and Pandas v1.5.3 libraries were applied. ML algorithms were developed and executed using these libraries, while computer vision tasks were performed with the OpenCV v4.7.0 library. Advanced preprocessing techniques were applied, including dimensionality reduction, conversion to three-dimensional arrays, and data normalisation. 
These operations are essential to prepare images for both model training and validation, facilitating better interpretation by ML algorithms [20].
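The resize-and-normalise pipeline described above can be sketched as follows. The study used OpenCV (cv2.resize) for this step; to keep the sketch dependency-light, nearest-neighbour row/column selection stands in for the resize, and the 224 × 224 target size is an assumption within the stated sub-250 × 250 bound.

```python
import numpy as np

def preprocess(image, size=224):
    """Downsample an RGB image to (size, size, 3) and scale pixels to [0, 1].

    Nearest-neighbour index selection stands in for cv2.resize here;
    the target size of 224 is an assumption (the study used < 250 x 250).
    """
    h, w, _ = image.shape
    rows = np.arange(size) * h // size   # source row for each target row
    cols = np.arange(size) * w // size   # source column for each target column
    small = image[rows][:, cols]
    return small.astype(np.float32) / 255.0

# A synthetic RGB frame stands in for a camera capture.
frame = np.random.randint(0, 256, size=(1080, 1920, 3), dtype=np.uint8)
x = preprocess(frame)
```

The output is a three-dimensional float array in [0, 1], the format the source describes as suitable for separating features and labels before model training.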

2.4. Model Training

During the model training stage, various ML algorithms were trained on the training set: Decision Tree, Random Forest, K-Nearest Neighbours (KNN), Support Vector Machine (SVM), Convolutional Neural Network (CNN), and VGG19. The objective was to evaluate and compare their performance to identify the most promising model for the task at hand. This process is essential to determine the optimal architecture that maximises classification accuracy [21]. Out of the total of 1565 images obtained, 1398 (89.33%) were used for the training stage.
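A minimal sketch of this comparative training step for the four classical models, using scikit-learn with synthetic stand-in features (the real inputs were flattened image arrays); all hyperparameters here are illustrative, not the study's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for flattened image features: three classes with
# shifted means so they are separable, 100 samples per class.
X = np.vstack([rng.normal(loc=c, size=(100, 64)) for c in (0.0, 1.0, 2.0)])
y = np.repeat(["un", "ri", "or"], 100)
# Roughly the study's 89.33% / 10.67% train/validation split.
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.1067, random_state=0, stratify=y)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}
accuracies = {name: m.fit(X_tr, y_tr).score(X_va, y_va)
              for name, m in models.items()}
```

Each model is fitted on the training split and scored on the held-out validation split, mirroring the comparative evaluation reported in Table 5.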

2.5. Model Assessment and Validation

The evaluation of the model was carried out using the validation set once the training process was complete. The images from this set were fed into the model to predict and catalogue bananas into the ‘un’, ‘ri’, and ‘or’ categories. The accuracy of the model, once computed, provides a quantitative assessment of its ability to make correct classifications [22]. Out of the total of 1565 images obtained, 167 (10.67%) were used for the validation (testing) stage. The input, output, and response variables entered into the prediction algorithm are detailed in Table 1.
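Accuracy as used throughout this study, i.e., the proportion of correct predictions over all cases, reduces to a one-line computation; the labels below are toy values, not the study's predictions.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Proportion of predictions matching the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Toy validation run over the three stage labels: 4 of 6 correct.
truth = ["un", "un", "ri", "or", "or", "or"]
preds = ["un", "ri", "ri", "or", "or", "un"]
acc = accuracy(truth, preds)
```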

3. Results

3.1. Physicochemical Analyses

From the harvesting of the fruit (counted as day 0 for the purposes of this study) until day 8, a green colour dominated the banana peel, indicating the unripe stage. According to the results of this research, the banana was considered ripe from day 10 to day 16, during which the best sensory attributes, such as aroma, flavour, texture, and sweetness, were exhibited; one clear indicator of this stage is that the peel turns predominantly yellow. Overripeness began on day 18, and the physicochemical characteristics were measured until day 20. Beyond this time, bananas no longer presented desirable attributes for consumption as fresh fruit. The analysis showed that acidity increased from 0.10% to 0.50% during the 20-day period in the model training set and from 0.12% to 0.52% in the model validation set. Soluble solids increased from 2 °Brix to 27 °Brix in the training set and from 1 °Brix to 29 °Brix in the validation set. Correspondingly, the ripeness index values ranged from 20 to 54 in the training set and from 8 to 56 in the validation set. The stages of ripeness were categorised as unripe (‘un’), ripe (‘ri’), and overripe (‘or’), with transitions occurring around days 10–12 for ripeness and days 16–18 for overripeness in both sets.
The physicochemical results of the bananas used for the training and validation stages are presented in Table 2 and Table 3, respectively.

3.2. Image Processing

The image acquisition and processing stages yielded a total of 1565 high-resolution images of bananas, categorised into three ripeness stages: unripe (‘un’), ripe (‘ri’), and overripe (‘or’). For the model training stage, 1398 images (89.33% of the total) were used, distributed as 537 unripe, 372 ripe, and 489 overripe bananas. During the model validation stage, 167 images (10.67% of the total) were utilised, comprising 97 unripe, 18 ripe, and 52 overripe bananas. The evaluation of the Machine Learning algorithms demonstrated the ability of the model to accurately classify the ripeness stages, with the accuracy and performance metrics indicating the effectiveness of the image processing and model training approaches employed. The detailed distribution of the dataset across the training and validation stages is presented in Table 4, highlighting the consistency in data segregation and ensuring robust model generalisation. Figure 1 depicts a collage of images of bananas in the natural ripening process.

3.3. Model Validation

The evaluation of ML and DL algorithms widely applied in previous research facilitated the preliminary selection of specific methods for detailed inspection. The evaluated algorithms included Decision Tree (DT), Random Forest (RF), K-Nearest Neighbours (KNN), Support Vector Machine (SVM), Convolutional Neural Network (CNN), and VGG. The selection was based on their relevance and demonstrated effectiveness in previous works in the field [19,23]. Subsequently, an initial performance evaluation of these selected algorithms was carried out to identify the algorithm with the highest optimisation potential based on specific performance criteria such as accuracy, sensitivity, and specificity [24]. Table 5 provides the accuracy of each ML and DL algorithm applied to the dataset. The CNN and VGG achieved accuracy rates of 90.42% and 89.22%, respectively, with standard deviations of ±1.4% and ±1.6%, indicating consistency across folds. Accuracy, defined as the proportion of correct predictions over the total number of cases, is a standard tool for evaluating classification models [25].
The DT model achieved an accuracy of 71.86%, which is relatively low compared to the other methods. DTs are simple and interpretable models but often suffer from overfitting and do not generalise well to unseen data [26]. With an accuracy of 85.63%, Random Forests (RFs) performed significantly better than DTs. This is expected, since RFs are ensembles of Decision Trees designed to improve robustness and reduce overfitting by combining predictions from multiple trees [23]. The KNN algorithm showed slightly higher accuracy than the RF, at 86.83%. This suggests that, for the dataset in question, classification based on proximity to the nearest neighbours is effective; KNN performance can, however, vary significantly with the choice of the number of neighbours and the distance metric used [27]. The SVM showed an accuracy of 89.22%, indicating a robust ability to find the optimal hyperplane separating classes in feature space, especially in high-dimensional cases [28]. With the highest accuracy of 90.42%, the CNN demonstrated strength in classification tasks, likely due to its ability to learn feature hierarchies through multiple layers of convolution and pooling, which is particularly powerful for visual and spatially structured data [29]. The VGG algorithm matched the SVM with an accuracy of 89.22%. VGG is a deep neural network architecture known for its simplicity and depth, allowing it to capture complex features; its performance here suggests that it is competitive with the SVM for this particular dataset [30]. Overall, DL techniques such as the CNN and VGG outperformed the more traditional ML methods. However, accuracy should not be the only performance metric; it should be complemented with others such as the confusion matrix, precision, recall, and F1 score for a comprehensive evaluation of model performance.
Additionally, model interpretability, training time, and generalisation capability are important aspects to consider in the final algorithm selection for practical applications [31]. Figure 2 and Figure 3 illustrate the evolution of accuracy during the training and validation of the Convolutional Neural Network (CNN) algorithm over successive epochs of the learning process. Epochs represent complete iterations of the training dataset through the neural network. Both the training accuracy (blue line) and validation accuracy (red line) experience significant improvement in the early epochs. This is typical of the initial phase of neural network training, where the model quickly learns the general features of the dataset [19]. The blue line representing training accuracy quickly reached a high level and remained relatively constant, close to 100%. This suggests that the model efficiently learned the features of the training dataset. However, very high training accuracy can be an indicator of overfitting, where the model learns specific patterns of the training dataset that do not generalise well to new data [32].
There was a notable gap between training accuracy and validation accuracy, which persisted over time. This gap may indicate overfitting, as the model seems to fail when generalising to unseen data. Ideally, the lines of training and validation accuracy would lie closer together, indicating good generalisation of the model [33]. Validation accuracy initially increased but then stabilised around 85–90%. No significant improvement was detected after around 10 epochs, suggesting that continuing training beyond this point is unlikely to yield further accuracy gains and could even lead to increased overfitting if training accuracy continues to rise while validation accuracy plateaus or decreases [34]. The confusion matrix is a fundamental tool in analysing the effectiveness of classification models, providing a simple visual inspection of prediction accuracy, as well as the type and quantity of errors made by the classifier [35]. In the case of a Convolutional Neural Network (CNN) model for classification, the matrix provides valuable information on how the model distinguishes between different classes. The confusion matrix obtained, a practical tool for understanding the effectiveness and potential areas for improvement of the CNN model in this context, is depicted in Figure 4.
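The confusion-matrix bookkeeping described above can be sketched in a few lines of NumPy, together with the per-class recall used to discuss class-level performance; the labels and counts below are toy values, not the matrix in Figure 4.

```python
import numpy as np

LABELS = ["un", "ri", "or"]

def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    index = {lab: i for i, lab in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm

def per_class_recall(cm):
    """Diagonal count divided by the row total for each class."""
    return cm.diagonal() / cm.sum(axis=1)

# Toy example, not the study's matrix.
truth = ["un"] * 5 + ["ri"] * 3 + ["or"] * 4
preds = ["un", "un", "un", "ri", "or", "ri", "un", "ri", "or", "or", "un", "or"]
cm = confusion_matrix(truth, preds)
recall = per_class_recall(cm)
```

Off-diagonal entries count misclassifications by type, which is exactly the information used in the Discussion to identify the ‘un’/‘or’ confusion.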

4. Discussion

In the confusion matrix, the classes were identified with the labels ‘un’, ‘ri’, and ‘or’. The matrix shows the true labels on the y-axis and the predicted labels on the x-axis. The values on the main diagonal (45 for ‘un’, 4 for ‘ri’, and 11 for ‘or’) represent correct predictions, i.e., cases where the true label and the predicted label were equal. Values off the main diagonal represent incorrect classifications; e.g., the model predicted the ‘ri’ class 18 times when the true class was ‘un’ and predicted the ‘un’ class 29 times when the true class was ‘or’. When evaluating the accuracy per class, the ‘un’ class had the largest number of correct predictions (45 out of a total of 97 for this class). The ‘ri’ class had a very low accuracy rate (only four correct), suggesting that the model struggled to identify this class. The ‘or’ class had a moderate accuracy rate (11 correct out of a total of 52 for this class). Concerning the common errors identified, there was considerable confusion between ‘un’ and ‘or’, with 29 misclassified instances, which could indicate similarity in the features learned by the model for these two classes. The ‘ri’ class was often confused with the other two classes, suggesting either that the model underperformed when learning the distinctive features of ‘ri’ or that there were not enough representative examples of ‘ri’ in the training set. One likely cause is overfitting: when training is carried out with a limited or insufficiently diverse dataset, the model may overfit to specific classes and fail to generalise well. Class imbalance, where some classes have considerably more instances than others, may also bias the model towards the more frequent classes. To mitigate class imbalance, future studies should explore resampling techniques such as SMOTE, undersampling, or class weighting during model training.
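One of the mitigation options mentioned, class weighting, amounts to scaling each class's contribution to the loss inversely to its frequency. A sketch of the standard ‘balanced’ weighting scheme, using the training-set counts reported earlier (537 ‘un’, 372 ‘ri’, 489 ‘or’):

```python
import numpy as np

def balanced_class_weights(counts):
    """'Balanced' weights: n_samples / (n_classes * count_per_class),
    so under-represented classes weigh more during training."""
    counts = np.asarray(counts, dtype=float)
    return counts.sum() / (len(counts) * counts)

# Training-set counts reported in the study, in the order 'un', 'ri', 'or'.
weights = balanced_class_weights([537, 372, 489])
```

The under-represented ‘ri’ class receives the largest weight (about 1.25, versus roughly 0.87 for ‘un’ and 0.95 for ‘or’), which is the intended corrective effect.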
Regarding feature similarity, if the ‘un’ and ‘or’ classes are inherently similar in the features the model learned, more sophisticated feature engineering or a different network architecture capable of capturing more subtle differences may be required. To address these issues, certain actions may be taken, such as reviewing the balance of classes in the training set, applying data augmentation techniques to improve model generalisation, and refining the network architecture or hyperparameters to improve the distinction between similar classes. The results of this study demonstrate the effectiveness of various ML models in accurately classifying banana ripeness. The highest accuracy achieved by the CNN and VGG models suggests that DL techniques outperformed traditional Machine Learning algorithms for this task, in line with the results of previous studies [36,37]. The accuracy of ripeness classification varied among the models tested, with the CNN and VGG achieving the highest rates. This indicates that the complex features learned by the DL models contribute to a more accurate classification of banana ripeness compared to simpler models like Decision Trees and KNN, as previously stated in other studies [38]. The findings of this study align with previous research demonstrating the effectiveness of ML and DL techniques for fruit ripeness classification; high accuracy rates using CNNs and other advanced models have been reported previously [8,9,10], highlighting the importance of dataset quality and model selection. The consistent improvement in classification accuracy with CNN models across multiple studies underscores the important role of advanced architectures and data augmentation in handling the complexity of ripeness stages.
Recent research achieving accuracies close to 90% further validates the practical applicability of these models for real-time banana ripeness detection, which is vital for reducing post-harvest losses and optimising supply chain management [39,40].

5. Conclusions

It was observed that the CNN and VGG architectures outperformed traditional ML algorithms, achieving high accuracy rates in ripeness classification and significantly improving harvesting practices, e.g., determining the precise timing for harvesting, ensuring optimal fruit quality, and minimising post-harvest losses; they also enhanced the overall consumer experience and satisfaction with banana products. The high accuracy of the CNN and VGG models opens possibilities for automation in sorting systems: these models can be integrated into sorting machines to classify bananas automatically, increasing efficiency and reducing labour costs. The trained CNN and VGG models can also be deployed in mobile applications (‘apps’) to provide real-time ripeness assessments, allowing users to make informed decisions about harvesting, storage, and distribution. Although hyperspectral images were not used in this study, future work should consider integrating HSI. By offering bananas at different ripeness stages, retailers can meet diverse consumer preferences and minimise waste. The results of this study have practical implications for various stakeholders in the banana industry, from farmers to consumers. By leveraging advanced ML and DL techniques, the classification of banana ripeness can be optimised, leading to improved agricultural practices, enhanced product quality, and more efficient supply chain management.

Author Contributions

Conceptualisation, O.M.-M. and S.C.-M.; Data Curation, S.C.-M. and F.C.-M.; Formal Analysis, R.S.-A.; Funding Acquisition, O.M.-M.; Investigation, O.C.-U., L.C.-M., and R.S.-A.; Methodology, O.M.-M., O.C.-U., L.C.-M., R.S.-A., I.R.-M., S.C.-M., and F.C.-M.; Project Administration, O.M.-M.; Resources, O.M.-M.; Software, O.C.-U., L.C.-M., and R.S.-A.; Validation, R.S.-A.; Visualisation, O.M.-M.; Writing—Original Draft, O.M.-M., O.C.-U., L.C.-M., and R.S.-A.; Writing—Review and Editing, S.C.-M. and F.C.-M. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the Technical University of Machala (UTMACH), through its Office of Research, for the financial support provided to carry out this research.

Data Availability Statement

The data that support the findings of this study are available upon request from the corresponding author. The data are not publicly available due to privacy restrictions.

Acknowledgments

The authors would like to express sincere gratitude to the Technical University of Machala (UTMACH), Ecuador, for providing the essential funding required for the development of this research. Additionally, the authors extend their thanks to the National University of Tumbes, Peru, for their invaluable technical advice and guidance. Their combined support was key in facilitating our work and enabling us to achieve our objectives.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Symmank, C.; Zahn, S.; Rohm, H. Visually suboptimal bananas: How ripeness affects consumer expectation and perception. Appetite 2018, 120, 472–481. [Google Scholar] [CrossRef] [PubMed]
  2. Mazen, F.; Nashat, A. Ripeness classification of bananas using an artificial neural network. Arab. J. Sci. Eng. 2019, 44, 6901–6910. [Google Scholar] [CrossRef]
  3. Arvanitoyannis, I.; Mavromatis, A. Banana cultivars, cultivation practices, and physicochemical properties. Crit. Rev. Food Sci. Nutr. 2009, 49, 113–135. [Google Scholar] [CrossRef]
  4. Pathare, P.; Al-Dairi, M. Effect of mechanical damage on the quality characteristics of banana fruits during short-term storage. Discov. Food 2022, 2, 4. [Google Scholar] [CrossRef]
  5. Singh, A.; Vaidya, G.; Jagota, V.; Darko, D.A.; Agarwal, R.K.; Debnath, S.; Potrich, E. Recent advancement in postharvest loss mitigation and quality management of fruits and vegetables using machine learning frameworks. J. Food Qual. 2022, 2022, 6447282. [Google Scholar] [CrossRef]
  6. Xie, R.; Zhang, Y.; Luo, H.; Yu, P.; Chen, Z. Optimizing decisions for post-harvest ripening agricultural produce supply chain management: A dynamic quality-based model. Int. Trans. Oper. Res. 2023, 30, 3625–3653. [Google Scholar] [CrossRef]
  7. Almeyda, E.; Ipanaqué, W. Recent developments of artificial intelligence for banana: Application areas, learning algorithms, and future challenges. Eng. Agríc. 2022, 42, e20210144. [Google Scholar] [CrossRef]
  8. Mohamedon, M.F.; Rahman, F.A.; Mohamad, S.Y.; Khalifa, O.O. Banana Ripeness Classification Using Computer Vision-based Mobile Application. In Proceedings of the 8th International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia, 22–23 June 2021; pp. 1–6. [Google Scholar]
  9. Raghavendra, S.; Souvik, G.P.T.; Mitali Madhusmita, N.; Sushovan, C.; Randell, U.E.; Isaac, O. Deep Learning Based Dual Channel Banana Grading System Using Convolution Neural Network. J. Food Qual. 2022, 2022, 6050284. [Google Scholar] [CrossRef]
  10. Seyed-Hassan, M.A.; Shima, J.; Mehrdad, J.; Martynenko, A.; Verbeek, F.J. Detection of Mulberry Ripeness Stages Using Deep Learning Models. IEEE Access 2021, 9, 100380–100394. [Google Scholar]
  11. Cahya, Z.; Cahya, D.; Nugroho, T.; Zuhri, A.; Agusta, W. CNN Model with Parameter Optimisation for Fine-Grained Banana Ripening Stage Classification. In Proceedings of the IC3INA ‘22, Virtual, 22–23 November 2022; pp. 90–94. [Google Scholar]
  12. Weiwen, H.; Hongyuan, H.; Wang, F.; Wang, S.; Li, R.; Chang, J.; Li, C. Rapid and Uninvasive Characterization of Bananas by Hyperspectral Imaging with Extreme Gradient Boosting (XGBoost). Anal. Lett. 2022, 55, 620–633. [Google Scholar]
  13. Wankhade, M.; Hore, U.W. Banana Ripeness Classification Based on Image Processing with Machine Learning. IJARSCT 2021, 6, 1390–1398. [Google Scholar] [CrossRef]
  14. AOAC International. Official Method 942.15: Acidity (Titratable) of Fruit Products. In Official Methods of Analysis of AOAC International; AOAC International: Rockville, MD, USA, 2023. [Google Scholar]
  15. AOAC International. Official Method 932.12: Solids (Soluble) in Fruits and Fruit Products—Refractometer Method. In Official Methods of Analysis of AOAC International; AOAC International: Rockville, MD, USA; First Action 1932, Final Action 1980. [Google Scholar]
  16. Zhu, X.; Goldberg, A.B. Introduction to semi-supervised learning. Synth. Lect. Artif. Intell. Mach. Learn. 2009, 3, 1–130. [Google Scholar]
  17. Gonzalez, R.; Woods, R. Digital Image Processing, 3rd ed.; Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2008; pp. 400–405. [Google Scholar]
  18. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning, 1st ed.; The MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  19. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  20. Chhikara, P.; Jain, N.; Tekchandani, R.; Kumar, N. Data dimensionality reduction techniques for Industry 4.0: Research results, challenges, and future research directions. Soft. Pract. Exper. 2022, 52, 658–688. [Google Scholar] [CrossRef]
  21. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 1, 1097–1105. [Google Scholar] [CrossRef]
  22. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
  23. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  24. Kohavi, R.; Provost, F. Glossary of terms. Mach. Learn. 1998, 30, 271–274. [Google Scholar]
  25. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437. [Google Scholar] [CrossRef]
  26. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef]
  27. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
  28. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  29. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  30. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  31. Domingos, P. A few useful things to know about machine learning. Commun. ACM 2012, 55, 78–87. [Google Scholar] [CrossRef]
  32. Hawkins, D.M. The problem of overfitting. J. Chem. Inf. Comput. Sci. 2004, 44, 1–12. [Google Scholar] [CrossRef]
  33. Kawaguchi, K.; Kaelbling, L.P.; Bengio, Y. Generalization in deep learning. arXiv 2017, arXiv:1710.05468. [Google Scholar]
  34. Prechelt, L. Early Stopping—But When? In Neural Networks: Tricks of the Trade; Springer: Heidelberg, Germany, 1998; pp. 53–67. [Google Scholar]
  35. Deng, X.; Liu, Q.; Deng, Y.; Mahadevan, S. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Inf. Sci. 2016, 340, 250–261. [Google Scholar] [CrossRef]
  36. Paterakis, N.; Mocanu, E.; Gibescu, M.; Stappers, B.; van Alst, W. Deep learning versus traditional machine learning methods for aggregated energy demand prediction. In Proceedings of the IEEE PES ISGT-Europe, Turin, Italy, 26–29 September 2017; pp. 1–6. [Google Scholar]
  37. Rahimzad, M.; Moghaddam Nia, A.; Zolfonoon, H.; Soltani, J.; Danandeh Mehr, A.; Kwon, H. Performance comparison of an LSTM-based deep learning model versus conventional machine learning algorithms for streamflow forecasting. Water Resour. Manag. 2021, 35, 4167–4187. [Google Scholar] [CrossRef]
  38. Shuprajhaa, T.; Raj, J.M.; Paramasivam, S.K.; Sheeba, K.N.; Uma, S. Deep learning based intelligent identification system for ripening stages of banana. Postharvest Biol. Technol. 2023, 203, 112410. [Google Scholar] [CrossRef]
  39. Francisco, R.S.; Pedroso, G.S.G.; Ventura, T.M. Multi-Class CNN Models for Banana Ripeness Classification. J. Syst. Eng. Inf. Technol. 2025, 4, 1. [Google Scholar]
  40. Arunima, P.L.; Gopinath, P.P.; Lekshmi, P.R.G.; Esakkimuthu, M. Digital Assessment of Post-Harvest Nendran Banana for Faster Grading: CNN-Based Ripeness Classification Model. Postharvest Biol. Technol. 2024, 214, 112972. [Google Scholar] [CrossRef]
Figure 1. Camera setup to capture snapshots of bananas at different stages.
Figure 2. Stages of banana ripeness captured through the image acquisition process. (a) Days 0–8, unripe stage (‘un’); (b) days 10–16, ripe stage (‘ri’); (c) days 18–20, overripe stage (‘or’).
Figure 3. Classification accuracy over the 25 training epochs of the CNN.
Figure 4. The confusion matrix of the model obtained with the CNN algorithm.
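To illustrate how a confusion matrix such as the one in Figure 4 is read, the sketch below derives overall accuracy and per-class recall from a small 3 × 3 matrix over the three ripeness classes. The counts are invented for illustration, not the paper's results.

```python
# Rows are true classes, columns are predicted classes, in the order
# ('un', 'ri', 'or'); the counts below are hypothetical.
labels = ["un", "ri", "or"]
cm = [
    [60, 3, 1],   # true 'un'
    [2, 41, 1],   # true 'ri'
    [1, 2, 56],   # true 'or'
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(labels)))
overall_accuracy = correct / total

# Per-class recall: the diagonal count divided by the row (true-class) total.
recall = {labels[i]: cm[i][i] / sum(cm[i]) for i in range(len(labels))}

print(f"overall accuracy: {overall_accuracy:.4f}")
for name, r in recall.items():
    print(f"recall '{name}': {r:.4f}")
```

Reading off-diagonal cells in the same way (column total minus diagonal over column total) would give per-class false-discovery rates.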
Table 1. Definitions of response variables.
| Variable | Definition |
| --- | --- |
| Xtrain | Physicochemical parameters for training (acidity, °Brix, and ripeness index) |
| Xtest | Physicochemical parameters for validation (acidity, °Brix, and ripeness index) |
| Ytrain | Banana ripeness status (‘un’, ‘ri’, and ‘or’) in training |
| Ytest | Banana ripeness status (‘un’, ‘ri’, and ‘or’) in validation |
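As an illustration of the variables defined in Table 1, a minimal sketch of how the feature matrix Xtrain and label vector Ytrain could be assembled; the three example rows are hypothetical selections taken from days 0, 10 and 18 of Table 2.

```python
# Each row of Xtrain holds one day's measurements:
# [acidity (%), soluble solids (°Brix), ripeness index].
Xtrain = [
    [0.10, 2, 20],   # day 0  -> unripe
    [0.39, 14, 36],  # day 10 -> ripe
    [0.48, 25, 52],  # day 18 -> overripe
]
Ytrain = ["un", "ri", "or"]  # one ripeness label per feature vector

assert len(Xtrain) == len(Ytrain)  # one label per sample
print(f"{len(Xtrain)} samples, {len(Xtrain[0])} features each")
```

Xtest and Ytest would be built the same way from the validation measurements in Table 3.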
Table 2. Physicochemical parameters determined in bananas for model training.
| Parameter | D0 | D2 | D4 | D6 | D8 | D10 | D12 | D14 | D16 | D18 | D20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acidity (%) | 0.10 (0.03) | 0.20 (0.04) | 0.25 (0.01) | 0.30 (0.03) | 0.33 (0.01) | 0.39 (0.04) | 0.40 (0.02) | 0.43 (0.02) | 0.45 (0.03) | 0.48 (0.02) | 0.50 (0.01) |
| Soluble solids (°Brix) | 2 (0.01) | 5 (0.01) | 7 (0.02) | 9 (0.01) | 11 (0.03) | 14 (0.02) | 15 (0.01) | 17 (0.01) | 20 (0.03) | 25 (0.04) | 27 (0.01) |
| Ripeness index | 20 | 25 | 28 | 30 | 33 | 36 | 38 | 40 | 47 | 52 | 54 |
| Stage | un | un | un | un | un | ri | ri | ri | ri | or | or |

The reported values are means of three assays; the standard deviation of each datum is given in parentheses. ‘un’, unripe; ‘ri’, ripe; ‘or’, overripe.
Table 3. Physicochemical parameters determined in bananas for model validation.
| Parameter | D0 | D2 | D4 | D6 | D8 | D10 | D12 | D14 | D16 | D18 | D20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acidity (%) | 0.12 (0.02) | 0.24 (0.01) | 0.25 (0.03) | 0.28 (0.03) | 0.37 (0.01) | 0.38 (0.05) | 0.42 (0.02) | 0.44 (0.07) | 0.47 (0.02) | 0.48 (0.02) | 0.52 (0.05) |
| Soluble solids (°Brix) | 1 (0.03) | 4 (0.02) | 6 (0.02) | 10 (0.01) | 14 (0.05) | 16 (0.02) | 18 (0.07) | 20 (0.01) | 23 (0.07) | 26 (0.09) | 29 (0.04) |
| Ripeness index | 8 | 17 | 24 | 35 | 37 | 42 | 43 | 45 | 49 | 54 | 56 |
| Stage | un | un | un | un | ri | ri | ri | ri | ri | or | or |

The reported values are means of three assays; the standard deviation of each datum is given in parentheses. ‘un’, unripe; ‘ri’, ripe; ‘or’, overripe.
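The stage rows of Tables 2 and 3 follow a simple day-to-label schedule; for the training set it matches Figure 2 (days 0–8 unripe, 10–16 ripe, 18–20 overripe). A minimal sketch of that mapping:

```python
def stage_for_day(day: int) -> str:
    """Map a sampling day (0-20) to the training-set ripeness label (Table 2)."""
    if day <= 8:
        return "un"   # unripe
    if day <= 16:
        return "ri"   # ripe
    return "or"       # overripe

# Reproduce the stage row of Table 2 for the sampled days D0, D2, ..., D20.
print([stage_for_day(d) for d in range(0, 21, 2)])
```

Note that the validation batch (Table 3) ripened slightly faster, reaching the ripe stage by day 8, so its schedule differs by one sampling day.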
Table 4. Dataset of snapshots used in model training and model validation stages distributed according to their categorisation.
| Repository | ‘un’ | ‘ri’ | ‘or’ | Total | Percentage (%) |
| --- | --- | --- | --- | --- | --- |
| Training | 537 | 372 | 489 | 1398 | 89.33 |
| Validation | 97 | 18 | 42 | 167 | 10.67 |
| Total | — | — | — | 1565 | 100 |
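The percentages in Table 4 follow directly from the image counts; a quick check of the reported split:

```python
# Image counts reported for the training and validation repositories.
n_train, n_val = 1398, 167
n_total = n_train + n_val  # 1565 images overall

pct_train = 100 * n_train / n_total
pct_val = 100 * n_val / n_total
print(f"training: {pct_train:.2f}%, validation: {pct_val:.2f}%")
```

This reproduces the 89.33% / 10.67% split stated in the abstract.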
Table 5. Accuracy obtained by each model.
| Model | Accuracy (%) |
| --- | --- |
| DT | 71.86 |
| RF | 85.63 |
| KNN | 86.83 |
| SVM | 89.22 |
| CNN | 90.42 |
| VGG | 89.22 |
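The comparison in Table 5 can be summarised programmatically; a trivial sketch ranking the six models by their reported accuracy:

```python
# Reported validation accuracies (%) from Table 5.
accuracies = {
    "DT": 71.86, "RF": 85.63, "KNN": 86.83,
    "SVM": 89.22, "CNN": 90.42, "VGG": 89.22,
}

# Rank best to worst; Python's sort is stable, so the SVM/VGG tie
# keeps insertion order.
ranking = sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True)
best_model, best_acc = ranking[0]
print(f"best: {best_model} ({best_acc:.2f}%)")
```

Running this confirms the CNN as the top performer, 1.20 percentage points ahead of the tied SVM and VGG models.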