Non-Destructive Mango Quality Prediction Using Machine Learning Algorithms

Muzamal, Muhmmad; Hussain, Manzoor; De Wibowo, Aryo

doi:10.3390/engproc2025107116

Open Access Proceeding Paper

Non-Destructive Mango Quality Prediction Using Machine Learning Algorithms †

by Muhmmad Muzamal ¹,*, Manzoor Hussain ² and Aryo De Wibowo ³

¹ Department of Software Engineering, University of Sialkot, Sialkot 51040, Pakistan

² Department of Computer Science, Indus University, Karachi 75300, Pakistan

³ Department of Electrical Engineering, Nusa Putra University, Sukabumi 43152, West Java, Indonesia

* Author to whom correspondence should be addressed.

† Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.

Eng. Proc. 2025, 107(1), 116; https://doi.org/10.3390/engproc2025107116

Published: 26 September 2025

(This article belongs to the Proceedings of The 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society)

Abstract

The quality of mangoes is a crucial factor in both domestic and commercial markets and directly influences consumer satisfaction and economic value. Traditional methods of checking mango quality often involve destructive techniques, which lead to the loss of the fruit in the testing process. This study presents a non-destructive approach that uses machine learning algorithms to predict quality parameters such as ripeness, sweetness and overall freshness without damaging the fruit. In this research, a dataset consisting of various mango samples was collected, with attributes including color, texture, size, weight and acidity levels. Sensors, such as pH sensors (for acidity) and e-nose sensors (for aroma and sweetness detection), were used to gather data, and four machine learning models, Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes and Automated Multi-Layer Perceptron (AutoMLP), were applied to predict the mangoes’ quality. The accuracy of each model was measured based on its ability to classify mangoes as fresh, ripe, or rotten. The results show that the AutoMLP model outperforms the traditional models, achieving an accuracy of 98.46%, making it the most suitable model for mango quality prediction. The research highlights the significance of feature extraction methods, model optimization, and sensor data preprocessing in reaching a high prediction accuracy.

Keywords:

mango quality; non-destructive testing; machine learning; ripeness prediction; sensor technology; pH sensor; electronic nose (e-nose); AutoMLP; feature extraction; classification accuracy

1. Introduction

The increasing global demand for high-quality agricultural produce has necessitated the adoption of innovative methods to ensure product integrity, reduce waste, and optimize supply chains. Mangoes, recognized as the “king of fruits,” hold a prominent position in the global fruit market due to their rich flavor, nutritional benefits, and economic value. However, maintaining mango quality throughout the supply chain remains a significant challenge, primarily because of their perishable nature and sensitivity to environmental conditions. Traditional methods for assessing mango quality rely heavily on destructive testing techniques, including physical slicing, chemical analysis, and visual inspection. These methods, while accurate, are not scalable for large-scale operations and often lead to the significant wastage of produce. Additionally, manual inspections are prone to human error and subjectivity, resulting in inconsistencies in quality grading. This necessitates the development of non-destructive, objective, and automated systems for fruit quality assessment. Combining machine learning (ML) with sensor technology has become a viable way to tackle these issues in recent years. Nondestructive sensors, such as electronic noses (e-nose) for detecting sweetness, pH meters for measuring acidity, and color sensors for analyzing visual attributes, provide a wealth of data that can be leveraged to predict fruit quality accurately. Machine learning models have demonstrated their ability to process this data, uncover hidden patterns, and classify fruits based on quality attributes without causing physical damage. This research focuses on utilizing machine learning approaches to predict mango quality using sensor-based data. 
A dataset containing features such as sweetness, acidity, ripeness, and juiciness was used to train and evaluate multiple machine learning models: Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes and an Automated Multi-Layer Perceptron (AutoMLP). Among these models, AutoMLP achieved the highest accuracy, 98.46%, demonstrating how well it handles intricate, non-linear relationships in the data.

2. Literature Review

The quality assessment of fruits, particularly mangoes, has traditionally relied on destructive testing methods, including chemical analysis and physical sampling. However, these methods are labor intensive, time consuming and unsuitable for large-scale applications. This has led to a growing interest in non-destructive techniques, which aim to preserve the fruit’s integrity while providing accurate quality measurements. Ref. [1] overcame the drawbacks of conventional HPLC techniques by using DRIFT-FTIR spectroscopy to predict fructose and glucose concentrations in Kenyan mango cultivars. The study reported strong predictive performance for these sugars and a higher sugar content in fruits exposed to the sun; predictions for sucrose and maltose, however, were less accurate.

In 2023, Paiva-Peredo and colleagues used near-infrared spectroscopy to evaluate the dry matter of mangoes using conventional methods. Their best model, PLSR with MSC pre-processing, achieved an RMSE of 1.6142% DM [2]. In 2022, Capela and colleagues addressed the shortcomings of current molecular theories by using deep learning to predict sweetness from the chemical structures of compounds. Using the largest library of bitter and sweet substances, they created deep learning models that improved accuracy and scalability and identified 67,724 possible sweeteners from PubChem [3]. Srisungsittisunti [4] used near-infrared spectroscopy to predict Brix values in mangoes, improving accuracy through forward feature selection and ensemble models. The study used 120-day data for training and highlights the advantages of ripe fruit data. The 2013 study by Pornprasit and Natwichai shows that reliable and scalable models are required when using near-infrared spectroscopy to predict the quality of mango fruit. They provide an ensemble classification method that uses weighted sub-classifiers to achieve better scalability and accuracy in dynamic agricultural settings [4]. Ref. [5] developed a hybrid system for the classification of mango ripeness, combining image processing and odor sensing. The system achieved 94.69% accuracy, outperforming standalone techniques and previous methods while improving reliability and scalability for diverse agricultural applications. Ref. [6] developed a non-destructive method for detecting TSS in mangoes using machine learning and reflectance spectroscopy. They compared transformations and preprocessing techniques to determine which model performed better than PCR and to demonstrate the superiority of specific preprocessing techniques. Ref. [7] used near-infrared spectroscopy and regression modeling to assess mango quality. They found that MLPR, with the lowest error rates and the highest R² values, was the most accurate at predicting pH and TSS, supporting its dependability for long-term assessments of agricultural quality.

Multiple studies highlight the effectiveness of non-destructive approaches like NIR spectroscopy and machine learning for assessing mango quality parameters, such as TSS, pH and ripeness. Handheld spectrometers coupled with advanced algorithms have shown promising results for rapid and accurate evaluation [8]. Machine learning models like SVM, Random Forest and FANN have been increasingly adopted for mango grading. These methods provide significant accuracy improvements over traditional classification, addressing inconsistencies and inefficiencies [9]. Technologies such as low-cost multispectral sensors and portable spectrometers are paving the way for practical applications in agriculture, offering low-cost solutions for grading and sorting fruit by quality [10].

3. Methodology

The proposed methodology for this research study entails using machine learning (ML) classifiers inside an integrated framework to detect mango quality.

3.1. Models

3.1.1. K-Nearest Neighbors

K-Nearest Neighbors (KNN) is a simple instance-based learning method for classification and regression problems. It predicts the class or value of a new data point from the majority class or the average value of its closest neighbors. The “K” in KNN is the number of neighbors to consider. To generate a prediction, KNN measures the distance, usually the Euclidean distance, between the features of the new data point and those of the existing data points, and assigns the new point to the group it most closely resembles.
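The neighbor vote described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the two features (normalized sweetness and acidity), the sample values, and the function names are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized (sweetness, acidity) readings, invented for illustration.
train_X = [(0.9, 0.2), (0.8, 0.3), (0.5, 0.5), (0.4, 0.6), (0.1, 0.9), (0.2, 0.8)]
train_y = ["Ripe", "Ripe", "Fresh", "Fresh", "Rotten", "Rotten"]

prediction = knn_predict(train_X, train_y, (0.85, 0.25), k=3)
print(prediction)
```

With k = 3 the two nearest "Ripe" samples outvote the single "Fresh" neighbor, so the query is labeled "Ripe". Larger values of k smooth the decision but can blur class boundaries.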

3.1.2. Naïve Bayes

Naive Bayes is a machine learning technique that applies Bayes’ theorem under the “naive” assumption that features are conditionally independent given the class. It is frequently used for classification, especially with high-dimensional data. The technique computes the probability that a data point belongs to each class by combining the class prior with the likelihood of each feature value under that class, and then selects the most probable class.
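As a rough sketch of this computation, the following Python code fits a Gaussian Naive Bayes classifier. The Gaussian form of the per-feature likelihood is an assumption made here for numerical features; the paper does not state which likelihood was used. The data and function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant avoids division by zero for constant features.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(X), stats)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical normalized (sweetness, acidity) readings, invented for illustration.
X = [(0.9, 0.2), (0.8, 0.3), (0.2, 0.8), (0.1, 0.9)]
y = ["Ripe", "Ripe", "Rotten", "Rotten"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, (0.85, 0.25)))
```

Summing log likelihoods across features is exactly where the independence assumption enters: each feature contributes to the score separately.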

3.1.3. Decision Tree

The decision tree is a supervised machine learning technique for classification and regression. Each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node holds a class label or a numerical value. Because this tree structure is straightforward and easy to visualize, decision trees are helpful for understanding how a prediction is reached. Building a decision tree starts with calculating entropy, which measures the dataset’s level of uncertainty from the distribution of its class labels; attributes whose tests reduce entropy the most are chosen for splitting.
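The entropy calculation can be illustrated with a short Python sketch. It uses the standard Shannon entropy, H = -Σ p_i log2 p_i, and the information gain of a candidate split; the labels and function names are hypothetical and only demonstrate the arithmetic.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical labels, invented for illustration.
labels = ["Ripe", "Ripe", "Rotten", "Rotten"]
perfect_split = [["Ripe", "Ripe"], ["Rotten", "Rotten"]]
useless_split = [["Ripe", "Rotten"], ["Ripe", "Rotten"]]

print(entropy(labels))
print(information_gain(labels, perfect_split))
print(information_gain(labels, useless_split))
```

A balanced two-class set has an entropy of 1 bit. A split that separates the classes completely recovers all of it as information gain, while a split that leaves each group equally mixed gains nothing.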

3.1.4. AutoMLP

The term “AutoMLP” describes a method for automatically creating and training Multi-Layer Perceptrons (MLPs) for machine learning applications. It combines the strength of MLP architectures with automated procedures that maximize performance, reducing manual involvement in model configuration and hyperparameter adjustment.

3.1.5. Neural Network

A neural network is a computational model that draws inspiration from the structure and operation of the human brain. It is made up of interconnected layers of nodes (neurons) that process input and extract patterns in order to solve problems such as classification, regression, and clustering. Neural networks are a key element of deep learning and give computers the ability to recognize intricate patterns in data.
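To make the idea of layered processing concrete, here is a minimal sketch of a forward pass through a small fully connected network. The layer sizes, the ReLU and softmax activations, and the hand-picked weights are all assumptions chosen for illustration; they are not taken from the study.

```python
import math


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias for each neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def forward(x, layers):
    """Apply ReLU after every hidden layer and softmax after the output layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


# Hand-picked weights for a 2-input, 2-hidden-neuron, 3-class network (illustrative only).
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0.0, 0.0, 0.0]),
]
probs = forward([0.9, 0.2], layers)
print(probs)
```

Training consists of adjusting the weights and biases so that the output probabilities match the known labels; that step is omitted here.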

4. Framework

A machine learning framework is an interface that enables developers to design and deploy machine learning models more rapidly and easily. It usually consists of a sequence of processes that starts with data collection and preprocessing, continues through model selection and training, and concludes with model evaluation and deployment. The framework for this study involves several key steps for developing a mango quality prediction system using machine learning models. The first step was dataset preparation, where mango samples were collected and categorized as Fresh, Ripe or Rotten based on expert evaluations. Sensor data, including e-nose readings for aroma and pH readings for acidity, along with physical attributes such as color, weight, and size, were recorded. The dataset was then preprocessed by cleaning noisy data, normalizing numerical features and creating derived features like a sweetness index and ripeness score. It was split into training (80%) and testing (20%) subsets while keeping class representation balanced. Four machine learning models were applied: Decision Tree provided interpretable results, Naive Bayes was efficient for mixed data and KNN captured local relationships. AutoMLP, a deep learning model with automated hyperparameter tuning, achieved the highest accuracy of 98.46%, outperforming the other models by effectively capturing complex patterns in the data. The models were assessed using metrics such as accuracy, precision, recall, and F1-score. AutoMLP emerged as the best-performing model and was deployed as the core of a real-time mango quality prediction system. This system uses sensor data and physical attributes as input to classify mangoes non-destructively, making it suitable for agricultural and retail applications.

5. Dataset Description

The mango dataset used in this study is a structured and comprehensive collection designed to predict mango quality accurately and non-destructively. It contains 8000 samples, each characterized by seven numerical features and one categorical label. The dataset includes the physical and chemical attributes critical for determining mango quality. Key features include MangoSize, MangoWeight and MangoSoftness, which provide insights into the physical properties of the fruit. MangoSweetness and MangoAcidity capture the fruit’s chemical composition, offering a direct correlation with taste and freshness. MangoHarvestTime and MangoRipeness are additional indicators that assess the fruit’s maturity and readiness for consumption. The target variable MangoQuality categorizes the mangoes into quality grades such as A-Grade, serving as the benchmark for model predictions. The dataset is well-prepared and has no missing values, ensuring consistency and reliability for machine learning applications. The numerical features are normalized, which enhances the performance of the predictive models by ensuring that all features contribute proportionately to the output. This dataset is an ideal resource for training and evaluating machine learning algorithms due to its rich combination of physical, chemical, and categorical data. It supports the development of a robust, non-destructive mango quality prediction system that can be applied in agricultural and retail settings for efficient quality control.
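The paper does not state which normalization was applied to the numerical features. As one common possibility, the sketch below shows min-max scaling, which maps each feature to the range [0, 1] so that features measured on large scales (such as weight) do not dominate distance-based models. The function name and the sample weights are hypothetical.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical mango weights in grams, invented for illustration.
weights_g = [250.0, 300.0, 400.0, 350.0]
normalized = min_max_normalize(weights_g)
print(normalized)
```

In practice the minimum and maximum would be computed on the training subset only and then reused for the testing subset, so that no information from the test data leaks into training.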

5.1. Retrieve Mango Data

This is the initial step where the mango dataset is imported into the system. The dataset contains key features such as size, weight, sweetness, softness, acidity, ripeness, and quality labels. The input data is loaded for preprocessing and model training.

5.2. Replace Missing Values

Any incomplete or missing values in the dataset are dealt with in this stage. Missing entries can impair the performance of machine learning models, so this phase protects data integrity by substituting suitable values (such as the column’s mean, median, or mode) for them.
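A minimal sketch of this substitution, assuming missing entries are represented as `None` and using Python's standard `statistics` module, might look as follows. The function name and the acidity readings are hypothetical.

```python
import statistics


def impute(column, strategy=statistics.mean):
    """Replace None entries with a statistic computed from the observed values."""
    observed = [v for v in column if v is not None]
    fill = strategy(observed)
    return [fill if v is None else v for v in column]


# Hypothetical pH readings with one missing entry, invented for illustration.
acidity = [3.5, None, 4.5, 4.0]
print(impute(acidity))                     # mean of the observed values
print(impute(acidity, statistics.median))  # median of the observed values
```

Passing `statistics.mode` as the strategy would cover categorical columns in the same way.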

5.3. Split Data

The dataset is divided into training and testing subsets. The split is usually carried out in an 80–20 or 70–30 ratio, which leaves sufficient data for model training while setting some aside to assess the model’s performance on unseen data.
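One way to carry out an 80–20 split while keeping the class proportions equal in both subsets is a stratified split. The sketch below is an assumption about how such a split could be done, not a description of the tool used in the study; the function name, seed, and toy data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, labels, test_ratio=0.2, seed=42):
    """Split the data so that each class contributes test_ratio of its samples to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    train = [(rows[i], labels[i]) for i in train_idx]
    test = [(rows[i], labels[i]) for i in test_idx]
    return train, test


# Toy data: 30 placeholder rows, 10 per class (illustrative only).
rows = list(range(30))
labels = ["Fresh"] * 10 + ["Ripe"] * 10 + ["Rotten"] * 10
train, test = stratified_split(rows, labels)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which matters when several models are compared on the same testing subset.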

5.4. Training

This is the main model training step. AutoMLP (Automated Multi-Layer Perceptron) is a neural network-based method that automatically chooses the optimal architecture and hyperparameters, including the number of layers, neurons, activation functions, and learning rate. AutoMLP learns patterns and correlations between the input features and the target labels by using the training data from the previous stage.
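The paper does not describe the search strategy AutoMLP uses internally. To illustrate the general idea of automated selection, the sketch below runs an exhaustive grid search over a small hypothetical search space. The `toy_score` function is an invented stand-in for "train an MLP with this configuration and return its validation accuracy"; it is not a real training routine.

```python
import itertools


def auto_select(search_space, train_and_score):
    """Try every combination in the search space and keep the best-scoring one."""
    best_config, best_score = None, float("-inf")
    keys = sorted(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = train_and_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space, invented for illustration.
search_space = {
    "hidden_layers": [1, 2],
    "neurons": [8, 16, 32],
    "learning_rate": [0.01, 0.1],
}


def toy_score(config):
    """Invented scoring formula standing in for real training and validation."""
    return (
        1.0
        - abs(config["neurons"] - 16) / 100
        - abs(config["learning_rate"] - 0.1)
        - 0.01 * config["hidden_layers"]
    )


best_config, best_score = auto_select(search_space, toy_score)
print(best_config, best_score)
```

In a real setting, `train_and_score` would fit a network on the training subset and evaluate it on held-out validation data, so the testing subset stays untouched until the final evaluation.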

5.5. Apply Model

After training, the AutoMLP model is applied to the testing data. This stage uses the trained model to predict mango quality for the unseen testing subset and generates the predicted labels.

5.6. Performance Evaluation

The model’s performance is assessed in the last stage as shown in Figure 1. The model’s ability to predict mango quality is evaluated using metrics including accuracy, precision, recall, F1-score, and confusion matrix. This phase offers a numerical assessment of the model’s efficacy and identifies areas for improvement.
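The metrics named above can be computed directly from the true and predicted labels. The following sketch uses the standard definitions of accuracy, confusion matrix, and per-class precision, recall, and F1-score; the label lists are hypothetical and do not reproduce the study's results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def per_class_metrics(y_true, y_pred, classes):
    """Return {class: (precision, recall, f1)} using one-vs-rest counts."""
    metrics = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics


# Hypothetical labels, invented for illustration.
classes = ["Fresh", "Ripe", "Rotten"]
y_true = ["Fresh", "Fresh", "Ripe", "Ripe", "Rotten", "Rotten"]
y_pred = ["Fresh", "Ripe", "Ripe", "Ripe", "Rotten", "Rotten"]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, classes)
metrics = per_class_metrics(y_true, y_pred, classes)
print(acc, cm, metrics["Ripe"])
```

In this toy example one "Fresh" mango is misclassified as "Ripe", which lowers the precision of the "Ripe" class to 2/3 while its recall stays at 1.0. Accuracy alone would not show which class absorbed the error, and that is why the confusion matrix is reported alongside it.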

6. Results

This research introduces a non-destructive approach to assessing mango quality using machine learning and sensor technologies. By leveraging e-nose sensors to measure sweetness and aroma, color sensors to detect surface hues, and pH sensors to evaluate acidity, the study extracts the critical features that reflect mango freshness, ripeness, and overall quality. The dataset was analyzed using four machine learning models: Decision Tree, KNN, Naive Bayes and AutoMLP; Figure 2 compares their accuracy. Among these, AutoMLP performed best, with an accuracy of 98.46%, underscoring its ability to handle complex feature interactions effectively. The proposed system not only predicts the overall quality of mangoes but also identifies and highlights rotten areas, making it a comprehensive solution for fruit quality assessment. This non-invasive technique preserves the integrity of the fruit, offering significant advantages over traditional destructive methods. With potential applications in agriculture, food processing, and supply chain management, this framework paves the way for sustainable and precise quality evaluation. Future advancements could include integrating additional sensor technologies, such as hyperspectral imaging, and exploring ensemble and transfer learning models to further enhance performance and adaptability.

7. Conclusions

In conclusion, this research successfully demonstrates a non-destructive method for assessing mango quality using machine learning and sensor-based technologies. By analyzing features such as sweetness, color, and acidity, the system accurately evaluates mango freshness, ripeness, and overall quality while identifying and marking rotten areas. Among the models applied, AutoMLP achieved the highest accuracy of 98.46%, highlighting its robustness in handling complex data. This approach offers a practical, efficient, and sustainable solution for agricultural and food industry applications, eliminating the need for invasive testing. For future work, the integration of advanced sensors like hyperspectral imaging, the exploration of ensemble or transfer learning techniques, and the extension of the framework to other fruits and agricultural products are recommended. Additionally, the development of a real-time mobile or web-based application, coupled with IoT-enabled devices, could facilitate automated and scalable quality monitoring across the supply chain, paving the way for enhanced precision agriculture.

Author Contributions

M.M. was responsible for conceptualization, methodology design, data curation, formal analysis, and drafting the original version of the manuscript, as well as managing overall project administration. M.H. contributed to literature review, validation, preparation of visualizations, and participated in reviewing and editing the manuscript. A.D.W. provided supervision, resources, critical revisions to enhance the quality of the work, and supported funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Olale, K.; Walyambillah, W.; Mohammed, S.A.; Sila, A.; Shepherd, K. Application of DRIFT-FTIR Spectroscopy for Quantitative Prediction of Simple Sugars in Two Local and Two Floridian Mango (Mangifera indica L.) Cultivars in Kenya. J. Anal. Sci. Technol. 2017, 8, 21.
2. Paiva-Peredo, E.; Morales-Hualla, R.; Gálvez-Porras, I.; Trujillo, W. Non-Destructive Evaluation of Dry Matter in ‘Edward’ Mango by Reflectance Spectroscopy. In Proceedings of the LACCEI International Multi-Conference for Engineering, Education and Technology, Buenos Aires, Argentina, 17–21 July 2023.
3. Capela, J.; Correia, J.; Pereira, V.; Rocha, M. Development of Deep Learning Approaches to Predict Relationships Between Chemical Structures and Sweetness. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18–23 July 2022; IEEE: Piscataway, NJ, USA, 2022.
4. Srisungsittisunti, B. Forward Feature Selection for Ensembles to Predict Brix Values in Mango Fruits based on NIR Spectroscopy Technique. Int. J. Sci. 2018, 15, 43–57.
5. Anderson, N.T.; Walsh, K.B. Review: The evolution of chemometrics coupled with near infrared spectroscopy for fruit quality evaluation. J. Near Infrared Spec. 2022, 30, 3–17.
6. Al-Sanabani, D.G.A.; Solihin, M.I.; Pui, L.P.; Astuti, W.; Ang, C.K.; Hong, L.W. Development of Non-Destructive Mango Assessment Using Handheld Spectroscopy and Machine Learning Regression. J. Phys. Conf. Ser. 2019, 1367, 012030.
7. Shahzad, M.F.; Xu, S.; Lim, W.M.; Yang, X.; Khan, Q.R. Artificial Intelligence and Social Media on Academic Performance and Mental Well-Being: Student Perceptions of Positive Impact in the Age of Smart Learning. Heliyon 2024, 10, e29523.
8. Zhang, W.; Xu, M.; Feng, Y.; Mao, Z.; Yan, Z. The Effect of Procrastination on Physical Exercise Among College Students: The Chain Effect of Exercise Commitment and Action Control. Int. J. Ment. Health Promot. 2024, 26, 611–622.
9. Nguyen, C.N.; Phan, Q.T.; Tran, N.T.; Fukuzawa, M.; Nguyen, P.L.; Nguyen, C.N. Precise Sweetness Grading of Mangoes (Mangifera indica L.) Based on Random Forest Technique with Low-Cost Multispectral Sensors. IEEE Access 2020, 8, 212371–212382.
10. Ulya, M.; Chamidah, N.; Saifudin, T. Mango Quality Prediction Based on Near-Infrared Spectroscopy Using Multi-Predictor Local Polynomial Regression Modeling. F1000Research 2023, 12, 656.

Figure 1. RapidMiner model.

Figure 2. Model accuracy comparison for mango quality prediction.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
