Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study

Ozdemir, Serkan

doi:10.3390/horticulturae11030308

Open AccessFeature PaperArticle

Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study

by

Serkan Ozdemir

^1,2

¹

Transport and Planning Department, Delft University of Technology, 2628 CN Delft, The Netherlands

²

Department of Information Systems, Middle East Technical University, Ankara 06800, Türkiye

Horticulturae 2025, 11(3), 308; https://doi.org/10.3390/horticulturae11030308

Submission received: 19 February 2025 / Revised: 9 March 2025 / Accepted: 11 March 2025 / Published: 12 March 2025

(This article belongs to the Special Issue AI-Powered Phenotyping of Horticultural Plants)

Download

Browse Figures

Versions Notes

Abstract

To select a good quality watermelon, one needs the ability and experience to recognize specific patterns in its visual characteristics. As buyers usually cannot taste the watermelon beforehand, the outer patterns of a good quality watermelon may vary depending on the perspective of the purchaser. As a result, there is a gradual adoption of new generative artificial intelligence (AI) tools in the field of horticulture. These tools are expected to minimize bias in human perception when determining the quality of a watermelon based on its outer characteristics. This study aimed to compare the quality of watermelons selected by generative AI with a panel sensory evaluation test. The results of the two case studies indicate a significant difference in the quality of the generative AI-selected watermelons. As an average, watermelon evaluators favored the watermelons selected by ChatGPT as the best based on the Wilcoxon rank sum test and paired t-test (p < 0.05). In conclusion, watermelons can be selected by ChatGPT with minimal effort, promptly meeting consumer expectations.

Keywords:

watermelon; artificial intelligence; ChatGPT; visual assessment; panel test

1. Introduction

Watermelon selection places individuals under pressure due to an inability to taste a slice in advance. Consequently, watermelon quality is often inspected by its outer characteristics, patterns, and audible perceptions. Common intuitive methods include inspecting the strips, color, sound, shape, size, surface defects, and tail characteristics. The patterns derived by using these characteristics are occasionally proven to identify a watermelon’s quality. Nevertheless, tiny details, day-to-day mood, or sensation characteristics may often prevent the selection of a high-quality watermelon.

In the literature, two views were featured to inspect the characteristics of watermelon: the internal quality and external quality. The size, color, texture, and surface flaws are among the attributes used to assess the exterior quality of watermelons [1]. In the meantime, indices, including soluble solids, sugar, acidity, sweetness, and firmness, are frequently used to evaluate internal quality. It is customary to use the visual inspection method with a personal form to detect the exterior qualities, but this approach is expensive, time-consuming, and non-standard [2]. Refractometry, a Potentiometric reference method, high-performance liquid chromatography, and Magness–Taylor penetration [3] are the methods used to measure the soluble solids content (SSC), total acid content (TAC), sugar content, and firmness, respectively. At the moment, these techniques are the accepted way to gauge internal quality. Nevertheless, the conventional methods of detection are localized, labor-intensive, and damaging. According to Menezes Ayres et al. [4], fruit ripeness is the most important all-encompassing quality measure for growers, retailers, and consumers, as it is associated with both internal and external quality traits.

While traditional methods for selecting high-quality watermelons remain common, several researchers have pioneered the use of image analysis techniques in conjunction with these methods. These scholars have applied these techniques in three distinct areas: machine vision (MV), visible/near-infrared spectroscopy (Vis/NIRS), and hyperspectral imaging (HSI).

MV is particularly effective for assessing the exterior quality of watermelons due to its ability to capture rich phenotypic information, such as shape, color, and texture [5]. A typical MV detection system comprises a computer, a light source, a tray, and a camera. The light source can be either strip lighting or bulbs, which are symmetrically positioned around a dark box to minimize shadows. The advantages of MV technology include its low cost, user-friendly operation, and rapid data processing speed, making it an excellent choice for developing an online fruit quality detection system.

A number of researchers have investigated the possible use of Vis/NIRS in the quality detection of watermelons. A composite measure that has raised issues is ripeness. Jie et al. [6] tested a novel method called the peak ratio for the four-class ripeness (unripe, medium-ripe, ripe, and overripe) classification of watermelons, which included the peak intensity ratio and the normalized difference intensity of peaks. They discovered that the Vis/NIR transmission spectra of watermelons with different ripeness had two prominent peaks at around 730 and 803 nm. The best results were obtained with a classification accuracy for prediction (ACC_P) of 88.10% when the peak intensity ratio was optimized using a correction factor based on the categorization boundary. To evaluate the watermelon’s ripeness, Lazim et al. [7] and Vega-Castellote et al. [8] used support vector machine (SVM) and partial least squares discriminant analysis (PLS-DA) models, respectively. The texture attribute is one of the main ripeness assessment indices. In order to maximize feature extraction, Khurnpoon and Sirisomboon [9] conducted a study to identify the texture attributes (initial firmness, rupture force, average firmness, rupture distance, toughness, average penetrating force, and penetrating energy) of netted muskmelons using PLS.

In the 400–1100 nm spectral range, studies employing HSI with a diffuse reflection mode typically provide good performances. Previous research by Ma et al. [10] looked at the use of HSI to measure the SSC and hardness of Hami melons. According to the prediction results, firmness and the SSC had a coefficient of determination for prediction (R²_P)s of 0.42 and 0.50 for the PLS models without spectral feature extraction, respectively. These values were initiatory and not acceptable. The HSI data are massive and include unrelated information. The best wavelengths must be chosen in order to reduce superfluous data and enhance model performance. Sun et al. [11] looked at the PLS to identify the internal quality of Hami melons and the genetic algorithm, successive projections algorithm (SPA), and competitive adaptive reweighted sampling (CARS) for choosing the best wavelengths. The CARS-PLS model performed the best, according to the data, with an R²_P s of 0.92 for the SSC, 0.83 for TAC, and 0.75 for firmness. Furthermore, Jing-tao et al. [12] conducted a study to determine the quality of Hami melons. They found that CARS-SPA-SVM produced the best R²_P values of 0.88 and 0.68 for the SSC and firmness prediction, respectively, and that CARS-PCA-SVM produced a good model with an ACC_P of 94.00% for ripeness.

The majority of research has been on using conventional machine learning models—which have already proven effective in the quality detection of watermelons—to analyze spectra, photos, and acoustic vibration signals. However, the drawback of these approaches is that they rely on human labor and past expertise for feature engineering. Comprehensive deep learning algorithms offer a proficient approach to analyzing the data produced by optical and acoustic vibration sensors, as demonstrated by their effectiveness in some MV and Vis/NIRS investigations. Furthermore, the ability to generalize presents another difficulty for the widespread use of conventional machine learning models in practice. Guo et al. [13] suggested using auto-encoder neural networks in conjunction with the Internet of Things to transfer the NIR model for identifying an apple’s SSC and enhancing model portability. The outcomes offered a point of reference for enhancing the quality detection models for watermelon’s generalization. While deep learning models have been applied to the analysis of acoustic vibration signals from fruit, including oranges [14] and apples [15,16], more research is needed to fully explore the potential of these models for acoustic vibration-based watermelon quality detection. Furthermore, it is mentioned that using deep learning for the quality identification of watermelons still presents challenges due to the need for huge datasets and interpretable models. It is challenging to gather a significant amount of information about the acoustic and optical vibrations of various samples in a single experiment. Data augmentation techniques, like the Imgaug data enhancement library [17] for creating images of Hami melon surface defects and deep convolutional generative adversarial networks [18] for creating shortwave infrared spectra of pesticide residues on the Hami melon’s surface, have been proposed as solutions to this problem. For data-driven deep learning models, it is important to pay more attention to how to guarantee that the generated data are accurate and close to reality.

The selection of watermelons and the determination of certain standards affects both buyers and sellers from different perspectives. Buyers usually judge the quality of the watermelon according to the criteria of juiciness, crispiness, sweetness, and freshness. Therefore, they want to buy the optimal average watermelon that takes all the characteristics into account. Sellers, on the other hand, focus on maximizing their profits and the highest sales volume. From both points of view, it is important to determine the quality of the watermelon more precisely based on various quality-related characteristics without damaging the fruit.

There are numerous non-destructive methods, as mentioned earlier, such as acoustic analysis, optics, X-ray imaging, ultrasonics, near-infrared spectroscopy, Raman spectroscopy, hyperspectral imaging, magnetic resonance imaging, and optical coherence tomography to determine the optimal harvest time for watermelon [19]. However, these laboratory and production line quality estimation methods cannot be applicable by customers to make a real-time decision at the market level.

Therefore, this study aims to test and determine novel, non-destructive, generative artificial intelligence (AI)-based tools for selecting high-quality watermelons by analyzing the photos of customers that were taken at the retail shelf using prompt engineering principles. The following hypotheses were formulated to test whether the intended objectives were met. (1) ChatGPT GPT-4o version can accurately assess the ripeness and quality of watermelons from smartphone photos, and (2) there is a significant correlation between selection by ChatGPT and consumer perception.

The remainder of this paper is structured as follows: Section 2 presents the materials and methods, followed by Section 3, which outlines the results. Section 4 provides a detailed discussion of the findings, and Section 5 concludes the study.

2. Materials and Methods

2.1. Experimental Materials and Selection Methodology

The images were taken with basic smartphones owned by the partners in natural lighting conditions on the shelf as though they were from a usual customer’s point of view. The smartphone model used was the Samsung Galaxy A50 (Seoul, Republic of Korea), and it has 3 cameras on the back side, which have 25 MP, 8 MP, and 5 MP resolutions. The apertures for the cameras are F1.7, F2.2, and F2.2. The first camera was the main camera for the photograph, while the second camera offered an extra wide angle, and the third camera offered depth perception. The photographs were captured in the afternoon under optimal sunlight conditions, ensuring the best illumination to highlight the watermelon’s features. Three pictures were taken per shelf, positioned at the top, middle, and bottom, taking into account the following considerations: capturing pictures during daylight hours, focusing on as many watermelons as possible, avoiding an immediate movement of the camera after capturing the picture, and positioning the camera so that sunlight does not shine directly on it. For this study, images representing a diverse real scenario, ensuring no blurriness or distortion according to the Variance of the Laplacian method, were selected, capturing different positions of the camera, featuring different varieties, using the screen to reveal hidden watermelons, and capturing objects resembling watermelons.

This study used the most pervasive and pioneer generative AI tool, named ChatGPT, in its GPT-4o version. Although it started as merely a chatbot tool, ChatGPT added image and video recognition tools, as well [20]. This study investigates the accuracy of image recognition tools in watermelon cases. There are five steps taken to investigate the quality of watermelons in ChatGPT: photo uploading, image recognition and feature extraction, analysis and evaluation, generative AI interpretation, and user feedback. The flowchart of processes in ChatGPT is shown in Figure 1.

When a photo is uploaded, it starts the image analysis process. This involves recognizing and extracting features from the image, such as the color, shape, size, texture, and patterns. The extracted features are then analyzed based on predefined criteria involving the color, shape, size, texture, and patterns. A generative AI interprets the results, integrates the findings, and generates output in the form of text and a score. Finally, feedback is generated for the user based on the output.

The image recognition model architecture in ChatGPT relies on convolutional neural network (CNN) architecture. The CNN consists of the input layer, convolutional layers, pooling layers, fully connected layers (dense layers), and an output layer. In the input layer, the image is resized (e.g., 224 × 224 pixels) and normalized. On the other hand, the convolutional layer consists of filters that scan the image to detect the edges, colors, and textures. The early convolutional layers detect low-level features like color patches and texture, while deeper convolutional layers detect high-level features like shape, symmetry, and field spots. The pooling layers reduce dimensionality while preserving key features. Fully connected layers classify images into categories as good or bad watermelon by using softmax activation probability scores. Lastly, the output layer produces a classification label as either best or worst and highlights the watermelon in the image. The parameters that affect the model performance are the kernel size, stride, activation functions, number of layers, and dropout rate.

Prompts used in ChatGPT (Figure 2);

-: Select best watermelon. Give coordinate. (Attached photograph);
-: Select worst watermelon. Give coordinate. (Attached photograph).

The AI evaluates the color by analyzing the pixel distributions and hue variations, detecting ripeness indicators, such as deep green rinds or yellow ground spots, and aligning with human assessments, where deeper yellow spots suggest longer ripening periods. Shape and size recognition rely on comparing the geometric properties to identify uniformity and symmetry, as irregular shapes can indicate uneven ripening or internal defects. The texture is inferred through surface pattern recognition, where the generative AI analyzes fine-grained details such as vein networks, surface smoothness, or rough patches using edge detection and contrast analysis. In human sensory evaluation, rough or overly shiny surfaces may be associated with under or overripeness. Pattern recognition enables the model to assess striping intensity and uniformity, which often indicate variety-specific ripeness, with well-defined and evenly distributed stripes suggesting optimal growth conditions.

2.2. Sensory Evaluation

The experts used for the watermelon quality assessment were selected in terms of their experience in farming and selling watermelons in the past. The watermelons, both of high quality and low quality, were selected from photographs taken of the store uploaded to ChatGPT.

The taste panel conducted sensory tests that were performed according to ISO 13299 standards by two different groups: Case Study I and Case Study II consisted of 20 and 39 selected panelists from students and faculty members, and each member of the panel was informed about the experiment and trained according to the standard after they agreed to conduct it [21]. The selected watermelon from the best and least groups was sliced into small pieces, and the panelist assessed the quality of the watermelon based on the crispiness [22], juiciness [23], sweetness [24], and freshness [23] characteristics. The panelists randomly tasted watermelon from two selected groups and were informed of the scoring procedure prior to the test, and all scores were based on the strongest sensation they had ever felt; according to ISO 8589, ratings from 0 to 5 for all attributes indicate an increase in the intensity of consumer perception of the attribute [25].

2.3. Data Analysis

All variable sizes are equal, and there are two groups. This study used Shapiro–Wilk’s W test in order to check the normality of the samples. For those variables that had a normal distribution, a paired t-test was applied; for others, the Wilcoxon signed rank test was conducted. The level of significance was determined as 0.05.

3. Results

Although the fruit quality attributes, such as shape, size, color, and peel pattern, are important to retailers, flesh color, sweetness, crispiness, and refreshment factors are important to consumers. In the current study, the relationship between AI selection and consumer preference was investigated in two experiments to test the significant difference between the best and the least good watermelons selected by ChatGPT.

3.1. Case Study I

The results from the statistical findings for Case Study I are represented in this section. They consist of generative AI results, experts from the watermelon producing or selling process evaluation and comparison, and the significance of the results. The ratings of 20 evaluators on the crispiness, sweetness, juiciness, and freshness of watermelon were selected from the best and least good groups to determine the prospective consumer preferences for watermelon from two groups (Table 1). In the tables, “1” represents the least good watermelon, and “2” represents the best watermelon selected by ChatGPT.

There was a significant difference in the perceived crispiness of watermelon between the two groups, with the watermelon from the best group (3.60) having the highest score for crispiness and the watermelon from the least good group (2.85) receiving the lowest score. Similar to crispiness, the sweetness of the watermelon from the best group was also rated more positively by the testers. Correspondingly, more people rated the watermelon less favorably than the least good group. There was a similar trend in the evaluation, with the best watermelon group scoring highest for juiciness and freshness. In contrast, the watermelons in the least good group scored lowest for juiciness and freshness. The testers scored higher for juiciness for all indicators, with only a few scorings lower.

Table 1 demonstrates the descriptive statistics of the expert opinion evaluations for Case Study I. There were a total of 20 experts in the field of watermelon production or sales. The demographics of the experts were between 25 and 45 years old, and 85% were male. The average values of the variables fluctuate between 2.75 and 4.2. The median values, on the other hand, vary between 2.00 and 4.00.

The Shapiro–Wilk’s W test reveals that the p-values are lower than a threshold of 0.05 for the crisp₁ data, both samples for the sweetness data, and both samples for the juiciness data (Table 2). This indicates that the datasets are not normally distributed. Therefore, a nonparametric test needed to be conducted [26]. In contrast, the samples from the fresh and average data are normally distributed significantly (p-value > 0.05). Hence, these data fall into a parametric test investigation.

In order to check the significant difference between pair samples, the Wilcoxon rank sum test and paired t-test were conducted (Table 3). The tests were selected based on Du Prel et al. [26]’s study, which chooses tests based on normality, two or more groups, and whether they have an equal sample size. The significant difference between crispiness, sweetness, and juiciness was tested using the Wilcoxon rank sum test, and freshness and the average were tested through the paired t-test.

The Wilcoxon rank sum test results reveal that the p-values are below the threshold (0.05) for the “crispiness” and “sweetness” variables, leading to reject the null hypothesis (H0) and indicating a significant difference between the measurements of these two methods used in the experiment. However, the alternative hypothesis (H1) is not rejected, suggesting that the medians of these two distributions are not equal. In contrast, the p-value for the “juiciness” variable is higher than the threshold (0.17), indicating the acceptance of the null hypothesis. This suggests that there is no significant difference between the measurements of these two methods used in the experiment, as the medians of the two distributions are equal.

The paired t-test results reveal that the p-values are above the threshold (0.05) for the freshness variable values. It can be concluded that there is no significant difference in the freshness levels of the watermelons. By contrast, the average values from both samples are significantly different (p-value ≤ 0.05). The results of the average values indicate higher mean values attained from the average values of the best-selected watermelon, which are also significantly different from the least good selected watermelon.

3.2. Case Study II

Case Study II involved students in the field of food engineering and related departments, as well as members of the gastronomy club at the university. A total of 39 people evaluated the quality of watermelons based on four different aspects (Table 4). The participants’ ages ranged from 18 to 30, and 90% of them were male.

Table 4 demonstrates the descriptive statistics of the expert opinion evaluations for Case Study II. The average values of the variables varied between 3.28 and 4.08. The median values, on the other hand, varied between 3.00 and 4.00.

Table 5 presents the p-values of the normality tests for the investigated quality variable. Noticeably, all the variables except the average values are lower than the threshold, indicating that the data are not normally distributed. The average data, on the other hand, are significantly normally distributed. According to Du Prel et al. [26], normally distributed datasets should be tested with the Wilcoxon rank sum test and the datasets that are not normal need to be tested with paired t-tests.

The results in Table 6 were used to test four variables (crispiness, sweetness, juiciness, and freshness) using the Wilcoxon rank sum test. The analysis showed that crispiness and freshness had p-values of less than 0.05, indicating that the samples for these variables are significantly different from each other. However, for the juiciness and freshness samples, the p-values were above 0.05, suggesting that the samples for these variables are not significantly different from each other. Additionally, the paired t-test was used to compare the average values of the samples. The obtained p-value of 0.05 was below the threshold, indicating a significant difference between the average values of the best and least good selected watermelon quality perceived by the evaluator.

4. Discussion

The findings of this experimental study implicate that the watermelon selection on quality characteristics can be guided by novel generative AI tools such as ChatGPT. This study found that the generative AI senses using only outside characteristics such as shape, size, color, texture, and patterns are adequate to decide the inner quality. The experts from two case studies graded, from their perspectives, the crispiness, juiciness, sweetness, and freshness characteristics. The experts’ comparison between the best watermelon and the worst watermelon selected by ChatGPT is also compatible.

On the other hand, since the watermelons are already preselected before being placed on the shelf, the variations are low among the quality levels of watermelons. It is usually expected that shelves are organized to allure potential customers with the best-looking and highest-quality goods in the store. Therefore, the possibility of being exposed to rotten watermelons is quite low in a store. The results also show that the overall grades of best-selected and least good selected watermelons are close to each other. When the tips are taken into account for watermelon selection, such as shape, size, color, and acoustics, the human inspection of all the watermelons gives similar results to each other with minor differences.

Although image processing applications can be used to select the best watermelon, they are advanced solutions that require scientific knowledge. However, customers and sellers often do not have artificial intelligence solutions knowledge and they want to select in the least amount of time. In November 2022, OpenAI published ChatGPT, a large language model (LLM) designed to have a conversation with a user that is human-like. With only human language prompting, ChatGPT can perform remarkable characteristics for a wide range of activities at a previously unheard-of level [27]. Beyond ChatGPT, there has been an active investigation into the possibilities of LLM applications for particular contexts in science, technology, and society, such as agriculture [28] and plant science [29]. Therefore, the integration of generative AI technology into plant science would provide a competitive advantage to watermelon merchants and accurate estimation for purchasers.

The results also indicate that although ChatGPT favored one watermelon, some experts gave low points in terms of their evaluation. This phenomenon was observed in both case studies. However, on average, significantly, the best watermelon was selected by the experts. The experts’ mood could explain this concept regarding their sense of sweetness, juiciness, crispiness, and freshness criterion, as well as their perception of the watermelon’s quality and low visual variations in the watermelons on the store shelf.

Subtle differences in the results between the two experiments could stem from several distinct factors. Sensory perception varied among the evaluators due to individual differences in taste sensitivity, experience, and biases. Since different panels of experts were used in each experiment, their subjective judgments might have influenced the statistical results, leading to discrepancies in the significance levels. Furthermore, the way the tests were conducted—such as the differences in timing, order of evaluation, or even slight variations in instructions—could contribute to inconsistencies in the results. In addition, statistical variability plays a role; while both experiments used the Wilcoxon rank sum and paired t-tests, small differences in sample sizes, distribution characteristics, or the way data were processed could influence whether significance was detected. Another point is the randomness in human perception and response variability, which can introduce inconsistencies, even when following the same methodology, leading to slight but notable differences in the statistical outcomes.

This study was conducted by way of two experiments with 20 and 39 experts, respectively, and the results are limited to their perceptions. Furthermore, the Crimson Sweet watermelon type was used in both experiments in order to ensure consistency. Crimson Sweet is the world’s most widely cultivated and consumed watermelon, prized for its exceptional taste and popularity among growers and consumers alike. The evaluation of watermelon was conducted in only ChatGPT and the watermelon quality results may change in other generative AI tools. In addition, the accuracy of the generative AI outputs is constrained by the quality of the image and the specific type of watermelon that has to be captured in the store. Another limitation is related to AI misjudgments. The model’s selection process is likely influenced by the visual features it was trained to recognize. While AI can detect external characteristics such as color, texture, shape, and surface imperfections, it lacks direct access to internal qualities like sweetness and crispiness, which require sensory input beyond visual analysis. This could lead to instances where AI selects a watermelon that appears ideal but does not meet taste expectations. Additionally, AI relies on pre-trained datasets and patterns from previous images, which may not perfectly align with real-world variabilities in watermelon quality. Factors such as lighting conditions, image resolution, and the angle of photography may also impact AI predictions, leading to occasional misjudgments. In the future, the results of this study can be extended and further compared to other fruits and vegetables.

5. Conclusions

The often tedious and frustrating process of watermelon selection requires a high level of experience and sharp receptors. This is why generative AI tools are increasingly being used in this area. The quality of the watermelon influences sales and retailer confidence. Therefore, it is important to find an effective and objective watermelon selection tool. The statistical results of this study show that there is indeed a significant difference between the best and the worst watermelon selected by ChatGPT and a positive correlation between the quality criteria, such as crispiness, sweetness, juiciness, and freshness. The results of this novel study show that the generative AI learning method is able to recognize the taste expectations of humans for watermelons by uploading a few images to select the best watermelon. Thus, generative AI tools are the next evaluators and helpful assistants for watermelon sellers.

Funding

This research received no external funding.

Data Availability Statement

The datasets presented in this article are not readily available because they were collected under specific conditions that do not allow for unrestricted redistribution. Requests to access the datasets should be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ACC_P	Classification accuracy for prediction
AI	Artificial intelligence
CARS	Competitive adaptive reweighted sampling
HSI	Hyperspectral imaging
LLM	Large language model
MV	Machine vision
PLS-DA	Partial least squares discriminant analysis
R²_P	Coefficient of determination for prediction
SPA	Successive projections algorithm
SSC	Soluble solids content
SVM	Support vector machine
TAC	Total acid content
Vis/NIRS	Visible/near-infrared spectroscopy

References

Ali, M.M.; Hashim, N.; Bejo, S.K.; Shamsudin, R. Rapid and nondestructive techniques for internal and external quality evaluation of watermelons: A review. Sci. Hortic. 2017, 225, 689–699. [Google Scholar] [CrossRef]
Shi, Y.; Guo, J.; Li, X.; Guo, Y.; Liu, Y.; Zhou, J. Research status and development trend of non-destructive testing technology and grading equipment for Hami melon. Packag. Food Mach. 2021, 39, 75–82. [Google Scholar]
Imanpanah, H.; Kasraei, M.; Raoufat, M.H.; Nejadi, J. Development and evaluation of a portable apparatus for bioyield detection: A case study with apple and peach fruits. Int. J. Food Prop. 2015, 18, 1434–1445. [Google Scholar] [CrossRef]
Ayres, E.M.M.; Lee, S.M.; Boyden, L.; Guinard, J. Sensory properties and consumer acceptance of cantaloupe melon cultivars. J. Food Sci. 2019, 84, 2278–2288. [Google Scholar] [CrossRef] [PubMed]
Vorobyev, G.; Subbotin, D.A.; Losevskaya, E. Watermelon Quality Determining from a Photo Using Machine Vision. In Proceedings of the 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus), St. Petersburg, Moscow, Russia, 26–28 January 2021; pp. 2286–2289. [Google Scholar] [CrossRef]
Jie, D.; Zhou, W.; Wei, X. Nondestructive detection of maturity of watermelon by spectral characteristic using NIR diffuse transmittance technique. Sci. Hortic. 2019, 257, 108718. [Google Scholar] [CrossRef]
Lazim, S.S.R.; Nawi, N.M.; Bejo, S.K.; Shariff, A.R.M.; Abdullah, N. Prediction and classification of soluble solid contents to determine the maturity level of watermelon using visible and shortwave near infrared spectroscopy. Int. Food Res. J. 2022, 29, 1372–1379. [Google Scholar] [CrossRef]
Vega-Castellote, M.; Sánchez, M.-T.; Wold, J.P.; Afseth, N.K.; Pérez-Marín, D. Near infrared light penetration in watermelon related to internal quality evaluation. Postharvest Biol. Technol. 2023, 204, 112477. [Google Scholar] [CrossRef]
Khurnpoon, L.; Sirisomboon, P. Rapid evaluation of the texture properties of melon (Cucumis melo L. Var. reticulata cv. Green net) using near infrared spectroscopy. J. Texture Stud. 2018, 49, 387–394. [Google Scholar] [CrossRef]
Ma, B.-X.; Xiao, W.-D.; Qi, X.-X.; He, Q.-H.; Li, F.-X. Nondestructive measurement of sugar content of Hami melon based on diffuse reflectance hyperspectral imaging technique. Spectrosc. Spectr. Anal. 2012, 32, 3093–3097. [Google Scholar] [CrossRef]
Sun, J.; Ma, B.; Dong, J.; Zhu, R.; Zhang, R.; Jiang, W. Detection of internal qualities of hami melons using hyperspectral imaging technology based on variable selection algorithms. J. Food Process Eng. 2017, 40, e12496. [Google Scholar] [CrossRef]
Sun, J.; Ben-xue, M.; Juan, D.; Jie, Y.; Jie, X.; Wei, J.; Gao, Z. Study on maturity discrimination of Hami melon with hyperspectral imaging technology combined with characteristic wavelengths selection methods and SVM. Spectrosc. Spectr. Anal. 2017, 37, 2184–2191. [Google Scholar] [CrossRef]
Guo, Z.; Zhang, Y.; Wang, J.; Liu, Y.; Jayan, H.; El-Seedi, H.R.; Alzamora, S.M.; Gómez, P.L.; Zou, X. Detection model transfer of apple soluble solids content based on NIR spectroscopy and deep learning. Comput. Electron. Agric. 2023, 212, 108127. [Google Scholar] [CrossRef]
Nan, C.; Zhi, L.; Dexiang, L.; Qingrong, L.; Bingnian, J.; Bin, L.; Jian, W.; Yunfeng, S.; Yande, L. Detection of jelly orange granulation disease using a dual-input Resnet-Transformer model (DresT) based on acoustic vibration images and a novel acoustic vibration device. J. Food Compos. Anal. 2024, 132, 106337. [Google Scholar] [CrossRef]
Chen, N.; Liu, Z.; Le, D.; Lai, Q.; Jiang, B.; Li, B.; Wu, J.; Song, Y.; Liu, Y. Acoustic vibration multi-domain images vision transformer (AVMDI-ViT) to the detection of moldy apple core: Using a novel device based on micro-LDV and resonance speaker. Postharvest Biol. Technol. 2024, 211, 112838. [Google Scholar] [CrossRef]
Zhao, K.; Zha, Z.; Li, H.; Wu, J. Early detection of moldy apple core using symmetrized dot pattern images of vibro-acoustic signals. Trans. Chin. Soc. Agric. Eng. 2021, 37, 290–298. [Google Scholar] [CrossRef]
Li, X.; Ma, B.; Yu, G.; Chen, J.; Li, Y.; Li, C. Surface defect detection of Hami melon using deep learning and image processing. Trans. Chin. Soc. Agric. Eng. 2021, 37, 223–232. [Google Scholar] [CrossRef]
Tan, H.; Hu, Y.; Ma, B.; Yu, G.; Li, Y. An improved DCGAN model: Data augmentation of hyperspectral image for identification pesticide residues of Hami melon. Food Control 2024, 157, 110168. [Google Scholar] [CrossRef]
Ibrahim, A.; Daood, H.G.; Égei, M.; Takács, S.; Helyes, L. A Comparative Study between Vis/NIR Spectroradiometer and NIR Spectroscopy for the Non-Destructive Quality Assay of Different Watermelon Cultivars. Horticulturae 2022, 8, 509. [Google Scholar] [CrossRef]
Hassanpour, A.; Kowsari, Y.; Shahreza, H.O.; Yang, B.; Marcel, S. ChatGPT and biometrics: An assessment of face recognition, gender detection, and age estimation capabilities. In Proceedings of the 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 27–30 October 2024; pp. 3224–3229. [Google Scholar] [CrossRef]
ISO 13299; Sensory Analysis—Methodology—General Guidance for Establishing a Sensory Profile. ISO: Geneva, Switzerland, 2003.
Tunick, M.H.; Onwulata, C.I.; Thomas, A.E.; Phillips, J.G.; Mukhopadhyay, S.; Sheen, S.; Liu, C.-K.; Latona, N.; Pimentel, M.R.; Cooke, P.H. Critical evaluation of crispy and crunchy textures: A review. Int. J. Food Prop. 2013, 16, 949–963. [Google Scholar] [CrossRef]
Liu, Y.; Keefer, H.; Watson, M.; Drake, M. Consumer perception of whole watermelons. J. Food Sci. 2024, 89, 625–639. [Google Scholar] [CrossRef]
Mashilo, J.; Shimelis, H.; Ngwepe, R.M.; Thungo, Z. Genetic analysis of fruit quality traits in sweet watermelon (Citrullus lanatus var. lanatus): A review. Front. Plant Sci. 2022, 13, 834696. [Google Scholar] [CrossRef] [PubMed]
ISO 8589; Sensory Analysis: General Guidance for the Design of Test Rooms. ISO: Geneva, Switzerland, 2007.
du Prel, J.-B.; Röhrig, B.; Hommel, G.; Blettner, M. Choosing statistical tests: Part 12 of a series on evaluation of scientific publications. Dtsch. Ärzteblatt Int. 2010, 107, 343. [Google Scholar] [CrossRef]
Zhang, H.; Shao, H. Exploring the Latest Applications of OpenAI and ChatGPT: An In-Depth Survey. CMES Comput. Model. Eng. Sci. 2024, 138, 2061–2102. [Google Scholar] [CrossRef]
Tzachor, A.; Devare, M.; Richards, C.; Pypers, P.; Ghosh, A.; Koo, J.; Johal, S.; King, B. Author Correction: Large language models and agricultural extension services. Nat. Food 2023, 4, 1112. [Google Scholar] [CrossRef]
Yang, X.; Gao, J.; Xue, W.; Alexandersson, E. Pllama: An open-source large language model for plant science. arXiv 2024, arXiv:2401.01600. [Google Scholar]

Figure 1. ChatGPT flowchart.

Figure 2. ChatGPT prompts and image analysis for the best watermelon selection.

Table 1. Descriptive statistics for expert responses for Case Study I.

	Crisp ₁	Crisp ₂	Sweet ₁	Sweet ₂	Juicy ₁	Juicy ₂	Fresh ₁	Fresh ₂	Avg ₁	Avg ₂
Size	20	20	20	20	20	20	20	20	20	20
Mean	2.85	3.60	2.75	3.60	3.75	4.20	2.75	3.25	3.03	3.66
Std	1.18	1.10	1.21	0.99	1.07	0.77	1.25	1.16	1.07	0.86
Min	1.00	2.00	1.00	2.00	1.00	3.00	1.00	1.00	1.00	2.25
25%	2.00	3.00	2.00	3.00	3.00	4.00	2.00	2.00	2.44	2.94
50%	3.00	4.00	2.00	4.00	4.00	4.00	2.50	3.00	2.75	3.88
75%	3.25	4.25	4.00	4.00	4.25	5.00	4.00	4.00	3.81	4.31
Max	5.00	5.00	5.00	5.00	5.00	5.00	5.00	5.00	5.00	4.75

Table 2. Normality test for Case Study I (Shapiro–Wilk’s W test).

	Shapiro–Wilk’s W Test
Crispiness ₁	0.01
Crispiness ₂	0.09
Sweetness ₁	0.00
Sweetness ₂	0.01
Juiciness ₁	0.00
Juiciness ₂	0.01
Freshness ₁	0.09
Freshness ₂	0.06
Average ₁	0.06
Average ₂	0.55

Table 3. Significant difference test results for Case Study I.

	Wilcoxon Rank Sum Test	Paired t-Test
Crispiness	0.03	N/A
Sweetness	0.03	N/A
Juiciness	0.17	N/A
Freshness	N/A	0.20
Average	N/A	0.04

Table 4. Descriptive statistics for the expert responses for Case Study II.

	Crisp ₁	Crisp ₂	Sweet ₁	Sweet ₂	Juicy ₁	Juicy ₂	Fresh ₁	Fresh ₂	Avg ₁	Avg ₂
Size	39	39	39	39	39	39	39	39	39	39
Mean	3.28	4.05	3.79	3.97	3.97	4.08	3.49	4.08	3.63	4.04
Std	1.07	0.65	0.89	0.78	0.78	0.93	1.23	0.74	0.80	0.55
Min	1.00	3.00	2.00	3.00	3.00	2.00	1.00	2.00	1.75	2.75
25%	3.00	4.00	3.00	3.00	3.00	3.00	3.00	4.00	3.25	3.75
50%	3.00	4.00	4.00	4.00	4.00	4.00	4.00	4.00	3.75	4.00
75%	4.00	4.00	4.00	5.00	5.00	5.00	4.00	5.00	4.13	4.50
Max	5.00	5.00	5.00	5.00	5.00	5.00	5.00	5.00	5.00	5.00

Table 5. Normality test for Case Study II (Shapiro–Wilk’s W test).

	Shapiro–Wilk’s W Test
Crispiness ₁	0.00
Crispiness ₂	0.00
Sweetness ₁	0.00
Sweetness ₂	0.00
Juiciness ₁	0.00
Juiciness ₂	0.00
Freshness ₁	0.00
Freshness ₂	0.00
Average ₁	0.10
Average ₂	0.39

Table 6. Significant difference test results for Case Study II.

	Wilcoxon Rank Sum Test	Paired t-Test
Crispiness	0.00	N/A
Sweetness	0.32	N/A
Juiciness	0.46	N/A
Freshness	0.00	N/A
Average	N/A	0.00

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ozdemir, S. Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study. Horticulturae 2025, 11, 308. https://doi.org/10.3390/horticulturae11030308

AMA Style

Ozdemir S. Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study. Horticulturae. 2025; 11(3):308. https://doi.org/10.3390/horticulturae11030308

Chicago/Turabian Style

Ozdemir, Serkan. 2025. "Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study" Horticulturae 11, no. 3: 308. https://doi.org/10.3390/horticulturae11030308

APA Style

Ozdemir, S. (2025). Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study. Horticulturae, 11(3), 308. https://doi.org/10.3390/horticulturae11030308

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Effectiveness of Generative AI Tool to Determine Fruit Quality: Watermelon Case Study

Abstract

1. Introduction

2. Materials and Methods

2.1. Experimental Materials and Selection Methodology

2.2. Sensory Evaluation

2.3. Data Analysis

3. Results

3.1. Case Study I

3.2. Case Study II

4. Discussion

5. Conclusions

Funding

Data Availability Statement

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI