1. Introduction
Mangoes (Mangifera indica L.) are among the most economically important tropical fruits, widely cultivated and exported across Asia, with Thailand one of the leading producers [1]. Among its many cultivars, Namdokmai Sithong is prized for its sweet flavor, low fiber content, and appealing appearance. The fruit's external attributes, such as shape, size, and skin color, are critical determinants of consumer preference, market price, and compliance with export standards, and consistency in these qualities is essential to maintaining competitiveness in domestic and international markets [2]. Premium-grade Namdokmai Sithong mangoes that meet stringent export requirements are shipped to high-value markets such as Japan, South Korea, and China, while lower-grade fruits are sold domestically at reduced prices [3]. A distinguishing feature of this cultivar is its lack of chlorophyll at maturity, which results in a bright yellow skin; however, this uniform coloration can obscure subtle differences in maturity and internal quality across batches. Consistency in firmness, skin tone, and sweetness, measured by total soluble solids (TSS), is closely tied to consumer satisfaction and export success [2]. Therefore, accurate and consistent assessment of morphological traits is essential for effective quality grading and optimized supply chain operations.
Traditional approaches to fruit morphology assessment predominantly involve manual measurements and visual inspection, processes that are inherently time-consuming, labor-intensive, and prone to human error and subjective judgment [4]. The inconsistency and inefficiency associated with manual grading have driven the advancement of automated systems that leverage image processing and machine learning techniques. These modern approaches offer a more accurate, objective, and scalable solution for evaluating fruit quality across diverse parameters [4,5].
With the rapid advancement of digital technologies, the agricultural sector has increasingly adopted computer vision and artificial intelligence to enhance productivity, accuracy, and automation. Previous studies have explored monocular depth estimation, which allows the volume of food to be estimated from a single image [6,7]. Image-based analysis of mangoes has now become a vital fruit grading tool to evaluate physical features such as size, shape, and surface texture. Extracting these features from mango images is a critical step in automated grading and quality assessment. Image processing techniques measure geometric attributes including length, width, thickness, area, perimeter, roundness, and volume, offering quantifiable indicators for fruit quality classification. Each geometric feature contributes uniquely to automated mango grading and sorting. Length and width are commonly used for estimating fruit size and approximating volume, with reported accuracies between 96% and 98% [8], while area and perimeter play a significant role in mass prediction and size classification, achieving accuracies of 97% and 79%, respectively [9]. Roundness descriptors assist in identifying irregular or misshapen fruits, although this feature typically yields lower accuracy at around 36% [10]. Both 2D and 3D modeling approaches have demonstrated high precision for volume estimation, reaching up to 96% accuracy [11]. To improve grading consistency, geometric features are often integrated with color and texture information in automated systems [12]. Machine learning models that combine multiple image features have been shown to enhance classification accuracy and robustness across diverse mango cultivars and growing conditions [13,14,15,16].
Recent developments in deep learning, particularly in the YOLO (You Only Look Once) object detection framework, have significantly improved the precision and speed of fruit segmentation and feature extraction. YOLO models have been successfully used to first detect mango fruits and then estimate geometric traits such as size, volume, and mass from bounding box dimensions [17,18]. Studies using YOLOv7 and YOLOv8 reported high accuracies, achieving mAP@0.5 scores of up to 99.5%, making them highly suitable for real-time mango detection and field deployment [11,18,19]. Previous research primarily focused on detecting whole fruits, but YOLO's multi-class detection capabilities are well-suited to simultaneously identifying the fruit and the peduncle when appropriate annotations are provided [20]. YOLO thus supports a more granular analysis, enabling the calculation of accurate morphological traits such as length, width, and height based on the spatial relationship between detected regions. Significant progress has been made in applying computer vision and YOLO techniques to automated mango classification, with most studies focusing on detecting whole fruits to estimate size, mass, or external quality parameters. However, these approaches often overlook the inherent asymmetry of mangoes, particularly in shape-based measurements. Previous methods typically used the centroid or bounding box center of the fruit as the origin for axis-based feature extraction, which may not align with the natural anatomical structure, while few studies have incorporated both the peduncle and the fruit in the segmentation process to enhance geometric accuracy.
Object detection and image processing are now widely applied to fruit measurement, but most existing methods focus solely on detecting the entire fruit and rely on simplistic geometric assumptions to calculate length and width. This study addressed these limitations by introducing a YOLOv8-based detection framework that simultaneously identifies the mango fruit and its peduncle, integrated with image processing techniques of edge detection, segmentation, and morphological operations. The spatial relationships between the detected regions were leveraged to enhance the accuracy of the anatomical measurements of the major axis (fruit length) and the top and bottom minor axes (widths). This adaptive approach enhanced geometric precision and improved the reliability of automated grading and classification for commercial mango quality assessment.
2. Methodology
The proposed methodology for automated physical feature extraction of Namdokmai Sithong mangoes consisted of a sequential pipeline integrating object detection, image processing, and geometric analysis, as illustrated in Figure 1. The process began with manual annotation, where domain experts annotated the mango fruit and peduncle using bounding boxes. These annotations formed the foundation for the data preparation stage, with the dataset split into training (224 images), validation, and test sets. The prepared dataset was then used to train the YOLOv8 object detection model, selected for its speed and accuracy in real-time agricultural applications. Once trained, YOLOv8 detected the fruit and peduncle in new input images. The overlapping region between the two bounding boxes was used to identify a reliable anatomical reference point for measuring fruit geometry. Next, the detected fruit regions underwent image processing, including object segmentation, morphological operations, and edge detection, to isolate clean fruit contours and derive geometric measurements such as the major axis (fruit length) and two minor axes (top and bottom widths) based on the spatial relationship from the detected starting point. In the value extraction stage, the computed measurements were exported into structured formats for analysis. The results were then evaluated against human-annotated ground truth measurements, with percentage errors calculated to verify performance. Finally, the complete pipeline was compiled into a standalone executable program, enabling users to perform feature extraction in a user-friendly environment without requiring deep technical expertise. This tool facilitates efficient mango quality assessment and supports future deployment in commercial grading systems.
2.1. Dataset Preparation and Annotation
The dataset was created by capturing images of Namdokmai Sithong mangoes under controlled laboratory conditions. Each image was manually annotated by drawing rectangular bounding boxes around the mango fruit and the peduncle. Annotations were saved in the YOLO format, which is compatible with YOLOv8 training. A total of 244 images were annotated and divided into three subsets: 224 images for training, 10 for validation, and 10 for testing. The boundary of the fruit was defined by a blue bounding box and the peduncle by a green bounding box, as shown in Figure 2 and Figure 3.
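In the YOLO label format, each annotated object occupies one line as class_id x_center y_center width height, with coordinates normalized by the image width and height. As a purely illustrative example (the values below are not taken from the actual dataset), a label file for an image containing one fruit (class 0) and one peduncle (class 1) could read:

```text
0 0.512 0.586 0.620 0.710
1 0.498 0.142 0.085 0.160
```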
2.2. Object Detection with YOLOv8
YOLOv8, a state-of-the-art one-stage object detection framework developed by Ultralytics, was employed to detect the mango fruit and its peduncle. YOLOv8 simultaneously performs classification and localization in a single forward pass, offering an optimal balance between speed and accuracy, and is particularly well-suited for real-time applications in agricultural environments [21]. Five YOLOv8 model variants are available (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x). In this study, the nano version (YOLOv8n) was selected for its computational efficiency, fast inference speed, and sufficient accuracy for the requirements of this task [22,23]. YOLOv8 comprises four core components: the Input layer, Backbone (for hierarchical feature extraction), Neck (for multi-scale feature aggregation), and Head (for final detection outputs). Structural elements such as Conv, C2f, SPPF, and Detect blocks are integrated within these modules, as depicted in Figure 4. The model was trained on a custom dataset of Namdokmai Sithong mango images, manually annotated with bounding boxes for the fruit and the peduncle. After training, YOLOv8n demonstrated high precision in detecting these regions. The overlapping area between the detected fruit and peduncle bounding boxes was computed and used to estimate the stem-end location. This anatomically meaningful point served as the origin for projecting the major axis (fruit length) and the two minor axes (top and bottom widths), enabling consistent geometric feature extraction for downstream processing.
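As a minimal sketch (not the exact training configuration used in this study), the Ultralytics API can fine-tune YOLOv8n on such a two-class dataset and return the fruit and peduncle boxes for a new image; the dataset YAML name, epoch count, and image size below are assumptions:

```python
from ultralytics import YOLO

# Fine-tune COCO-pretrained nano weights on the two-class mango dataset
# (hypothetical "mango.yaml" describing the train/val/test splits and classes).
model = YOLO("yolov8n.pt")
model.train(data="mango.yaml", epochs=100, imgsz=640)

# Detect the fruit and peduncle in a new image.
results = model("mango_sample.jpg")
for box in results[0].boxes:
    cls_id = int(box.cls[0])               # 0 = fruit, 1 = peduncle in this annotation scheme
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # bounding box corners in pixels
    print(cls_id, (x1, y1, x2, y2), float(box.conf[0]))
```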
2.3. Image Processing and Feature Extraction
This stage focused on preparing a clean, segmented representation of the mango for accurate geometric feature extraction. The pipeline included object segmentation, morphological operations, and edge detection. The mango fruits were initially segmented by thresholding techniques using global, adaptive, and Otsu's thresholding [24]. These methods performed inconsistently due to the high color variation in the mango skin and their dependence on grayscale conversion. To address this, a range-based segmentation approach was adopted in the HSV (Hue, Saturation, Value) color space, which represents color in a way that aligns more closely with human perception. Because fruit color varies with maturity, segmentation targeted at the white background yielded more stable and reproducible results. Following segmentation, morphological operations, including opening and closing, were applied to clean the binary mask. These steps removed small noise artifacts and filled gaps, producing a smooth and continuous fruit boundary.
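A minimal sketch of this background-focused, range-based segmentation with OpenCV is shown below; the HSV bounds for the white backdrop are illustrative and would need tuning to the actual acquisition setup:

```python
import cv2
import numpy as np

img = cv2.imread("mango_sample.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# The white background has low saturation and high value; thresholds are illustrative.
lower_bg = np.array([0, 0, 180])
upper_bg = np.array([179, 40, 255])
bg_mask = cv2.inRange(hsv, lower_bg, upper_bg)

# Invert the background mask to obtain the fruit region.
fruit_mask = cv2.bitwise_not(bg_mask)
```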
2.4. Object Detection
The YOLOv8 object detection model was employed to detect the mango fruit and its peduncle and thereby automate the identification of a reliable anatomical reference point. The model was trained on a custom-annotated dataset containing bounding boxes for both objects. YOLOv8 generated bounding boxes for each region in the image. The centroid of the overlapping area between these boxes was computed and defined as the starting point for geometric measurement. From this starting point, candidate lines were projected at 0.25-degree intervals in all directions until they intersected the boundary of the mango fruit. The longest line was selected as the major axis, representing the length of the fruit (Figure 5). Once the major axis was established, the minor axes were determined by iteratively projecting perpendicular lines from the midpoint of the major axis. The longest lines intersecting the mango boundary above and below this midpoint were selected as the top and bottom minor axes, representing the width of the fruit. This process is illustrated in Figure 6.
All computations, from detection to axis projection and measurement, were fully automated using Python. The extracted features, including axis coordinates and lengths, were exported to a .csv file, making them readily available for further analysis or integration into automated grading systems.
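The sketch below illustrates this axis-projection idea under simplifying assumptions (it is not the authors' exact implementation): a binary fruit mask and a starting point are assumed as inputs, rays are marched at 0.25-degree intervals, and the longest ray is exported as the major axis.

```python
import csv
import numpy as np

def ray_length(mask, start, angle_deg, step=1.0):
    """Distance from `start` to the fruit boundary along one direction (pixels)."""
    h, w = mask.shape
    dx, dy = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    x, y = start
    dist = 0.0
    while 0 <= int(round(x)) < w and 0 <= int(round(y)) < h and mask[int(round(y)), int(round(x))] > 0:
        x += dx * step
        y += dy * step
        dist += step
    return dist

def major_axis(mask, start):
    """Longest ray from `start`, sampled at 0.25-degree intervals."""
    angles = np.arange(0.0, 360.0, 0.25)
    lengths = [ray_length(mask, start, a) for a in angles]
    best = int(np.argmax(lengths))
    return angles[best], lengths[best]

# Synthetic example (a circular mask); real inputs come from the segmentation
# and detection stages.
yy, xx = np.mgrid[0:200, 0:200]
fruit_mask = ((xx - 100) ** 2 + (yy - 100) ** 2 < 80 ** 2).astype(np.uint8) * 255
angle, length_px = major_axis(fruit_mask, (100, 30))

# Export the measurement, mirroring the pipeline's .csv output.
with open("measurements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["major_axis_angle_deg", "major_axis_length_px"])
    writer.writerow([angle, length_px])
```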
2.5. Evaluation and Program Development
To assess the accuracy of the feature extraction process, the output values (major and minor axis lengths and coordinates) were compared with expert-annotated ground truth data. The annotations were generated using ImageJ (version 1.54p), an open-source image analysis software commonly used in biomedical and agricultural research for its precision and ease of use [25]. The percentage error between the automated measurements and expert annotations was calculated, and when an acceptable threshold (below 5%) was achieved, the system was deemed reliable for practical use.
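The exact error formula is not stated here; a standard relative percentage error, consistent with the comparison against the ImageJ ground truth, would be

\[ \text{Error}\,(\%) = \frac{|C - H|}{H} \times 100, \]

where \(H\) denotes the expert (human) measurement and \(C\) the corresponding computer-generated value.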
Next, a graphical user interface (GUI) was developed to streamline the feature extraction process for non-technical users. The application was then compiled into an executable program to facilitate ease of use and deployment in research and production environments.
3. Experimental Results and Discussion
Experiments were conducted to identify the most effective approach for automated data extraction of Namdokmai Sithong mango fruit features. These experiments focused on testing various image preprocessing methods, edge detection algorithms, and object detection strategies. The ultimate goal was to develop a robust pipeline capable of accurately extracting the geometric features of the mango fruit with minimal human intervention.
3.1. Edge Detection Techniques
Edge detection is a fundamental step in the image processing pipeline, particularly for isolating object boundaries. Since this study analyzed only the mango fruit geometry, texture and surface-level details were considered unnecessary and potentially distracting. The edge detection method therefore needed to prioritize strong, continuous external boundaries while minimizing noise from internal features. Four edge detection algorithms were evaluated: Canny, holistically nested edge detection, Roberts cross, and the Sobel operator. Their performances were compared using identical input images under consistent conditions.
The Canny edge detection method, shown in Figure 7, yielded sharp and continuous outlines of the mango perimeter, even in the presence of minor blurring or noise. Canny detection was chosen for its balance between sensitivity and specificity in capturing only the necessary boundary features. Small artifacts visible in the figure, such as blemishes on the mango surface, were also detected, introducing minor noise that required additional filtering.
Holistically nested edge detection (HED), as shown in Figure 8, produced clean boundary detection results with minimal internal detail. However, the HED model was computationally more expensive and lacked the sharpness of the Canny outputs. Its performance was acceptable, but edge continuity was occasionally inconsistent, particularly near the stem area.
The result of the Roberts operator is presented in Figure 9. It provided some level of edge detection, but the output lacked sufficient contrast and resolution, making it difficult to delineate the full mango contour reliably. The results were especially poor around curved regions, which are critical for axis estimation.
Figure 10 illustrates the Sobel operator output. The outer contour was somewhat visible, but the result included excessive internal gradients caused by illumination variations, leading to considerable noise that interfered with the geometric analysis. This operator was therefore deemed unsuitable.
Based on these comparative results, the Canny edge detector was selected for further experiments. Although some sensitivity to noise was observed, its effectiveness in capturing well-defined mango contours outweighed this drawback. To mitigate the remaining imperfections, subsequent image processing steps such as background masking and morphological filtering were applied before major axis and feature extraction.
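A minimal sketch of this step with OpenCV follows; the Gaussian kernel size and hysteresis thresholds are illustrative values, not those tuned in this study:

```python
import cv2

img = cv2.imread("mango_sample.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)  # suppress surface blemishes before edge detection
edges = cv2.Canny(blurred, 50, 150)          # lower/upper hysteresis thresholds
```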
3.2. Image Segmentation Methods
Several image segmentation techniques were also evaluated to achieve accurate fruit boundary detection and determine the most effective method for isolating the mango region from its background. The goal was to produce a clean and consistent binary mask for subsequent geometric feature extraction. Four commonly used approaches were tested: global thresholding, Otsu's method, color-based segmentation, and background-based range thresholding.
Global thresholding, as shown in Figure 11, was the most basic technique employed. This method requires manual selection of a fixed intensity threshold to differentiate the foreground from the background. However, due to its dependency on grayscale input and sensitivity to lighting variations, the results lacked consistency across samples with different exposure levels and mango coloration.
Otsu's thresholding (Figure 12) automatically determines an optimal global threshold by maximizing between-class variance, producing more consistent segmentation than manual thresholding. Nonetheless, its effectiveness was limited by grayscale constraints and the variable surface reflectance of the mangoes, causing over-segmentation in bright regions.
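For reference, a minimal sketch of Otsu's method as tested here: OpenCV selects the threshold that maximizes between-class variance when the THRESH_OTSU flag is set (the supplied threshold value of 0 is ignored).

```python
import cv2

img = cv2.imread("mango_sample.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
otsu_value, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```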
Color-based thresholding in the HSV color space was examined to overcome these grayscale segmentation limitations. Figure 13 illustrates the segmentation results based on mango skin tone. This approach offered better discrimination by leveraging hue and saturation values, but the inherent variability in mango coloration, ranging from pale yellow to rich golden, made it difficult to define a robust threshold range applicable across samples.
Therefore, the most reliable approach was range-based thresholding focused on the background color, specifically the white backdrop used during image acquisition. This method consistently produced high-quality masks with minimal noise (Figure 14).
Among the four segmentation techniques evaluated (global thresholding, Otsu's method, color-based segmentation, and background-based range thresholding), the background-based method showed the highest robustness and consistency. This approach produced cleaner segmentation masks with minimal artifacts by leveraging the white background used during image acquisition, even under variable lighting conditions and mango surface textures. Background-based range thresholding was therefore selected as the default segmentation strategy in the final image processing pipeline due to its effectiveness in isolating the mango fruit boundary and supporting accurate morphological feature extraction.
3.3. Morphological Processing
Following threshold-based segmentation, morphological operations were applied to refine the resulting binary masks. As shown in Figure 15, segmentation outputs often contained imperfections such as small holes, noise artifacts, or incomplete object regions, especially when the mango surface intensity varied across the fruit body. To address this, a combination of morphological opening and closing was performed. Opening removed small noise components erroneously classified as foreground, while closing filled gaps and smoothed the object's edge contour. This step was essential to produce a clean, continuous fruit boundary suitable for geometric feature extraction and axis computation.
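A minimal sketch of this clean-up step is given below; the binary mask is assumed to come from the segmentation stage (loaded here from a hypothetical file), and the 7 x 7 kernel size is illustrative:

```python
import cv2
import numpy as np

fruit_mask = cv2.imread("fruit_mask.png", cv2.IMREAD_GRAYSCALE)  # binary mask from segmentation
kernel = np.ones((7, 7), np.uint8)
opened = cv2.morphologyEx(fruit_mask, cv2.MORPH_OPEN, kernel)    # remove small foreground noise
cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)      # fill holes and smooth the contour
```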
3.4. YOLOv8-Based Object Detection for Geometric Analysis
Accurate identification of the starting point is crucial for extracting reliable geometric features from mango images, particularly the major axis (fruit length) and the two minor axes (top and bottom widths). In the early phase of this study, two preliminary methods, manual selection and centroid-based selection, were explored.
The first method involved manually marking the stem-end of the mango as the starting point (Figure 16). While intuitive, this approach was subjective and impractical for large-scale applications due to its reliance on human input. The second method used the centroid of the segmented fruit mask (Figure 17), offering a fully automated alternative. However, this technique assumed a symmetrical fruit shape, which often misaligned with the true anatomical structure of Namdokmai Sithong mangoes, especially at the stem-end, resulting in inaccurate axis construction.
To address these limitations, an automated method using YOLOv8 object detection was developed, with the model trained to simultaneously detect two classes: the mango fruit and its peduncle. After detection, the bounding boxes of these regions were used to calculate their overlapping area, and the centroid of this overlap was defined as the starting point for geometric measurements (Figure 18). This anatomically meaningful point was then used to project radial lines at 0.25-degree intervals across the boundary of the fruit, with the longest intersecting line selected as the major axis. Perpendicular lines were iteratively projected from the midpoint of this axis to determine the top and bottom minor axes, representing the maximum width of the fruit.
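A short sketch of how this starting point can be derived from the two detected boxes (coordinates in (x1, y1, x2, y2) form, as returned by YOLOv8) is shown below; this is an illustrative formulation rather than the authors' exact code:

```python
def overlap_centroid(fruit_box, peduncle_box):
    """Centre of the intersection of two (x1, y1, x2, y2) boxes, or None if disjoint."""
    x1 = max(fruit_box[0], peduncle_box[0])
    y1 = max(fruit_box[1], peduncle_box[1])
    x2 = min(fruit_box[2], peduncle_box[2])
    y2 = min(fruit_box[3], peduncle_box[3])
    if x2 <= x1 or y2 <= y1:
        return None  # the boxes do not overlap
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
```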
Among the three methods tested, the YOLOv8-based approach yielded the most accurate and consistent results, offering full automation with minimized human bias and aligning closely with the true fruit anatomy. This method was adopted as the core technique for downstream physical feature extraction.
3.5. Results and Discussion
Experiments were conducted to evaluate the performance of the developed pipeline for automated geometric feature extraction using 30 test images of Namdokmai Sithong mangoes. The proposed application integrated several components including YOLOv8-based object detection, HSV-based image segmentation, morphological filtering, and Canny edge detection in a fully automated Python program. The output measurements for the major axis (fruit length) and the two minor axes (top and bottom widths) were saved in .csv format for downstream analysis.
YOLOv8 was used to detect the mango body and its peduncle. The centroid of the overlapping region between their bounding boxes was defined as an anatomically meaningful starting point from which radial lines were projected to identify the longest path representing the major axis, while image processing techniques were applied to refine the boundary of the fruit. Perpendicular lines from the midpoint of the major axis were then used to determine the top and bottom widths.
Table 1 presents a comparative analysis between human-labeled (H) and computer-generated (C) measurements for the major and minor axes, along with their percentage errors. Human annotations were performed by experts using ImageJ software to ensure high-precision, pixel-level ground truth data, and the automated feature extraction pipeline consistently achieved high accuracy. Errors in major axis measurements were generally below 1.5%, while higher deviations were observed for the minor axes. These differences were more evident in images with segmentation noise or irregular fruit shapes. Sample 17 exhibited the highest deviations across all three measurements (Longest: 4.87%, Top: 4.12%, Bottom: 4.60%), likely due to an unusual fruit shape or misaligned bounding boxes. By contrast, Samples 5, 6, and 9 showed near-zero errors, highlighting the system’s reliability under optimal imaging conditions.
Over 90% of the tested samples demonstrated an error margin within 5% across all axes, confirming the robustness, repeatability, and real-world applicability of our proposed approach, and supporting the system's potential for use in automated fruit grading workflows requiring high throughput with minimal human input.
Figure 19 illustrates the application interface displaying the measurement results with clearly marked geometric features, reinforcing its suitability for real-time or offline mango quality assessment tasks. Additional quantitative error analysis for the 30 test images, including percentage error metrics and outlier identification, is presented in Table S1. This analysis reinforces the high accuracy and robustness of the proposed method for both major and minor axis measurements.
3.6. Limitations and Future Studies
The proposed pipeline, combining YOLOv8 object detection with HSV-based segmentation, morphological operations, and Canny edge detection, demonstrated reliable performance for geometric feature extraction, but several limitations remain. First, the pipeline was optimized for images captured against a controlled white background. This setting enhanced segmentation reliability, but the model would face challenges in field environments where lighting conditions, background textures, and occlusions vary significantly. Future studies should focus on improving robustness under real-world conditions, potentially through background augmentation or domain adaptation techniques. Second, the automated measurements of the major and minor axes achieved high accuracy, with over 90% of results falling within a 5% error margin, but deviations were observed in some cases, particularly with irregular fruit shapes or bounding box misalignment at the peduncle region (e.g., Sample 17). Future improvements should integrate shape-aware models or deformable contour fitting to better accommodate these morphological variations. Third, the system was limited to geometric features (length and width); other critical quality attributes such as skin texture, color uniformity, and visible defects were not considered. Expanding the system to include additional features, possibly through multispectral imaging or deep learning-based classification, would further improve grading accuracy and versatility. Fourth, converting the current pixel-based measurements into real-world physical units such as centimeters requires prior calibration using known camera parameters and a fixed camera-to-object distance, and variations in these parameters can significantly affect measurement accuracy. Future work should therefore consider integrating depth sensors, reference scaling markers, or camera calibration techniques to ensure reliable dimensional estimation. Additionally, a detailed statistical summary of measurement errors is provided in Table S1, which shows that average errors remain below 1.5% across all axes. These results further confirm the pipeline's consistency, although challenges remain for certain irregular fruit shapes and cases involving bounding box misalignment. Lastly, the current version was implemented as a standalone desktop application. Future developments should focus on a cloud-enabled or mobile application, allowing real-time deployment, streamlined data management, and wider accessibility for agricultural stakeholders including farmers, exporters, and inspectors. In addition, the system is currently tailored to Namdokmai Sithong mangoes, and its generalizability to other cultivars has not yet been evaluated. To enhance the robustness and applicability of the proposed method in broader agricultural contexts, future research should explore cross-cultivar validation using diverse mango varieties.
4. Conclusions
This study proposed a fully automated framework for geometric feature extraction of Namdokmai Sithong mangoes, integrating YOLOv8-based object detection with image processing techniques, including HSV-based segmentation, morphological filtering, and Canny edge detection, to accurately measure key physical traits, specifically the major (length) and minor (width) axes, without human intervention. Three methods for identifying the anatomical starting point were explored: manual selection, centroid-based approximation, and object detection. Experimental comparisons showed that the YOLOv8-based approach delivered the most consistent and anatomically accurate results. By leveraging the intersection of the mango and peduncle bounding boxes, the system established a reliable starting point for projecting measurement axes. The model was validated on 30 mango samples, and the automatically extracted features were compared with expert annotations. Results demonstrated good agreement, with over 90% of the samples achieving error margins within 5%. The average error for the major axis was below 1.5%, while higher deviations in minor axis measurements were attributed to segmentation artifacts or shape irregularities. The proposed pipeline offers a robust, scalable, and user-friendly solution for non-destructive fruit quality assessment, with the potential for broader agricultural applications including extension to other fruit varieties and real-time deployment in commercial grading environments.