Predicting De-Handing Point in Bananas Using Crown Morphology and Interpretable Machine Learning

Zhao, Lei; Yang, Zhou; Wang, Chunxia; Jin, Mohui; Duan, Jieli

doi:10.3390/agronomy15081880


by Lei Zhao ¹,², Zhou Yang ²,³,*, Chunxia Wang ¹, Mohui Jin ² and Jieli Duan ²,*

¹ Sichuan Academy of Agricultural Machinery Sciences, Chengdu 610066, China

² College of Engineering, South China Agricultural University, Guangzhou 510642, China

³ School of Mechanical Engineering, Guangdong Ocean University, Zhanjiang 524088, China

* Authors to whom correspondence should be addressed.

Agronomy 2025, 15(8), 1880; https://doi.org/10.3390/agronomy15081880

Submission received: 30 June 2025 / Revised: 30 July 2025 / Accepted: 1 August 2025 / Published: 3 August 2025

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Banana de-handing is a critical yet labor-intensive step in postharvest processing, with current manual methods resulting in high costs and occupational risks. This study addresses the automation of de-handing point localization by integrating high-resolution 3D scanning and morphometric analysis of banana crowns with machine learning techniques. A total of 210 crown samples were analyzed to extract key morphological features, including inner arc length (L_i), inner arc radius (R_i), outer arc radius (R_o), and the distance between inner and outer arcs (D_oi), among others. Four machine learning algorithms, namely, Multi-Layer Perceptron (MLP), Gradient Boosted Decision Trees (GBDT), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), were developed to predict the target radius (R_t) and target distance (D_ti) of the de-handing point. The RF models achieved the optimal predictive performance on the testing set, with the following results: for R_t, R² = 0.95, MAE = 1.50, and RMSE = 1.94; for D_ti, R² = 0.91, MAE = 1.33, and RMSE = 1.66. A Shapley Additive Explanations (SHAP) analysis revealed that L_i, R_i, and R_o were the most influential features for R_t, while D_oi was the most important for D_ti. Notably, feature threshold effects were observed, with limited gains in prediction accuracy beyond specific morphological values. These results provide a quantitative foundation for vision-guided automated de-handing systems, advancing intelligent and efficient banana postharvest management.

Keywords: banana crown; de-handing point; morphological feature; machine learning; SHAP analysis

1. Introduction

Bananas are globally significant tropical fruits, with annual production that exceeds 170 million metric tons and cultivation occurring in more than 140 countries [1]. They function not only as a staple food for more than 400 million smallholder farmers but also as a principal source of their income. However, the post-harvest processing of bananas is crucial for preserving fruit quality and sustaining commercial value. Among these processes, de-handing plays a pivotal role in determining the efficiency and standards of subsequent handling, transportation, and sales [2]. Traditional manual de-handing requires a considerable amount of skilled labor, resulting in elevated operational costs and a heightened risk of musculoskeletal disorders [3]. This situation highlights the urgent necessity for automated alternatives.

Although initial studies on mechanical de-handing prototypes have been conducted [4,5], automated de-handing systems that utilize crown morphology remain largely unexplored in the academic literature. Developing such automated solutions is critical for enhancing harvesting efficiency, reducing labor dependence, and improving occupational safety in commercial banana production [6].

Unprocessed banana bunches typically consist of seven hands that are radially distributed and helically arranged around a central stalk [7], as shown in Figure 1a. The crown, serving as a vital connective structure between the individual hands and the central stalk, exhibits distinct morphological gradients from the apical to the basal positions (Figure 1b). These positional morphological differences directly influence the optimal parameters for de-handing, underscoring the need for quantitative characterization to enable automated de-handing implementation.

A previous study anatomically divided the crown into three distinct regions: the central stalk–crown transition region (CSCTR), the crown expansion region (CER), and the crown–finger transition region (CFTR). Among these, the CER has been identified as the biomechanically optimal region for de-handing (Figure 1c) [8].

To standardize the evaluation process, this study designates the midpoint of the CER as the target de-handing point and introduces two novel morphological descriptors: the target radius (R_t), which defines the transverse location, and the target distance (D_ti), which quantifies the longitudinal position relative to the central stalk.

Recent advances in agricultural machine learning have primarily concentrated on yield forecasting for staple crops, including winter wheat [9], maize [10,11], and rice [12]. Concurrently, automated fruit harvesting has emerged as a pivotal research area, with computer vision-based machine learning increasingly employed for localizing fruit picking points. Current automated harvesting research is mainly directed toward fruits with standardized morphological characteristics, including spherical apples [13], ovoid citrus [14], and ellipsoidal tomatoes [15]. Computer vision-based machine learning systems have demonstrated robust performance in picking point localization for such geometrically consistent crops by leveraging morphological parametric descriptors [16,17,18]. However, the pronounced morphological differences of banana crowns, which are characterized by transverse expansion and longitudinal differentiation from apical to basal positions (Figure 1b), preclude the direct application of existing methods to banana de-handing point localization. Therefore, dedicated models capable of learning morphometric variation in irregular plant structures are required for de-handing point localization in bananas.

In this context, machine learning algorithms have demonstrated considerable potential in crop morphometry and growth-related prediction tasks. For example, a Multi-Layer Perceptron (MLP) achieved a coefficient of determination (R²) of 0.93 when predicting the crown diameter and leaf area index of pomegranate under conditions of salinity and drought stress, highlighting its ability to model nonlinear interactions between morphological traits and environmental factors [19]. Gradient Boosted Decision Trees (GBDT) yielded an R² of 0.88 in maize leaf age prediction by integrating UAV-derived RGB and multispectral data [20]. Extreme Gradient Boosting (XGBoost) outperformed conventional regression models in sunflower height estimation, achieving a root mean square error (RMSE) of 8.73 cm by leveraging its efficient feature selection capabilities with high-dimensional sentinel satellite data [21]. Random Forest (RF) has also been validated for crop growth analysis, successfully predicting yield based on climatic and biophysical variables and exhibiting strong interpretability and resilience to noise within agricultural datasets [22]. Collectively, these studies justify the selection of MLP, GBDT, XGBoost, and RF for modeling the complex morphological characteristics of banana crowns, which demand algorithms that are tolerant to structural heterogeneity and sensor-derived noise.

Most existing automated harvesting studies concentrate on fruits with standardized geometries, whereas banana crowns exhibit intricate transverse expansion and longitudinal differentiation. To overcome these challenges, this study introduces a comprehensive methodological framework. It proposes standardized morphological descriptors by defining R_t and D_ti as quantitative indices for localizing de-handing points and develops a three-dimensional morphological analysis protocol using structured-light scanning to enable the non-destructive digitization of crown geometry. Additionally, the framework constructs predictive models by adapting four machine learning algorithms—MLP, GBDT, XGBoost, and RF—to estimate de-handing points based on morphological features and employs interpretable machine learning techniques, specifically Shapley Additive Explanations (SHAP), to assess the relative importance of features in predicting R_t and D_ti. This systematic approach establishes a quantitative basis for vision-guided automated de-handing and promotes the adoption of agricultural robotics in banana postharvest handling. Furthermore, the proposed methodology can be extended to cutting point identification in the automated harvesting of other morphologically complex crops.

2. Materials and Methods

2.1. A Systematic Workflow for Modeling Banana Crown De-Handing

This study presents a structured workflow for developing interpretable machine learning models to locate the banana crown de-handing point (R_t and D_ti) using morphological feature data, as shown in Figure 2.

The workflow encompasses four sequential stages. First, data collection involves the standardized preparation of banana crown samples, extraction and measurement of key morphological features, and random division into training (85%) and testing (15%) sets. Second, model development involves training models to predict the de-handing points using four machine learning algorithms (MLP, GBDT, XGBoost, and RF). Hyperparameter tuning was performed using grid search combined with 5-fold cross-validation, and model performance was evaluated using the R², RMSE, and mean absolute error (MAE). Third, model selection entails identifying the optimal de-handing point models through a comparative analysis of performance metrics. Fourth, model interpretation examines the decision mechanisms of the selected models using SHAP values to quantify feature contributions and improve interpretability.

This integrated workflow ensures rigorous progression from data collection to interpretable model deployment, offering a systematic foundation for machine learning-based research on banana crown de-handing. Moreover, the developed methodology is transferable to identifying the cutting point in the automated harvesting of other morphologically complex crops.

2.2. Banana Crown Morphology Data Collection

2.2.1. Crown Sample Preparation

Fresh banana bunches at the green stage (approximately 110 days after anthesis) were obtained from the Tianping Wholesale Fruit Market in Guangzhou, China. All bunches were confirmed to be of the Brazilian cultivar (Musa acuminata, AAA group). After acquisition, the samples were promptly transported to the laboratory for subsequent experiments, including the extraction and measurement of crown morphological characteristics.

The complex three-dimensional architecture of banana crowns, characterized by intricate morphological features such as overlapping fingers and irregular curvature, posed considerable technical challenges for precise analysis. To address these challenges, high-resolution sensing technologies were required to capture fine structural details, which were essential for tasks including geometric modeling, crown segmentation, and robotic manipulation—critical steps in the development of automated postharvest solutions.

Vision-based sensing technologies, including RGB-D cameras and structured-light three-dimensional scanners, have commonly been employed in agricultural automation. RGB-D cameras, such as the Intel RealSense D435(i), can provide cost-effective and real-time data acquisition capabilities. However, these devices have exhibited several limitations, such as low spatial resolution, susceptibility to ambient light, and depth noise that increased quadratically with distance, often resulting in substantial root mean square errors at extended ranges [23]. These limitations reduce their suitability for high-precision tasks, such as the accurate reconstruction of banana crowns, where even minor measurement errors could propagate into significant discrepancies in downstream applications.

In contrast, structured-light three-dimensional scanners, such as the Go!SCAN 50 (Creaform Inc., Lévis, QC, Canada) employed in this study, offer high precision, with a spatial resolution of 0.5 mm and a volumetric accuracy of 0.30 mm/m (Table 1). These characteristics make them suitable for reconstructing the intricate surfaces of banana crowns. The high accuracy of these scanners facilitates the generation of low-error point clouds that support the modeling of complex morphological structures, which are critical for automated de-handing applications. Their effectiveness has previously been validated in agricultural contexts [24], confirming their suitability for addressing the structural challenges associated with banana crown geometry.

To perform non-destructive crown reconstruction, 30 banana bunches—each comprising 7 intact crowns—were scanned, resulting in a dataset of 210 crown samples for morphological analysis. The key specifications of the Go!SCAN 50 scanner, including its spatial resolution and volumetric accuracy, are listed in Table 1.

Prior to scanning, fiducial markers were strategically affixed to the surfaces of the crowns to enhance spatial registration accuracy (Figure 3a) [25]. During scanning, the crown samples were immobilized while the handheld scanner was dynamically repositioned to ensure comprehensive surface coverage (Figure 3b). Raw point cloud data were acquired using the manufacturer-provided VXelements3D software (Figure 3c).

The acquired raw point clouds were preprocessed to remove artifacts, including peripheral noise, spike anomalies, and non-manifold geometries. This preprocessing was conducted using Geomagic Wrap software (Version 2021, 3D Systems Inc., Rock Hill, SC, USA) [26]. The complete workflow is illustrated in Figure S1.

2.2.2. Extraction and Measurement of Crown Morphological Features

Crown morphometric analysis was conducted using Geomagic Design X software (Version 2019, 3D Systems Inc., Rock Hill, SC, USA). Due to the unique structural complexity of the crown, a circle-fitting method was employed to extract and quantify its morphological features [27]. The transverse and longitudinal morphological parameters were derived through a series of sequential analytical steps.

Boundary Arc Identification

Inner arc: Defined as the transitional interface between the central stalk and the CSCTR.

Outer arc: Defined at the transitional interface between the CFTR and the fingers.

These boundary arcs were extracted from the crown’s basal plane (Figure 4a) and served as fiducial references for morphological quantification.

Transverse Morphology Parameters

Six key descriptors were extracted: inner arc radius (R_i), inner arc central angle (α_i), inner arc length (L_i), outer arc radius (R_o), outer arc central angle (α_o), and outer arc length (L_o), as illustrated in Figure 4a.

Longitudinal Morphology Quantification

The centers of the inner and outer arcs typically do not coincide. To ensure measurement consistency, the distance between the midpoints of the two arcs was defined as the distance between the inner and outer arcs (D_oi), as shown in Figure 4a. The parameter D_oi was used to evaluate the longitudinal growth morphology of the crown.

This process enabled the systematic extraction and measurement of the morphological features of both the inner and outer arcs of the crown.

The CER, identified as the optimal de-handing region, was quantified using apex-oriented measurements due to its distinct morphological characteristics at the crown apex (Figure 4b). To enable systematic evaluation in subsequent machine learning models, the midpoint of the de-handing region was operationally defined as the target de-handing point. The corresponding circumferential profile at this point was designated as the target arc, with its radius measured as R_t. The longitudinal displacement between the inner arc and the target arc was defined as D_ti. Both parameters were determined using the circle-fitting method implemented in Geomagic Design X software (Version 2019, 3D Systems Inc., Rock Hill, SC, USA).
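The circle-fitting step can be illustrated with a short numerical sketch. The fitting algorithm used internally by Geomagic Design X is not described in this article, so the algebraic least-squares (Kåsa) fit below is an assumed stand-in, and the function names and synthetic arc points are hypothetical. The sketch recovers an arc radius and central angle from ordered boundary points and derives the arc length as radius × central angle (in radians), which is how R_i, α_i, and L_i are related geometrically.

```python
import numpy as np

def fit_circle(points):
    """Algebraic least-squares (Kasa) circle fit to an (N, 2) array of points."""
    x, y = points[:, 0], points[:, 1]
    # Linearised circle equation: x^2 + y^2 = 2*a*x + 2*b*y + c,
    # with centre (a, b) and c = r^2 - a^2 - b^2.
    design = np.column_stack([2.0 * x, 2.0 * y, np.ones_like(x)])
    rhs = x ** 2 + y ** 2
    (a, b, c), *_ = np.linalg.lstsq(design, rhs, rcond=None)
    return np.array([a, b]), float(np.sqrt(c + a ** 2 + b ** 2))

def arc_descriptors(points):
    """Radius, central angle (rad) and arc length for ordered points on one arc."""
    centre, radius = fit_circle(points)
    # Polar angle of every point about the fitted centre; unwrap removes 2*pi jumps.
    theta = np.unwrap(np.arctan2(points[:, 1] - centre[1], points[:, 0] - centre[0]))
    alpha = float(abs(theta[-1] - theta[0]))
    return radius, alpha, radius * alpha

# Synthetic, noise-free arc (hypothetical values, not measured data):
# radius 40 mm, centre (5, -3), central angle 2.0 rad.
angles = np.linspace(0.2, 2.2, 50)
demo_points = np.column_stack([5.0 + 40.0 * np.cos(angles), -3.0 + 40.0 * np.sin(angles)])
radius, alpha, length = arc_descriptors(demo_points)
print(f"radius={radius:.2f} mm, angle={alpha:.2f} rad, arc length={length:.2f} mm")
```

On real scan data the boundary points are noisy, so a geometric (orthogonal-distance) fit may be preferable to the algebraic fit shown here.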

2.3. Machine Learning Algorithms and Hyperparameter Tuning

Four supervised machine learning algorithms were evaluated to predict the crown de-handing point parameters, namely R_t and D_ti. The algorithms included MLP, GBDT, XGBoost, and RF. These models were selected due to their demonstrated capability in modeling nonlinear relationships among continuous morphological features and their documented success in a variety of agricultural prediction tasks [28].

MLP is a type of feedforward artificial neural network employed in this study to approximate highly nonlinear functions through multiple layered transformations. It is particularly effective in modeling complex feature interactions and multi-dimensional dependencies within structured morphometric datasets and offers adaptability through its configurable architecture.

GBDT is an ensemble-based algorithm that constructs Decision Trees sequentially, with each successive tree designed to reduce the residual error of its predecessor. This iterative process enables the model to represent complex, non-additive feature relationships, making it well-suited for structured tabular data where capturing intricate patterns and determining feature importance are essential.
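The sequential residual-fitting idea can be sketched in a few lines. The example below is a simplified illustration, not the Scikit-learn implementation used in the study: it boosts depth-1 trees (stumps) on a single synthetic feature under squared-error loss, and all names and data are hypothetical. Each new stump is fitted to the residuals left by the current ensemble, and its output is scaled by the learning rate before being added.

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-split regression stump on one feature (squared-error criterion)."""
    best = None
    for threshold in np.unique(x)[:-1]:  # dropping the largest value keeps the right side non-empty
        left, right = residual[x <= threshold], residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return lambda q: np.where(q <= threshold, left_value, right_value)

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    prediction = np.full_like(y, y.mean(), dtype=float)
    rmse_history = []
    for _ in range(n_trees):
        residual = y - prediction            # negative gradient of 0.5 * squared error
        stump = fit_stump(x, residual)
        prediction = prediction + learning_rate * stump(x)
        rmse_history.append(float(np.sqrt(np.mean((y - prediction) ** 2))))
    return prediction, rmse_history

rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, size=80)                # synthetic feature
y_demo = np.sin(x_demo) + rng.normal(0.0, 0.1, size=80)  # synthetic target
_, rmse_history = boost(x_demo, y_demo)
print(f"training RMSE: first round {rmse_history[0]:.3f}, last round {rmse_history[-1]:.3f}")
```

The training RMSE can only decrease from round to round in this setting, which is also why an unregularized boosted ensemble can fit the training set far more closely than it generalizes.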

XGBoost is an enhanced implementation of the GBDT framework that incorporates parallelized computation and additional regularization strategies. It offers efficient training performance, strong scalability, and consistent predictive accuracy in structured data applications.

RF is a bagging-based ensemble algorithm that constructs multiple randomized Decision Trees and aggregates their predictions to generate stable and accurate outputs. It demonstrates high robustness to noisy input data, competitive performance in high-dimensional feature spaces, and a reduced tendency to overfit, owing to its ensemble structure and randomized feature selection mechanism.

All models were implemented in Python version 3.8.12 using the Scikit-learn library. A grid search combined with 5-fold cross-validation was conducted to optimize the model hyperparameters. The parameter tuning procedures were as follows.

For the MLP model, the key hyperparameters included the learning_rate and the hidden_layer_sizes [29]. The learning_rate, with a search range from 0.01 to 0.10, determined the step size for gradient-based weight updates. Higher values tended to accelerate convergence but increased the risk of overshooting the optimal solution, whereas lower values improved precision at the cost of extended training time. The hidden_layer_sizes, ranging from 30 to 100 neurons, controlled the model’s capacity. Larger architectures enhanced the ability to learn nonlinear relationships but also increased the risk of overfitting—an important consideration given the moderate dataset size of 210 samples.

For both the GBDT and XGBoost models, the learning_rate and the max_depth parameters were optimized [30]. The learning_rate, which ranged from 0.01 to 0.10, determined the contribution of each tree to the ensemble’s prediction, balancing model stability and convergence speed. Identical parameter ranges enabled direct algorithmic comparison under consistent conditions. The max_depth parameter, ranging from 1 to 10, controlled the complexity of individual Decision Trees. Deeper trees improved the modeling of nonlinear feature interactions but also increased computational burden and the risk of overfitting. This depth range is considered standard practice for structured data tasks, allowing control over model complexity.

For the RF model, two key hyperparameters were optimized: the n_estimators and the max_features [31]. The n_estimators, with a search range from 1 to 20, controlled the total number of Decision Trees in the ensemble. Increasing this value enhanced model robustness by reducing variance, although it proportionally increased training time. The max_features, ranging from 1 to 7, determined the subset of features considered during each node split, thereby helping to reduce bias caused by collinear crown features.

A summary of the hyperparameter search ranges used for R_t and D_ti prediction is presented in Table 2.

2.4. De-Handing Point Model Training and Evaluation

In this study, the dataset was randomly split into a training set (85%) and an independent testing set (15%) using simple random sampling. The training set was used to develop the de-handing point models for R_t and D_ti using the MLP, GBDT, XGBoost, and RF algorithms. Hyperparameter optimization was performed via grid search combined with 5-fold cross-validation [32]. The hyperparameter configuration that resulted in the lowest RMSE was selected as optimal for both the R_t and D_ti models. The testing set was held out for evaluating the generalization performance of the models.
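A minimal sketch of this split-and-tune procedure for the RF model is shown below. The data are synthetic placeholders with the same shape as the study's dataset (210 samples, 7 features), the random seeds are arbitrary, and the grid is a coarse subsample of the stated search ranges chosen only to keep the example fast; none of these choices come from the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
# Synthetic placeholder data: 210 samples x 7 features (NOT the study's measurements).
X = rng.uniform(0.0, 1.0, size=(210, 7))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=210)

# 85% training / 15% testing, simple random sampling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# Coarse grid inside the stated ranges (n_estimators 1-20, max_features 1-7).
param_grid = {"n_estimators": [2, 8, 14, 20], "max_features": [1, 3, 5, 7]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",  # lowest RMSE wins
    cv=5,                                   # 5-fold cross-validation on the training set
)
search.fit(X_train, y_train)

# With the default refit=True, the best configuration is retrained on the full training set.
best_model = search.best_estimator_
test_rmse = float(np.sqrt(np.mean((y_test - best_model.predict(X_test)) ** 2)))
print(search.best_params_, f"test RMSE = {test_rmse:.3f}")
```

The testing set is touched only once, after tuning, so the reported test RMSE is not influenced by the hyperparameter search.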

After determining the optimal hyperparameters, the R_t and D_ti models were retrained on the training set and subsequently evaluated for generalization performance on the testing set. Model accuracy was quantified using the RMSE, MAE, and R², which measure the deviation and goodness of fit between actual and predicted values. Model performance was assessed based on these metrics; lower RMSE and MAE values and higher R² values indicated better predictive accuracy and overall model performance. The formulas for RMSE, MAE, and R² are as follows:

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}}    (1)

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_{i} - \hat{y}_{i}|    (2)

R^{2} = 1 - \frac{\sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{N} (y_{i} - \overline{y})^{2}}    (3)

where y_{i} is the actual value of the output target, \hat{y}_{i} is the predicted value of the model, \overline{y} is the mean value of the output target, and N is the number of samples.
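As a worked illustration of Equations (1)–(3), the sketch below evaluates the three metrics on four made-up value pairs (hypothetical numbers, not study data). The function name is an assumption introduced for the example.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) as defined in Equations (1)-(3)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_true - y_pred
    rmse = float(np.sqrt(np.mean(error ** 2)))
    mae = float(np.mean(np.abs(error)))
    r2 = float(1.0 - np.sum(error ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return rmse, mae, r2

# Errors are -1, 1, -2, 0: squared-error sum = 6, total sum of squares = 125.
rmse, mae, r2 = regression_metrics([30.0, 35.0, 40.0, 45.0], [31.0, 34.0, 42.0, 45.0])
print(f"RMSE={rmse:.3f}, MAE={mae:.3f}, R2={r2:.3f}")  # RMSE=1.225, MAE=1.000, R2=0.952
```

RMSE and MAE carry the units of the target, whereas R² is dimensionless, which is why the two error metrics are compared only between models predicting the same target.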

2.5. De-Handing Point Model Interpretation

SHAP, a model-agnostic interpretability method grounded in cooperative game theory, was employed to quantify the contribution of each feature to model predictions [33]. SHAP values, derived from Shapley values in game theory, assign numerical importance scores to individual features by evaluating their marginal impact on predictions across all possible feature combinations. These values distinguish between features that positively or negatively influence outcomes, while also accounting for inter-feature interactions. Unlike post hoc interpretation methods that are limited to specific model classes, SHAP provides globally consistent explanations applicable to a wide range of machine learning models.

In this study, SHAP analysis was applied to the optimal R_t and D_ti models to achieve three objectives: (1) identifying critical morphological determinants through feature importance ranking, (2) quantifying the directional relationships between features and predictions via partial dependence analysis, and (3) visualizing nonlinear interactions among crown morphological parameters.
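The underlying Shapley computation can be made concrete with a brute-force sketch. The code below is not the SHAP library and not the procedure used in the study; it enumerates every feature subset for a hypothetical three-feature toy model and replaces "absent" features with a baseline value, which is one common but assumed convention. It illustrates the two properties described above: each feature receives its average marginal contribution over all feature combinations, and the contributions sum to the difference between the prediction and the baseline prediction.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take the baseline value."""
    n = len(x)

    def value(present):
        mixed = [x[j] if j in present else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                total += weight * (value(present | {i}) - value(present))
        phi.append(total)
    return phi

def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c

x = [3.0, 2.0, 4.0]
baseline = [1.0, 1.0, 1.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [4.0, 2.5, 4.5]
print(sum(phi), toy_model(x) - toy_model(baseline))  # both approximately 11.0
```

Exact enumeration scales exponentially with the number of features, so practical tools rely on model-specific or sampling-based approximations.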

3. Results

3.1. Banana Crown Morphometric Measurement Data

All morphometric procedures were performed in triplicate to ensure measurement reproducibility. The resulting datasets for the inner arc, outer arc, and target arc parameters were subjected to comprehensive statistical analysis (Table S1), with key metrics (maximum, minimum, mean, and standard deviation) summarized in Table 3.

3.2. Hyperparameter Optimization for De-Handing Point Models

3.2.1. MLP-Based Models

As shown in Figure 5a, the MLP-based R_t model exhibited reduced convergence stability when the learning_rate exceeded 0.04, accompanied by a significant increase in RMSE as hidden_layer_sizes increased. In contrast, a learning_rate below 0.04 improved convergence stability and reduced RMSE. To balance convergence efficiency and generalization, the optimal hyperparameters for the R_t model were determined to be a learning_rate of 0.03 and hidden_layer_sizes of 50. Similarly, Figure 5b shows that the D_ti model achieved optimal performance with a learning_rate of 0.04 and hidden_layer_sizes of 40.

3.2.2. GBDT-Based and XGBoost-Based Models

For the GBDT-based R_t model (Figure 6a), a learning_rate below 0.03 resulted in higher RMSE due to insufficient gradient updates, while values above 0.03, particularly when combined with a max_depth greater than 5, exacerbated overfitting. The optimal configuration was identified as a learning_rate of 0.04 and a max_depth of 2. For the D_ti model (Figure 6b), the results indicated heightened sensitivity to max_depth, with RMSE increasing sharply at depths greater than 1. Consequently, a learning_rate of 0.04 and a max_depth of 1 were selected as the optimal hyperparameters.

The XGBoost-based R_t and D_ti models (Figure 7a,b) demonstrated stable performance across different max_depth values when the learning_rate exceeded 0.03. To prioritize model generalization, the optimal hyperparameters for the R_t model were set to a learning_rate of 0.06 and a max_depth of 1, while the D_ti model utilized a learning_rate of 0.04 and a max_depth of 1.

3.2.3. RF-Based Models

The RF-based R_t model (Figure 8a) exhibited reduced performance when n_estimators was below 14. Additionally, large max_features values, particularly those greater than 3 in conjunction with higher n_estimators, further diminished model performance. Similar trends were observed for the D_ti model (Figure 8b). The final configurations selected were n_estimators of 14 for the R_t model, n_estimators of 8 for the D_ti model, and max_features of 3 for both models.

3.3. Performance Evaluation of De-Handing Point Models

3.3.1. R_t Models

Figure 9 presents a comparison of the training and testing performance of the MLP, GBDT, XGBoost, and RF models in predicting the transverse location (R_t) of the de-handing point. The MLP-based model exhibited the weakest performance in predicting R_t (R² = 0.86, MAE = 2.54, and RMSE = 2.92), with substantial deviations from the y = x reference line (Figure 9a). This limitation primarily resulted from the MLP-based model’s high sensitivity to data noise when trained on limited sample sizes [34], a challenge frequently encountered in agricultural applications, such as crop yield prediction, where MLP-based models often require extensive hyperparameter tuning to achieve stable generalization [35].

Conversely, the GBDT-based model (Figure 9b) achieved the highest fitting accuracy on the training set, with R² = 0.99, MAE = 0.37, and RMSE = 0.48. The predicted values were closely aligned with the y = x reference line, indicating strong performance in fitting the training data. Nevertheless, the GBDT-based model exhibited overfitting when applied to the testing set, as evidenced by increases in MAE and RMSE from the training to the testing set (MAE increased by 1.08 and RMSE increased by 1.65), the highest increase observed among all models. This result suggests that the GBDT-based model over-adapted to the training data—a common issue in Gradient Boosting models when regularization is insufficient [36,37].

The XGBoost-based and RF-based models showed strong intermediate performance on the training set, with R² = 0.98 for both models, MAE values of 0.74 and 0.73, and RMSE values of 0.96 and 0.99, respectively (Figure 9c,d). When evaluated on the testing set, both models maintained robust generalization, with predictions closely clustered along the y = x reference line. Notably, the RF-based model demonstrated more stable generalization, as indicated by smaller absolute increases in error metrics from the training to testing phases. Specifically, the RF-based model exhibited an MAE increase of 0.77 and an RMSE increase of 0.95, while the XGBoost-based model showed larger increments of 0.82 for MAE and 1.07 for RMSE. These smaller increases suggest that the RF-based model was more robust against overfitting. This stability can be attributed to the ensemble structure of the RF algorithm, which builds diverse Decision Trees through randomized feature and data sampling—an inherent design that mitigates overfitting [38]. Given its demonstrated efficacy in handling agricultural data noise [39,40], the RF-based model is identified as the most suitable choice for predicting R_t in this study.
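The train-to-test comparison for R_t can be checked with simple arithmetic. The sketch below uses only the training errors and the reported increases quoted in this section; the testing-set values it prints are derived by addition rather than quoted, and the dictionary layout is an illustrative choice. The derived RF values agree with the testing-set results stated in the Abstract (MAE = 1.50 and RMSE = 1.94).

```python
# (MAE, RMSE) on the training set for the R_t models, as reported in the text.
train_error = {"GBDT": (0.37, 0.48), "XGBoost": (0.74, 0.96), "RF": (0.73, 0.99)}
# Reported absolute increases (MAE, RMSE) from the training set to the testing set.
increase = {"GBDT": (1.08, 1.65), "XGBoost": (0.82, 1.07), "RF": (0.77, 0.95)}

# Derived testing-set errors: training error plus reported increase.
test_error = {
    model: (
        round(train_error[model][0] + increase[model][0], 2),
        round(train_error[model][1] + increase[model][1], 2),
    )
    for model in train_error
}

# Model with the smallest RMSE increase, i.e. the smallest generalization gap.
most_stable = min(increase, key=lambda model: increase[model][1])

for model, (mae, rmse) in test_error.items():
    print(f"{model}: test MAE = {mae:.2f}, test RMSE = {rmse:.2f}")
print("smallest RMSE gap:", most_stable)
```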

3.3.2. D_ti Models

For the prediction of the longitudinal location (D_ti) of the de-handing point, similar trends were observed across all four models (Figure 10). The MLP-based model again exhibited the poorest predictive performance among all models (R² = 0.71, MAE = 3.96, and RMSE = 4.20), with substantial deviations from the actual values (Figure 10a). This consistently poor performance highlights the fundamental limitations of MLP-based models in agricultural contexts, particularly when dealing with sparse datasets characterized by high measurement noise, such as those related to banana crown morphology. The model's high degree of parameterization and low tolerance for noise resulted in overfitting. These findings align with previously reported limitations of MLP-based models in predicting complex traits in spring wheat [41], confirming that MLP architectures require large and diverse training datasets to achieve robust generalization [42].

The GBDT-based model demonstrated strong performance on the training set (R² = 0.99, MAE = 0.34, and RMSE = 0.44); however, its performance declined substantially on the testing set, with R² decreasing to 0.88, MAE increasing to 1.36, and RMSE rising to 1.95 (Figure 10b), suggesting susceptibility to overfitting. As observed in the R_t prediction, the GBDT-based model failed to generalize well to unseen data, underscoring the need for incorporating regularization techniques or early stopping strategies to mitigate overfitting in Gradient Boosting frameworks [43,44].

Similarly, the XGBoost-based model also exhibited signs of overfitting in D_ti prediction, as evidenced by declines in all performance metrics from the training set to the testing set: R² decreased from 0.97 to 0.87, MAE increased from 0.73 to 1.55, and RMSE rose from 0.94 to 2.01 (Figure 10c).

In contrast, the RF-based model demonstrated stable and consistent performance across both training and testing sets. For the training set, R² = 0.97, MAE = 0.70, and RMSE = 0.92, while for the testing set, R² = 0.91, MAE = 1.33, and RMSE = 1.66 (Figure 10d). This stability, coupled with strong generalization capability, identifies the RF-based model as the most reliable choice for D_ti prediction, further supporting its selection as the optimal model for both R_t and D_ti predictions in this study.

Overall, these results demonstrate that RF-based models offer the most robust and reliable performance for de-handing point prediction, effectively capturing the transverse (R_t) and longitudinal (D_ti) locations while maintaining a superior balance between predictive accuracy, generalization ability, and computational efficiency.

3.4. Interpretation of De-Handing Point Models

3.4.1. Feature Importance Analysis of RF-Based R_t Model

Figure 11a illustrates the global feature importance rankings for predicting the transverse location (R_t), as determined by the mean absolute SHAP values. Among all input features, L_i and R_i were identified as the most influential contributors, with respective contributions of 43.63% and 26.01%. R_o and D_oi provided secondary influence, contributing 14.07% and 9.59%, respectively. In contrast, L_o, α_o, and α_i collectively accounted for only 6.70%, indicating a negligible role in predicting R_t. These results demonstrate that the transverse morphological features (L_i, R_i, and R_o) primarily govern the prediction of the transverse location of the de-handing point.
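The percentage contributions can be read as each feature's mean absolute SHAP value divided by the sum over all features. The article does not spell out this normalization, so the sketch below is an assumed reconstruction; the small SHAP matrix is hypothetical, and only the final list of shares is taken from the text, where it serves as a check that the reported values sum to 100%.

```python
import numpy as np

def importance_shares(shap_matrix):
    """Percentage share of each feature's mean |SHAP| value (rows: samples, columns: features)."""
    mean_abs = np.abs(shap_matrix).mean(axis=0)
    return 100.0 * mean_abs / mean_abs.sum()

# Hypothetical SHAP values for 3 samples and 3 features (illustration only).
demo_shap = np.array([
    [2.0, -1.0, 0.5],
    [-4.0, 1.0, -0.5],
    [3.0, -1.0, 0.5],
])
shares = importance_shares(demo_shap)
print(np.round(shares, 2))  # mean |SHAP| = [3.0, 1.0, 0.5]

# Shares reported in the text: L_i, R_i, R_o, D_oi, and (L_o + alpha_o + alpha_i) combined.
reported = [43.63, 26.01, 14.07, 9.59, 6.70]
reported_total = round(sum(reported), 2)
print(reported_total)
```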

Figure 11b visualizes each feature’s contribution to the RF-based R_t prediction based on SHAP value distributions. Features are sorted in descending order along the vertical axis, with corresponding SHAP values plotted on the horizontal axis. Red dots represent higher feature values, while blue dots represent lower ones. A substantial contribution is indicated when a feature exhibits a wide SHAP value range, minimal overlap between red and blue dots, and a distinct gradient—where lower feature values (blue) are associated with negative SHAP values and higher values (red) align with positive SHAP values [45]. Conversely, features with narrow SHAP value ranges and significant overlap between high and low feature values suggest limited predictive influence.
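The visual criteria described above (wide SHAP range, clear low-to-high gradient) can be approximated numerically. The sketch below is one possible proxy, not the procedure used in the study: it measures the SHAP range width and uses the Pearson correlation between feature value and SHAP value as a rough indicator of directional gradient. A linear correlation would not capture threshold or plateau effects, and all data shown are hypothetical.

```python
def shap_spread(shap_values):
    """Width of the SHAP value range for one feature."""
    return max(shap_values) - min(shap_values)


def pearson(xs, ys):
    """Pearson correlation; near +1 means high feature values push predictions up."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical feature values and SHAP values, for illustration only.
strong_feature = [70, 85, 100, 115, 130]
strong_shap = [-6.0, -3.0, 0.5, 3.5, 6.0]
weak_feature = [150, 165, 180, 195, 210]
weak_shap = [0.1, -0.1, 0.0, 0.1, -0.1]

# Each tuple holds (range width, directional correlation).
strong = (shap_spread(strong_shap), pearson(strong_feature, strong_shap))
weak = (shap_spread(weak_shap), pearson(weak_feature, weak_shap))
```

In this toy case the first feature combines a wide range with a correlation close to +1, while the second has a narrow range and no clear direction, mirroring the two patterns distinguished in the summary plot.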

As shown in Figure 11b, L_i demonstrated the widest range of SHAP values, indicating the strongest contribution to R_t prediction. Further analysis revealed that low L_i values (blue) corresponded to negative SHAP values (leftward shift), while high L_i values (red) were associated with positive SHAP values (rightward shift). This pattern suggests that L_i has a strong directional influence on R_t. R_i exhibited a similar trend but with greater red–blue overlap, indicating a relatively reduced contribution compared to L_i. R_o and D_oi displayed moderate SHAP value ranges, but substantial overlap between red and blue dots diminished their predictive power. By contrast, the minimal contributions of L_o, α_o, and α_i were reflected by their near-zero SHAP value ranges and almost complete red–blue overlap, suggesting negligible relevance to R_t prediction.

Figure 11c highlights potential morphological thresholds influencing the R_t of the de-handing point. L_i values less than 76 mm were associated with R_t locations in physically inaccessible crown regions, as reflected by peak negative SHAP values, potentially indicating limited de-handing tool access due to underdeveloped crown structures. As L_i increased from 76 mm to 136 mm, SHAP values progressively increased, implying enhanced accessibility. Beyond 136 mm, the SHAP values plateaued or declined, suggesting reduced marginal gain in tool access. R_i values below 26 mm were linked to constrained R_t locations, as indicated by peak negative SHAP values—possibly caused by excessive curvature hindering tool movement. An increase in R_i from 26 mm to 45 mm was associated with improved R_t reliability, evidenced by rising SHAP values that subsequently stabilized beyond 45 mm.

R_o values below 38 mm corresponded to more limited R_t ranges, with negative SHAP values possibly reflecting compression effects from the outer crown structure. Between 38 mm and 72 mm, the SHAP values were near zero, suggesting negligible influence, while limited data beyond 72 mm prevented definitive conclusions. For D_oi, values less than 39 mm yielded SHAP values clustered around zero, indicating limited influence. However, some instances beyond 39 mm exhibited rising SHAP values, implying improved access only under certain morphological conditions.
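The breakpoints described for the R_t model can be collected into a small lookup, sketched below. The breakpoints are taken from the text; the regime labels are shorthand introduced here, and assigning a value that falls exactly on a breakpoint to the upper regime is an assumption, since the text does not specify boundary handling.

```python
from bisect import bisect_right

# Breakpoints (mm) read from the text for the RF-based R_t model (Figure 11c).
# The regime labels are shorthand introduced here, not the authors' terminology.
R_T_REGIMES = {
    "L_i": ([76, 136], ["negative SHAP", "rising SHAP", "plateau or decline"]),
    "R_i": ([26, 45], ["negative SHAP", "rising SHAP", "stabilized"]),
    "R_o": ([38, 72], ["negative SHAP", "near zero", "insufficient data"]),
    "D_oi": ([39], ["near zero", "rising in some instances"]),
}


def regime(feature, value_mm):
    """Return the SHAP regime label for a feature value in millimetres."""
    breakpoints, labels = R_T_REGIMES[feature]
    # bisect_right counts how many breakpoints are <= value_mm,
    # which is the index of the regime the value falls into.
    return labels[bisect_right(breakpoints, value_mm)]


# Example: the mean feature values listed in Table 3.
mean_crown = {"L_i": 98.94, "R_i": 31.69, "R_o": 47.93, "D_oi": 32.96}
mean_regimes = {name: regime(name, value) for name, value in mean_crown.items()}
```

For a crown with the mean dimensions in Table 3, L_i and R_i fall in their rising ranges, whereas R_o and D_oi fall in ranges where the SHAP values are near zero.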

Finally, the near-zero clustering of SHAP values in the summary distributions (Figure 11b) and the flat response patterns in the partial dependence plots (Figure 11c) confirm that α_i, α_o, and L_o contribute negligibly to R_t prediction. The two angular parameters and the outer arc length may therefore be excluded from future morphological models aimed at predicting the transverse de-handing location.

3.4.2. Feature Importance Analysis of D_ti Model

Figure 12a presents the global feature importance ranking for predicting the longitudinal location (D_ti) based on mean absolute SHAP values. D_oi was identified as the most influential predictor, contributing 31.16% and primarily determining the achievable longitudinal displacement range. L_i (25.96%), R_i (16.95%), and R_o (14.22%) also affected longitudinal location through transverse structural constraints, whereas α_o, α_i, and L_o contributed a combined 11.71%, suggesting limited relevance.

Figure 12b illustrates each feature’s contribution to the prediction of the D_ti model based on SHAP value distributions. D_oi stood out with a broad SHAP spread and clear directional pattern: smaller values were linked to lower D_ti predictions, while larger values corresponded to higher predicted locations. L_i showed moderate SHAP variation but with less distinct separation, indicating a conditional contribution. R_i and R_o had limited influence, as their SHAP values were concentrated within narrow ranges and exhibited substantial value overlap. L_o, α_o, and α_i showed negligible importance, with SHAP values clustering near zero throughout, suggesting minimal relevance to D_ti prediction.

Figure 12c delineates the morphological thresholds governing the longitudinal location (D_ti) of the de-handing point. For D_oi, values less than 30 mm corresponded to constrained D_ti locations, with tightly clustered negative SHAP values suggesting spatial limitation due to reduced arc separation. Between 30 mm and 50 mm, a progressive increase in SHAP values was observed, indicating improved accessibility. Beyond 50 mm, the SHAP values plateaued, suggesting that further spatial separation did not enhance the prediction and implying a threshold of effective displacement. In the case of L_i, values less than 83 mm were associated with negative SHAP values and correspondingly lower predicted D_ti locations, indicating restricted longitudinal localization. As L_i increased from 83 mm to 138 mm, the SHAP values increased steadily, supporting enhanced positional potential. Above 140 mm, SHAP contributions diminished, suggesting the presence of an optimization ceiling for inner arc elongation.

Sub-threshold values of R_i (less than 28 mm) and R_o (less than 50 mm) yielded predominantly small negative SHAP values, indicating a mild but consistent limiting effect on the prediction of D_ti. In intermediate ranges (R_i: 28–34 mm; R_o: 50–60 mm), the SHAP values remained near zero, indicating minimal effect. For supra-threshold values (R_i greater than 34 mm; R_o greater than 60 mm), SHAP distributions again clustered near zero, with only isolated positive deviations. This pattern suggests that increased curvature radii do not consistently enhance D_ti predictability.

Angular parameters (α_i and α_o) and L_o demonstrated negligible contributions across all observed ranges, as evidenced by SHAP distributions clustered near zero. Consequently, these features can be safely excluded from future crown morphological protocols aimed at predicting the longitudinal de-handing location.

4. Discussion

4.1. Morphological Drivers Governing De-Handing Location

The SHAP analysis reveals that spatial dimensional constraints (L_i, R_i, and D_oi) are the primary determinants of transverse (R_t) and longitudinal (D_ti) de-handing locations, with angular parameters (α_i and α_o) and L_o contributing minimally. For R_t, L_i and R_i collectively account for 69.64% of the SHAP-based feature importance (43.63% and 26.01%, respectively), governing access space and inner arc curvature; for D_ti, D_oi (31.16%) and L_i (25.96%) dominate by defining the inter-arc working distance. Critical thresholds (L_i < 76 mm, R_i < 26 mm, and D_oi < 30 mm) emerge as morphological bottlenecks restricting feasible operations.

This emphasis on three-dimensional spatial features aligns with broader insights into fruit-picking robots. Consistent with findings on three-dimensional geometric features in juicy peach segmentation [46], which show that such attributes (e.g., surface normals and curvature) outperform two-dimensional or color features, our results reinforce that spatial dimensions, not angular metrics, guide robotic positioning. Similarly, work on guava harvesting highlights that three-dimensional spatial relationships—such as fruit–stem distances and curvature thresholds—directly dictate collision-free picking positions, mirroring our observation that morphological bottlenecks (e.g., L_i < 76 mm) constrain operational feasibility [47]. These consistencies underscore a unifying principle: in unstructured agricultural environments, three-dimensional morphological constraints are the cornerstone of robotic operation positioning. However, the three-dimensional morphological data underlying our analysis were acquired via time-consuming scanning techniques, which contrasts with studies utilizing portable RGB-D imaging for real-time morphology quantification, potentially limiting the real-time applicability of our findings in field settings where rapid data capture is essential.

4.2. Engineering Specifications for Automated De-Handing Systems

Morphological thresholds derived from SHAP analysis establish core design constraints for automated de-handing systems: end effectors must have a width < 76 mm (to clear L_i < 76 mm inner arcs), a curvature radius < 26 mm (to fit R_i < 26 mm high-curvature crown inner regions), and a linear actuator stroke > 30 mm (to navigate D_oi < 30 mm inter-arc spaces). Saturation effects, such as negligible gains for R_i above 34 mm, further indicate where additional design range would amount to over-engineering.
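These constraints lend themselves to a simple design check. The sketch below assumes a hypothetical `EndEffector` description with three fields; the numeric limits are those quoted above, while the class, field names, and example tools are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class EndEffector:
    """Hypothetical end-effector description; field names are illustrative."""
    width_mm: float
    curvature_radius_mm: float
    stroke_mm: float


# Limits quoted in the text (derived from the SHAP thresholds).
MAX_WIDTH_MM = 76.0
MAX_CURVATURE_RADIUS_MM = 26.0
MIN_STROKE_MM = 30.0


def violations(tool):
    """List the design constraints that a candidate end effector fails."""
    failed = []
    if not tool.width_mm < MAX_WIDTH_MM:
        failed.append("width must be < 76 mm")
    if not tool.curvature_radius_mm < MAX_CURVATURE_RADIUS_MM:
        failed.append("curvature radius must be < 26 mm")
    if not tool.stroke_mm > MIN_STROKE_MM:
        failed.append("stroke must be > 30 mm")
    return failed


# Two invented candidates: one within all limits, one too wide with a short stroke.
compact = EndEffector(width_mm=60.0, curvature_radius_mm=22.0, stroke_mm=45.0)
bulky = EndEffector(width_mm=90.0, curvature_radius_mm=22.0, stroke_mm=25.0)

print(violations(compact))
print(violations(bulky))
```

Returning the list of failed constraints, rather than a single pass/fail flag, shows which dimension of a candidate design would need to change.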

These specifications build on task-specific design principles in the literature. Optimizing end-effector parameters to match crop morphology, as demonstrated in cherry tomato harvesting [48] where finger dimensions were tailored to fruit diameter and spacing, validates our approach of deriving constraints directly from banana crown features. Supporting this, a review of harvesting robots [49] highlights that end-effector adaptability to crop morphology—via size and curvature tuning—directly drives operational success, echoing our focus on miniaturization and kinematic precision.

Notably, the need to balance morphological adaptability with efficiency [50,51] resonates with our saturation thresholds (e.g., no gains beyond 136 mm width), which avoid unnecessary complexity. Collectively, these connections position our specifications within a robust framework of morphology-driven robotic design, bridging specific crop needs with generalizable engineering principles. However, the current specifications are derived exclusively from Cavendish bananas, and their compatibility with other banana groups featuring distinct crown architectures remains untested—this restricts their applicability in diverse agricultural contexts and generalizability across banana varieties.

4.3. Limitations and Future Works

This study has several limitations that should be addressed in future research:

1. The thresholds established in this study were derived solely from Cavendish bananas (Musa acuminata, AAA group), and their applicability to plantains (AAB group) or cooking bananas (ABB group), which have different crown architectures, remains unverified [52]. Future work should include a broader sample of Cavendish bananas and incorporate other commercially important banana varieties to enhance the model’s robustness and transferability.

2. The three-dimensional scanning technique used for feature extraction has inherent limitations in field applications, including time-consuming data acquisition, high equipment costs, and limited operational deployability [53]. Future research should focus on developing faster, cost-effective alternatives, such as portable RGB-D imaging integrated with advanced deep learning models, enabling automated morphology quantification without the need for manual feature extraction.

3. While SHAP thresholds suggest associations between restricted morphology (L_i < 76 mm) and inaccessible de-handing locations, further validation of feature importance is needed. Future work will explore multi-method comparisons, including contrasting SHAP-based feature importance with traditional metrics such as Gini impurity and entropy in tree-based models, as well as incorporating sensitivity analysis to assess how perturbations in key features (e.g., L_i, R_i, D_oi) affect prediction outcomes [54]. These complementary approaches will help cross-verify the consistency and robustness of the identified important features, strengthening the reliability of our morphological driver conclusions.
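The sensitivity analysis proposed in the third item could take the form of a one-at-a-time perturbation, sketched below under stated assumptions. The `toy_model` function is a hypothetical stand-in for a trained regressor with invented coefficients; only the baseline values are taken from the means in Table 3.

```python
def one_at_a_time_sensitivity(predict, baseline, relative_step=0.05):
    """Change in prediction when each feature is raised by relative_step."""
    reference = predict(baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)  # copy so that only one feature changes
        perturbed[name] = value * (1.0 + relative_step)
        effects[name] = predict(perturbed) - reference
    return effects


def toy_model(x):
    """Hypothetical stand-in for a trained regressor; coefficients are invented."""
    return 0.20 * x["L_i"] + 0.50 * x["R_i"] + 0.05 * x["D_oi"]


# Baseline set to the mean values in Table 3 (mm).
baseline = {"L_i": 98.94, "R_i": 31.69, "D_oi": 32.96}
effects = one_at_a_time_sensitivity(toy_model, baseline)

# Rank features by the magnitude of the induced change in the prediction.
ranking = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
```

With a fitted model in place of `toy_model`, the resulting ranking could be compared against the SHAP-based ranking as one of the cross-checks described above.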

5. Conclusions

This study presents a quantitative framework for predicting banana de-handing points by integrating crown morphological features with machine learning. By employing high-resolution three-dimensional scanning and introducing novel descriptors—R_t for transverse location and D_ti for longitudinal location—we achieved precise morphological characterization. Among the evaluated algorithms, the RF model delivered the best predictive performance (R_t: R² = 0.95, MAE = 1.50, RMSE = 1.94; D_ti: R² = 0.91, MAE = 1.33, RMSE = 1.66). The SHAP analysis further highlighted L_i, R_i, and R_o as dominant features for R_t prediction, while D_oi governed D_ti, with critical thresholds (L_i > 76 mm, R_i > 26 mm, and D_oi > 30 mm) enhancing interpretability.

This work makes three key contributions to automated de-handing systems: (1) it provides quantifiable morphological thresholds (L_i > 76 mm, R_i > 26 mm, and D_oi > 30 mm), addressing the lack of data-driven spatial accessibility standards for robotic end effectors with mechanisms; (2) it validates the RF model’s competitive advantage in modeling noisy, irregular plant morphology with limited samples (n = 210); (3) it develops an interpretable SHAP-based mechanism confirming that spatial dimensions (arc lengths/radii)—rather than angular parameters—are primary predictors, simplifying future sensor deployment requirements.

In future work, we will focus on validating the proposed methodology across diverse banana varieties and integrating deep learning models with RGB-D imaging to streamline feature extraction, aiming to enhance the practical applicability of the automated de-handing system in field conditions.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/agronomy15081880/s1, Figure S1: Illustration of recondition process of crown model; Table S1: Measurements of morphological parameters of crown at different positions on banana bunch.

Author Contributions

L.Z.: conceptualization, methodology, writing—original draft preparation, writing—review and editing. Z.Y.: methodology, resources, supervision, funding acquisition. C.W.: validation, formal analysis, writing—review and editing, visualization. M.J.: software, validation, investigation. J.D.: conceptualization, data curation, writing—review and editing, project administration, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 32271996), the China Agriculture Research System of MOF and MARA (Grant No. CARS-31-11), the open competition program of the top ten critical priorities of Agricultural Science and Technology Innovation for the 14th Five-Year Plan of Guangdong Province (Grant No. 2023SDZG03), and the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2025A1515011219).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the need to protect the commercial information of the wholesale markets where the banana samples were purchased, including specific supplier details and transaction-related parameters that are confidential to the market operators.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Tripathi, J.N.; Ntui, V.O.; Tripathi, L. Precision Genetics Tools for Genetic Improvement of Banana. Plant Genome 2024, 17, e20416.
2. Kamel, M.A.M.; Cortesi, P.; Saracchi, M. Etiological Agents of Crown Rot of Organic Bananas in Dominican Republic. Postharvest Biol. Technol. 2016, 120, 112–120.
3. Merino, G.; Da Silva, L.; Mattos, D.; Guimarães, B.; Merino, E. Ergonomic Evaluation of the Musculoskeletal Risks in a Banana Harvesting Activity through Qualitative and Quantitative Measures, with Emphasis on Motion Capture (Xsens) and EMG. Int. J. Ind. Ergon. 2019, 69, 80–89.
4. Guo, J.; Karkee, M.; Yang, Z.; Fu, H.; Li, J.; Jiang, Y.; Jiang, T.; Liu, E.; Duan, J. Research of Simulation Analysis and Experimental Optimization of Banana De-Handing Device with Self-Adaptive Profiling Function. Comput. Electron. Agric. 2021, 185, 106148.
5. Xu, Z.Y.; Yang, Z.; Duan, J.L.; Jin, M.H.; Mo, J.S.; Zhao, L.; Guo, J.; Yao, H.L. Design and Experiment of Symmetrical Shape Deployable Arc Profiling Mechanism Based on Composite Multi-Cam Structure. Symmetry 2019, 11, 958.
6. Fuentes, N.N.M.; Valle, J.A.B.; Gavilanes, D.A.P.; Loor, M.A.E. Riesgos Ergonómicos En El Trabajo En La Industria Bananera de La Costa Ecuatoriana. Religación 2024, 9, e2401232.
7. Robinson, J.C.; Saúco, V.G. Bananas and Plantains; CABI: Oxfordshire, UK, 2010; pp. 248–253.
8. Zhao, L.; Huang, C.; Yang, Z.; Jin, M.; Duan, J. Characterization of Banana Crowns: Microscopic Observations and Macroscopic Cutting Experiments. Agriculture 2024, 14, 1714.
9. Niedbała, G.; Nowakowski, K.; Rudowicz-Nawrocka, J.; Piekutowska, M.; Weres, J.; Tomczak, R.J.; Tyksiński, T.; Álvarez Pinto, A. Multicriteria Prediction and Simulation of Winter Wheat Yield Using Extended Qualitative and Quantitative Data Based on Artificial Neural Networks. Appl. Sci. 2019, 9, 2773.
10. Folberth, C.; Baklanov, A.; Balkovič, J.; Skalský, R.; Khabarov, N.; Obersteiner, M. Spatio-Temporal Downscaling of Gridded Crop Model Yield Estimates Based on Machine Learning. Agric. For. Meteorol. 2019, 264, 1–15.
11. Matsumura, K.; Gaitan, C.F.; Sugimoto, K.; Cannon, A.J.; Hsieh, W.W. Maize Yield Forecasting by Linear Regression and Artificial Neural Networks in Jilin, China. J. Agric. Sci. 2015, 153, 399–410.
12. Gopal, P.M.; Bhargavi, R. A Novel Approach for Efficient Crop Yield Prediction. Comput. Electron. Agric. 2019, 165, 104968.
13. Zhang, Y.; Li, N.; Zhang, L.; Lin, J.; Gao, X.; Chen, G. A Review on the Recent Developments in Vision-Based Apple-Harvesting Robots for Recognizing Fruit and Picking Pose. Comput. Electron. Agric. 2025, 231, 109968.
14. Zhang, G.; Li, L.; Zhang, Y.; Liang, J.; Chun, C. Citrus Pose Estimation under Complex Orchard Environment for Robotic Harvesting. Eur. J. Agron. 2025, 162, 127418.
15. Bai, Y.; Mao, S.; Zhou, J.; Zhang, B. Clustered Tomato Detection and Picking Point Location Using Machine Learning-Aided Image Analysis for Automatic Robotic Harvesting. Precis. Agric. 2023, 24, 727–743.
16. Yu, Y.; Zhang, K.; Liu, H.; Yang, L.; Zhang, D. Real-Time Visual Localization of the Picking Points for a Ridge-Planting Strawberry Harvesting Robot. IEEE Access 2020, 8, 116556–116568.
17. Chen, X.; Dong, G.; Fan, X.; Xu, Y.; Liu, T.; Zhou, J.; Jiang, H. Fruit Stalk Recognition and Picking Point Localization of New Plums Based on Improved DeepLabv3+. Agriculture 2024, 14, 2120.
18. Li, L.; Li, K.; He, Z.; Li, H.; Cui, Y. Kiwifruit Segmentation and Identification of Picking Point on Its Stem in Orchards. Comput. Electron. Agric. 2025, 229, 109748.
19. Zarbakhsh, S.; Shahsavar, A.R. Artificial Neural Network-Based Model to Predict the Effect of γ-Aminobutyric Acid on Salinity and Drought Responsive Morphological Traits in Pomegranate. Sci. Rep. 2022, 12, 16662.
20. Zhan, D.; Mu, Y.; Duan, W.; Ye, M.; Song, Y.; Song, Z.; Yao, K.; Sun, D.; Ding, Z. Spatial Prediction and Mapping of Soil Water Content by TPE-GBDT Model in Chinese Coastal Delta Farmland with Sentinel-2 Remote Sensing Data. Agriculture 2023, 13, 1088.
21. Abdikan, S.; Sekertekin, A.; Narin, O.G.; Delen, A.; Balik Sanli, F. A Comparative Analysis of SLR, MLR, ANN, XGBoost and CNN for Crop Height Estimation of Sunflower Using Sentinel-1 and Sentinel-2. Adv. Space Res. 2023, 71, 3045–3059.
22. Yamparla, R.; Shaik, H.S.; Guntaka, N.S.P.; Marri, P.; Nallamothu, S. Crop Yield Prediction Using Random Forest Algorithm. In Proceedings of the 2022 7th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 22–24 June 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1538–1543.
23. Rustler, L.; Volprecht, V.; Hoffmann, M. Empirical Comparison of Four Stereoscopic Depth Sensing Cameras for Robotics Applications. IEEE Access 2025, 13, 67564–67577.
24. Huang, C.; Qin, Z.; Hua, X.; Zhang, Z.; Xiao, W.; Liang, X.; Song, P.; Yang, W. An Intelligent Analysis Method for 3D Wheat Grain and Ventral Sulcus Traits Based on Structured Light Imaging. Front. Plant Sci. 2022, 13, 840908.
25. Givi, M.; Cournoyer, L.; Reain, G.; Eves, B.J. Performance Evaluation of a Portable 3D Imaging System. Precis. Eng. 2019, 59, 156–165.
26. Litavec, H. A Novel Method for Sorting and Reassociating Commingled Human Remains Using Deviation Analysis. J. Forensic Sci. 2023, 68, 1780–1791.
27. Beleidy, M.; Ziada, A. 3D Surface Deviation Wear Analysis of Veneered PEEK Crowns and Its Correlation with Optical Digital Profilometry. J. Prosthodont. 2023, 32, 32–39.
28. Panigrahi, B.; Kathala, K.C.R.; Sujatha, M. A Machine Learning-Based Comparative Approach to Predict the Crop Yield Using Supervised Learning With Regression Models. Procedia Comput. Sci. 2023, 218, 2684–2693.
29. Prajapati, D.K.; Katiyar, J.K.; Prakash, C. Machine Learning Approach for the Prediction of Mixed Lubrication Parameters for Different Surface Topographies of Non-Conformal Rough Contacts. Ind. Lubr. Tribol. 2023, 75, 1022–1030.
30. Omotehinwa, T.O.; Oyewola, D.O. Hyperparameter Optimization of Ensemble Models for Spam Email Detection. Appl. Sci. 2023, 13, 1971.
31. Leng, L.; Zhang, W.; Chen, Q.; Zhou, J.; Peng, H.; Zhan, H.; Li, H. Machine Learning Prediction of Nitrogen Heterocycles in Bio-Oil Produced from Hydrothermal Liquefaction of Biomass. Bioresour. Technol. 2022, 362, 127791.
32. Majnarić, D.; Šegota, S.B.; Lorencin, I.; Car, Z. Prediction of Main Particulars of Container Ships Using Artificial Intelligence Algorithms. Ocean Eng. 2022, 265, 112571.
33. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4768–4777.
34. Hao, W.; Sun, J.; Zhang, Z.; Zhang, K.; Qiu, F.; Xu, J. Novel Hybrid Model to Estimate Leaf Carotenoids Using Multilayer Perceptron and PROSPECT Simulations. Remote Sens. 2023, 15, 4997.
35. Sbai, Z. Deep Learning Models and Their Ensembles for Robust Agricultural Yield Prediction in Saudi Arabia. Sustainability 2025, 17, 5807.
36. Gaffoor, Z.; Pietersen, K.; Jovanovic, N.; Bagula, A.; Kanyerere, T.; Ajayi, O.; Wanangwa, G. A Comparison of Ensemble and Deep Learning Algorithms to Model Groundwater Levels in a Data-Scarce Aquifer of Southern Africa. Hydrology 2022, 9, 125.
37. Rizkallah, L.W. Enhancing the Performance of Gradient Boosting Trees on Regression Problems. J. Big Data 2025, 12, 35.
38. Vafaeinejad, A.; Sharifi, A.; Khan, S.N. Robust County-Level Corn Yield Estimation Using Ensemble Machine Learning and Multi-Source Remote Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 16942–16953.
39. Elbasi, E.; Mostafa, N.; Zaki, C.; AlArnaout, Z.; Topcu, A.E.; Saker, L. Optimizing Agricultural Data Analysis Techniques through AI-Powered Decision-Making Processes. Appl. Sci. 2024, 14, 8018.
40. Jeong, J.H.; Resop, J.P.; Mueller, N.D.; Fleisher, D.H.; Yun, K.; Butler, E.E.; Timlin, D.J.; Shim, K.-M.; Gerber, J.S.; Reddy, V.R.; et al. Random Forests for Global and Regional Crop Yield Predictions. PLoS ONE 2016, 11, e0156571.
41. Sandhu, K.S.; Lozada, D.N.; Zhang, Z.; Pumphrey, M.O.; Carter, A.H. Deep Learning for Predicting Complex Traits in Spring Wheat Breeding Program. Front. Plant Sci. 2021, 11, 613325.
42. Wang, J.; Li, Z.; Gao, G.; Wang, Y.; Zhao, C.; Bai, H.; Lv, Y.; Zhang, X.; Li, Q. BerryNet-Lite: A Lightweight Convolutional Neural Network for Strawberry Disease Identification. Agriculture 2024, 14, 665.
43. Ren, J.; Zhou, H.; Tao, Z.; Ge, L.; Song, K.; Xu, S.; Li, Y.; Zhang, L.; Zhang, X.; Li, S. Long-Term Monitoring Chlorophyll-a Concentration Using HJ-1 A/B Imagery and Machine Learning Algorithms in Typical Lakes, a Cold Semi-Arid Region. Opt. Express 2024, 32, 16371.
44. Hosseini, F.S.; Jafari, A.; Zandi, I.; Alesheikh, A.A.; Rezaie, F. Groundwater Potential Mapping Using Optimized Decision Tree-Based Ensemble Learning Model with Local and Global Explainability. Water 2025, 17, 1520.
45. Çifci, A.; Kırbaş, İ. Fusion of Machine Learning and Explainable AI for Enhanced Rice Classification: A Case Study on Cammeo and Osmancik Species. Eur. Food Res. Technol. 2025, 251, 69–86.
46. Wu, G.; Li, B.; Zhu, Q.; Huang, M.; Guo, Y. Using Color and 3D Geometry Features to Segment Fruit Point Cloud and Improve Fruit Recognition Accuracy. Comput. Electron. Agric. 2020, 174, 105475.
47. Lin, G.; Tang, Y.; Zou, X.; Xiong, J.; Li, J. Guava Detection and Pose Estimation Using a Low-Cost RGB-D Sensor in the Field. Sensors 2019, 19, 428.
48. Gao, J.; Zhang, F.; Zhang, J.; Yuan, T.; Yin, J.; Guo, H.; Yang, C. Development and Evaluation of a Pneumatic Finger-like End-Effector for Cherry Tomato Harvesting Robot in Greenhouse. Comput. Electron. Agric. 2022, 197, 106879.
49. Liu, J.; Liu, Z. The Vision-Based Target Recognition, Localization, and Control for Harvesting Robots: A Review. Int. J. Precis. Eng. Manuf. 2024, 25, 409–428.
50. Xiao, X.; Wang, Y.; Jiang, Y. Review of Research Advances in Fruit and Vegetable Harvesting Robots. J. Electr. Eng. Technol. 2024, 19, 773–789.
51. Zhang, J.; Kang, N.; Qu, Q.; Zhou, L.; Zhang, H. Automatic Fruit Picking Technology: A Comprehensive Review of Research Advances. Artif. Intell. Rev. 2024, 57, 54.
52. Lamessa, K. Performance Evaluation of Banana Varieties, through Farmer’s Participatory Selection. Int. J. Fruit Sci. 2021, 21, 768–778.
53. Onyia, T.M.; Olarinoye, I.A.A.; Jimoh, S.A. Advancements and Challenges in 3D Scanning: A Comprehensive Review of Engineering Applications. Afr. J. Adv. Sci. Technol. Res. 2025, 18, 191–206.
54. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67.

Figure 1. Description of the positions and regions of the crown: (a) banana bunch; (b) seven crowns arranged from apex to base; (c) different regions of the crown.

Figure 2. Schematic workflow for modeling banana crown de-handing points using machine learning techniques.

Figure 3. Three-dimensional model scanning of the banana crown: (a) marked crown; (b) handheld scanner; (c) original point cloud of crown.

Figure 4. Extraction and measurement of crown morphological features: (a) inner arc and outer arc of the crown; (b) target arc of the crown.

Figure 5. Hyperparameter optimization for MLP-based models: (a) R_t; (b) D_ti.

Figure 6. Hyperparameter optimization for GBDT-based models: (a) R_t; (b) D_ti.

Figure 7. Hyperparameter optimization for XGBoost-based models: (a) R_t; (b) D_ti.

Figure 8. Hyperparameter optimization for RF-based models: (a) R_t; (b) D_ti.

Figure 9. Actual versus predicted R_t values for (a) MLP, (b) GBDT, (c) XGBoost, and (d) RF models. Dashed lines represent the y = x reference line.

Figure 10. Actual versus predicted D_ti values for (a) MLP, (b) GBDT, (c) XGBoost, and (d) RF models. Dashed lines represent the y = x reference line.

Figure 11. Interpretability analysis of RF-based R_t model: (a) feature importance ranking; (b) SHAP value distributions; (c) partial dependence profiles.

Figure 12. Interpretability analysis of RF-based D_ti model: (a) feature importance ranking; (b) SHAP value distributions; (c) partial dependence profiles.

Table 1. Main parameters of the handheld scanner.

Specifications	Parameters	Specifications	Parameters
Measuring rate	550,000 measurements/s	Resolution	0.50 mm
Scanning area	380 mm × 380 mm	Accuracy	0.10 mm
Positioning methods	Geometry and/or color and/or targets	Volumetric accuracy	0.30 mm/m
Light source	White light (LED)	Stand-off distance	400 mm
Texture color (resolution)	24 bits (50 to 150 DPI)	Depth of field	250 mm

Table 2. Hyperparameters of MLP, GBDT, XGBoost, and RF algorithms.

Parameters	MLP	GBDT	XGBoost	RF
learning_rate	0.01–0.10	0.01–0.10	0.01–0.10	/
hidden_layer_sizes	30–100	/	/	/
max_depth	/	1–10	1–10	/
n_estimators	/	/	/	1–20
max_features	/	/	/	1–7

Table 3. Statistical results of crown morphological features.

Crown Morphological Features	Maximum Value	Minimum Value	Mean Value	Standard Deviation
Inner arc radius (R_i, mm)	46.57	22.51	31.69	5.42
Inner arc center angle (α_i, °)	240.77	145.44	178.47	15.11
Inner arc length (L_i, mm)	161.79	66.02	98.94	20.56
Outer arc radius (R_o, mm)	83.07	30.06	47.93	10.88
Outer arc center angle (α_o, °)	242.60	117.00	177.03	21.74
Outer arc length (L_o, mm)	229.76	100.64	146.34	29.21
Distance between inner and outer arc (D_oi, mm)	61.13	17.65	32.96	9.60
Target radius (R_t, mm)	64.50	29.83	41.61	7.68
Target distance (D_ti, mm)	34.24	9.33	18.95	5.51

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
