Research on Walnut Yield Estimation Based on Interpretable Machine Learning and Stacked Integration Under Different Water–Fertilizer Coupling Regimes

Yerzati, Yerhazi; Xia, Qiuhao; Luo, Langqin; Chen, Jiaxing; Qi, Jiahui; Guo, Zhongzhong; Zhai, Changyuan; Zhang, Yunqi; Zhang, Rui

doi:10.3390/rs18101449


by Yerhazi Yerzati ^1,2,3,†, Qiuhao Xia ^1,2,3,†, Langqin Luo ^1,2,3, Jiaxing Chen ^1,2,3, Jiahui Qi ^1,2,3, Zhongzhong Guo ^1,3, Changyuan Zhai ⁴, Yunqi Zhang ⁵ and Rui Zhang ^2,3,*

¹ College of Horticulture and Forestry, Tarim University, Alar 843300, China
² State Local Joint Engineering Laboratory of High-Efficiency and High-Quality Cultivation and Deep-Processing Technology of Specialty Fruit Trees in South Xinjiang, Alar 843300, China
³ Corps South Xinjiang Characteristic Forest and Fruit Technology Innovation Center, Alar 843300, China
⁴ Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
⁵ Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Remote Sens. 2026, 18(10), 1449; https://doi.org/10.3390/rs18101449

Submission received: 16 March 2026 / Revised: 25 April 2026 / Accepted: 26 April 2026 / Published: 7 May 2026

(This article belongs to the Special Issue Applications of Unmanned Aerial Remote Sensing in Precision Agriculture)


Highlights

What are the main findings?

Red-edge texture features showed higher correlation coefficients and SHAP importance values than traditional vegetation indices for walnut yield prediction, suggesting they may be more sensitive to canopy structural heterogeneity under varying water and fertilizer regimes.
On this dataset, the proposed growth stage stacking ensemble (GSSE) model achieved an R² of 0.789, and the meta-model coefficients suggested that the oil conversion stage had the highest estimated contribution (60%) to the final prediction.

What are the implications of the main findings?

Identifies a growth stage for precise management, with the oil conversion period serving as a window for targeted water and fertilizer interventions to maximize resource efficiency.
Enhances model reliability for decision support, as the high accuracy and interpretability of the approach provide a transparent foundation for intelligent orchard management.

Abstract

To overcome the limitations of traditional yield estimation methods, which are often subjective, costly, and difficult to implement at scale, this study developed a high-precision, interpretable model for predicting walnut yield by integrating multi-source remote sensing technology with interpretable machine learning, with the aim of providing a theoretical foundation for precise water and fertilizer management and intelligent production in walnut orchards. By employing interpretable machine learning and a multi-stage integration strategy, the model not only achieves high-precision yield estimation but also elucidates the influence pathways of water–fertilizer coupling on yield formation at a mechanistic level. This advancement offers reliable technical support and a decision-making framework for the precise management of orchards. This study focused on the Xinjiang ‘Wen 185’ walnut, employing field experiments with varying water and fertilizer gradients. A UAV equipped with a multispectral sensor was used to capture canopy images, from which vegetation indices and texture features were extracted. This process resulted in a comprehensive dataset that integrated remotely sensed features with management practices. Various machine learning algorithms, including random forest, support vector machine, partial least squares regression, and ridge regression, were applied. An innovative stacked integration model for growth stages was proposed, and the SHAP framework was incorporated to analyze feature contributions and enhance model interpretability. In this study, texture features, particularly those derived from the red-edge band, showed higher predictive importance than traditional vegetation indices. This suggests that they may be more sensitive to canopy structural heterogeneity under the tested conditions. 
Among the individual models, random forest showed numerically higher R² and RPD values than the others on the present dataset, achieving a validation R² of 0.670 and an RPD of 1.836. The proposed growth stage stacking ensemble (GSSE) model further enhanced prediction accuracy, achieving a validation R² of 0.789, an RMSE of 0.494, and an RPD of 2.296. Additionally, the results suggest that texture features may capture the canopy heterogeneity underlying yield variation, and that integrating multi-stage spectral information was associated with higher estimation accuracy in this dataset, with the oil conversion stage contributing up to 60% to the final prediction.

Keywords:

walnut yield; interpretable machine learning; stacked integration; multispectral; water-fertilization coupling

1. Introduction

In the context of escalating challenges to global food security and sustainable agricultural development, the precise and intelligent management of fruit tree production has emerged as a central focus of modern agricultural advancement [1]. Walnut (Juglans regia L.) is a significant economic forest fruit and a vital source of nutrition worldwide. The stable development of the walnut industry is crucial for ensuring farmers’ income, fostering regional economic growth, and sustaining ecosystem service functions [2]. However, traditional methods for estimating walnut yield predominantly depend on manual surveys and empirical judgments, which are plagued by substantial subjective bias, delayed timeliness, high costs, and difficulties in achieving rapid assessments at the regional level [3]. Consequently, these limitations represent a bottleneck that hinders refined management and informed decision-making within the walnut industry.

In recent years, the extensive use of remote sensing technology, particularly through unmanned aerial vehicle (UAV) platforms, has established a technical pathway for efficient and non-destructive phenotypic monitoring and yield prediction of fruit trees [4]. Drones equipped with multispectral, hyperspectral, and other sensors can rapidly acquire canopy spectral information that is closely associated with the growth status, biomass, and stress responses of fruit trees [5]. The effective estimation of yields for various fruit tree species can be accomplished by deriving vegetation indices, such as NDVI and NDRE, and integrating them with machine learning algorithms [6,7]. For example, Lee K. et al. [8] collected time series aerial images from unmanned aerial vehicles (UAVs) in rice fields and compared these data with the nitrogen content in rice leaves and rice yield at the experimental sites. Similarly, Sun X. et al. [9] employed spectral and texture features from UAV multispectral images, integrating machine learning modeling methods to suggest the feasibility of estimating corn leaf area index (LAI). These studies have established a methodological foundation for monitoring orchard productivity through remote sensing technology. However, walnut yield estimation presents unique challenges compared with annual crops. Walnut trees are perennial with complex, multi-layered canopies that create significant shading and heterogeneous spectral signals. They exhibit pronounced biennial bearing, where heavy fruiting in one year suppresses yield in the next. Additionally, walnut fruits are irregularly distributed across the canopy and undergo a prolonged development period with distinct phenological stages (shell hardening, oil conversion, maturity), each with different physiological sensitivities. These characteristics limit the direct application of yield prediction models developed for uniform field crops. 
Notably, the yield of fruit trees results from the interplay among genotypes, environmental factors, and management practices. Among these, water and nutrient management are the two most important controllable agronomic factors, and their combined effect significantly influences the accumulation, distribution, and final yield formation of photosynthetic products in fruit trees [10]. In walnut cultivation, the coordinated regulation of water and fertilizer significantly influences processes such as photosynthetic efficiency and oil synthesis. An appropriate ratio can result in a substantial increase in yield [11]. The concept of ‘water-fertilizer coupling’ refers to the synergistic or antagonistic interactions between water and nutrient availability that jointly affect crop yield. Previous studies have largely relied on factorial field experiments and response surface methodologies to identify optimal irrigation and fertilization combinations [12]. However, most remote sensing-based yield prediction models treat water and fertilizer management as categorical variables rather than quantitatively integrating the coupling mechanism into the prediction framework. Consequently, while water–fertilizer coupling is widely recognized as an important biophysical process, its mechanistic role in UAV-based yield prediction models—particularly for tree crops—remains largely unexplored.

The deep integration of artificial intelligence technology has enabled machine learning algorithms, including Random Forest (RF), Support Vector Machine (SVM), and gradient boosting decision trees such as XGBoost, to exhibit superior performance in yield prediction tasks. This success is attributed to their robust nonlinear fitting capabilities and proficiency in processing high-dimensional data [13,14]. To address the challenge of model interpretability, post hoc explanation frameworks grounded in game theory, such as SHAP (SHapley Additive exPlanations), have been incorporated into agricultural modeling. These frameworks quantify the contribution of each feature variable to the prediction results, thereby enhancing the model’s transparency and credibility [15,16]. In recent years, a growing body of evidence has suggested that sole reliance on single-band spectral reflectance and traditional vegetation indices is increasingly inadequate for capturing the spatial heterogeneity of crop canopies and the subtle growth variations induced by differential water and fertilizer management. Texture features, particularly those derived from the red-edge band, have emerged as a potentially more sensitive and structurally informative alternative, as they may reflect within-canopy structural heterogeneity that spectral indices alone cannot resolve. Consequently, this study focused on texture features, which showed advantages over spectral indices on the present dataset. To address the dual limitations of existing research in quantifying the coupling mechanism between water and fertilizer and enhancing model interpretability, this study focuses on the primary walnut variety ‘Wen 185’ cultivated in Xinjiang. 
While recognizing that walnut yield is also influenced by other environmental factors (e.g., temperature, light), this study focuses on the coupled effects of water and fertilizer, two key controllable agronomic factors, and innovatively integrates interpretable machine learning with multi-source unmanned aerial vehicle remote sensing technology to conduct high-precision and interpretable research on walnut yield under water–fertilizer coupling conditions. This study used texture features, especially red-edge texture features, based on the hypothesis that they may capture canopy structural heterogeneity related to water–fertilizer responses more effectively than traditional spectral indices alone. By designing rigorous field experiments with varying water and fertilizer gradients, we constructed a comprehensive dataset that integrates multi-source remote sensing features, including spectral and textural data, alongside precise management measures, such as irrigation and fertilizer application volumes.

The study systematically compares and evaluates the predictive performance of various machine learning models and integration strategies. Furthermore, the SHAP interpretability analysis framework is employed to analyze the contribution pathways and internal mechanisms of water and fertilizer factors, as well as their interactions with remote sensing features in relation to yield formation.

This research aims to develop a model for predicting walnut yield under the specific water and fertilizer conditions of this dataset, and to explore the potential for enhancing model transparency and decision-support value using interpretability methods. This approach seeks to provide both theoretical and technical support for precise water and fertilizer management, efficient resource utilization, and intelligent production in walnut orchards.

2. Materials and Methods

2.1. Overview of the Study Area

The walnut trees used in this experiment were sourced from the 16th Company of the Third Regiment, Alar City, First Division of the Xinjiang Production and Construction Corps. The geographical location of the study area is shown in Figure 1a, and the detailed layout of the experimental plots is presented in Figure 1b. The selected variety was ‘Wen 185’, and the trees were 16 years old. The planting pattern was oriented north–south, with a spacing of 5 m × 6 m. The soil in the garden plot was sandy loam.

The chemical properties of the soil are as follows: alkali-hydrolyzable nitrogen is 19.32 mg·kg⁻¹, available phosphorus is 32.67 mg·kg⁻¹, available potassium is 77.21 mg·kg⁻¹, the pH value is 7.69, and electrical conductivity is 682.56 μS·cm⁻¹. The test materials include urea (containing N ≥ 46%), potassium sulfate (containing K₂SO₄ ≥ 52%), and monoammonium phosphate (containing N ≥ 13% and P ≥ 27%).

Control: Potassium dihydrogen phosphate (KH₂PO₄ ≥ 99%, P₂O₅ ≥ 52%, K₂O ≥ 34%), compound fertilizer (N-P₂O₅-K₂O ≥ 45%, S ≥ 16%), potassium magnesium sulfate fertilizer (K₂O ≥ 24%, S ≥ 16%, Mg ≥ 6%), and brown algae oligosaccharide urea (N ≥ 45%).

2.2. Experimental Design

During the walnut growth period in 2025, the experiment was conducted in representative orchard plots characterized by flat terrain, uniform soil, consistent tree vigor, and minimal marginal effects, employing a completely randomized design. The experimental plot measured 60 m × 25 m (1500 m²) and comprised 5 rows of 10 trees each, for a total of 50 trees. Buffer rows were set between adjacent plots to minimize the edge effect. Irrigation was divided into three stages: spring irrigation (5–25 March, flood irrigation), micro-sprinkler irrigation (1 May–1 September, water-saving irrigation), and winter irrigation (5–10 November, flood irrigation). All other management practices adhered to standard field protocols. The experiment examined two factors: irrigation volume and fertilizer application volume. Three growth stages were defined based on walnut fruit development: the shell hardening stage (S1, late May to mid-June, characterized by rapid endocarp hardening), the oil conversion stage (S2, mid-July to mid-August, characterized by kernel oil accumulation), and the maturity stage (S3, late August to mid-September, characterized by fruit ripening and harvest readiness). Based on the previous year’s experimental results and the predictions of a water–fertilizer coupling model built to optimize the walnut water and fertilizer scheme with respect to yield, quality, and water and fertilizer use efficiency [17], two irrigation levels were established: W1 (500 m³·667 m⁻²) and W2 (450 m³·667 m⁻²). The fertilization rates were set at F1 (110.28 kg·667 m⁻²) and F2 (120 kg·667 m⁻²). Additionally, the conventional management practices of local farmers, which included an irrigation volume of 600 m³·667 m⁻² and an empirical fertilization rate of 150 kg·667 m⁻², served as the control (CK). The fertilizer was applied together with the irrigation water. 
The detailed irrigation and fertilization protocols are presented in Table 1 and Table 2.
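The treatment levels described above can be laid out in a few lines of Python. This is a minimal sketch that assumes the two factors are fully crossed (W1F1, W1F2, W2F1, W2F2) alongside the control; that assumption, the treatment labels, and the helper name `per_hectare` are illustrative, and Table 1 and Table 2 remain the authoritative description of the protocol. The sketch also converts the per-667 m² rates to per-hectare rates.

```python
from itertools import product

MU_M2 = 667.0          # area unit used in the text (m^2)
HECTARE_M2 = 10_000.0  # one hectare in m^2

irrigation = {"W1": 500.0, "W2": 450.0}    # m^3 per 667 m^2, from the text
fertilizer = {"F1": 110.28, "F2": 120.0}   # kg per 667 m^2, from the text

# Assumption: the two factors are fully crossed; the labels are hypothetical
treatments = {
    f"{w}{f}": (irrigation[w], fertilizer[f])
    for w, f in product(irrigation, fertilizer)
}
treatments["CK"] = (600.0, 150.0)  # local farmers' conventional practice


def per_hectare(rate_per_mu):
    """Convert a rate expressed per 667 m^2 into a rate per hectare."""
    return rate_per_mu * HECTARE_M2 / MU_M2


per_ha = {name: (per_hectare(w), per_hectare(f)) for name, (w, f) in treatments.items()}
```

Under this assumption there are five treatments, which matches 10 sampled trees per treatment and 50 trees in the plot.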

2.3. Acquisition of UAV Multispectral Images

A DJI M300 RTK quadcopter UAV equipped with an MS600 Pro multispectral camera was used for image acquisition. The UAV is manufactured by Shenzhen Dajiang Innovation Technology Co., Ltd. (DJI) (Shenzhen, China). The MS600 Pro multispectral camera is manufactured by Chang Guang Yu Chen Information Technology and Equipment (Qingdao) Co., Ltd. (Qingdao, China). The UAV has a maximum takeoff weight of 9 kg and a single-battery endurance of up to 55 min. The MS600 Pro multispectral camera features six spectral channels: blue (450 nm), green (550 nm), red (660 nm), red edge 1 (720 nm), red edge 2 (750 nm), and near-infrared (840 nm), with corresponding bandwidths of 30 nm, 27 nm, 22 nm, 10 nm, 15 nm, and 30 nm, respectively. The flight altitude was set at 80 m, yielding a ground spatial resolution of 3.33 cm, with 80% overlap in both heading and side directions. Image acquisition was conducted under clear, stable weather conditions at midday (12:30–14:30) to ensure data quality.

2.4. UAV Image Preprocessing

The single-band images collected by the multispectral camera were registered, fused, and mosaicked using Pix4D Mapper software (version 4.5.6), resulting in a TIF-format multispectral image containing six bands. Subsequently, the Layer Stacking function in the ENVI 5.3 toolbox was used to combine the bands into a complete multispectral image. Because the mosaicked images extended beyond the boundaries of the experimental area, ENVI 5.3 was further used to crop them, emphasizing the target fruit trees and minimizing background interference. Individual tree crowns were manually delineated using the Region of Interest (ROI) tool in ENVI 5.3, based on the known planting coordinates. For each delineated tree crown, the average spectral reflectance and texture features were extracted at the individual tree level. These tree-level features were then linked to the corresponding measured yield data for each tree.

2.5. Test Indicators and Methods

2.5.1. Walnut Production

The walnut yield data was collected on 10 September 2025. In each treatment, 10 fruit trees were selected, and all fruits from the trees were harvested for counting and weighing. The average number of fruits per tree and the average weight of individual walnuts were calculated. The yield per tree was estimated by multiplying these two averages, and then the total yield per plot was calculated based on this value.

2.5.2. Extraction of Vegetation Indices

Multispectral imaging is widely used to monitor and assess the growth status of walnut trees. The imagery used here covers five spectral regions: red (R), green (G), blue (B), red edge (RE), and near-infrared (NIR), with the red edge sampled by two channels (720 nm and 750 nm). Vegetation indices calculated from these bands can reflect the growth condition of walnut trees. Typically, an index is a ratio or a linear or nonlinear combination of the spectral reflectance in two or more bands, which emphasizes vegetation information and is extensively used to derive various vegetation parameters [18,19].
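As an illustration of how such band combinations are computed, the sketch below evaluates two normalized-difference indices, NDVI and NDRE, from crown-mean reflectances. It is only an example: the full list of indices used in this study is not reproduced here, and the reflectance values and function names are hypothetical.

```python
def normalized_difference(band_a: float, band_b: float) -> float:
    """Return (band_a - band_b) / (band_a + band_b), or 0.0 if the sum is zero."""
    total = band_a + band_b
    if total == 0:
        return 0.0
    return (band_a - band_b) / total


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndre(nir: float, red_edge: float) -> float:
    """Normalized difference red-edge index from NIR and red-edge reflectance."""
    return normalized_difference(nir, red_edge)


# Hypothetical crown-mean reflectances (illustrative values, not measured data)
crown = {"red": 0.05, "red_edge": 0.20, "nir": 0.45}
crown_ndvi = ndvi(crown["nir"], crown["red"])
crown_ndre = ndre(crown["nir"], crown["red_edge"])
```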

2.5.3. Extraction of Texture Features

Existing studies have suggested that texture information can be utilized to invert physiological and biochemical parameters of fruit trees [20]. Given that texture features—particularly those derived from the red-edge band—offer superior capacity to capture canopy structural heterogeneity under varying water and fertilizer regimes, whereas traditional vegetation indices alone are insufficient for this purpose, this study calculated the texture features across all bands of each data source at the sampling points, as detailed in Table 3. The calculated texture features include mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation. The gray-level co-occurrence matrix (GLCM) method was employed for these calculations, using ENVI 5.3 (Exelis VIS, Boulder, CO, USA). In this analysis, the texture extraction window size was set to 3 × 3, and the grayscale quantization level was established at 64.
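To make the GLCM statistics concrete, the following pure-Python sketch builds a co-occurrence matrix for one 3 × 3 window quantized to 64 gray levels and derives five of the listed statistics from it. It is not ENVI's implementation: the single horizontal offset, the symmetric counting, the quantization rule, and the sample window are all assumptions made for illustration, and the mean, variance, and correlation statistics are omitted for brevity.

```python
import math


def quantize(window, levels, max_value):
    """Map raw pixel values in [0, max_value] to integer gray levels 0..levels-1."""
    return [[min(int(v / max_value * levels), levels - 1) for v in row] for row in window]


def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_stats(p):
    """Contrast, dissimilarity, homogeneity, entropy, and second moment of a GLCM."""
    stats = {"contrast": 0.0, "dissimilarity": 0.0, "homogeneity": 0.0,
             "entropy": 0.0, "second_moment": 0.0}
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if pij == 0.0:
                continue  # empty cells contribute nothing
            stats["contrast"] += pij * (i - j) ** 2
            stats["dissimilarity"] += pij * abs(i - j)
            stats["homogeneity"] += pij / (1 + (i - j) ** 2)
            stats["entropy"] -= pij * math.log(pij)
            stats["second_moment"] += pij ** 2
    return stats


# Hypothetical 3 x 3 reflectance window with values in [0, 1]
window = [[0.10, 0.10, 0.50],
          [0.10, 0.50, 0.50],
          [0.10, 0.10, 0.50]]
gray = quantize(window, levels=64, max_value=1.0)
stats = texture_stats(glcm(gray, levels=64))

# A perfectly uniform window for comparison
flat = texture_stats(glcm([[5, 5, 5]] * 3, levels=64))
```

A uniform window gives zero contrast and a homogeneity of one, whereas the mixed window gives high contrast, which is the sense in which these statistics respond to within-crown heterogeneity.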

2.5.4. Individual Machine Learning Algorithms

This study used four machine learning algorithms to develop walnut yield estimation models. The optimization strategies and parameter settings were as follows. Random forest (RF) was tuned with a two-stage grid search: in the first stage, coarse adjustments were made to n_estimators (100 to 1000, step size 100) and max_depth (5 to 50, step size 5); in the second stage, fine adjustments were made near the optimal parameters [21]. Support Vector Regression (SVR) used the RBF kernel, with a grid search over the penalty parameter C (10⁻⁵ to 10⁵) and the kernel parameter gamma (10⁻⁷ to 10⁻³) [22]. For Partial Least Squares Regression (PLS), the maximum number of components was determined automatically from the number of features, and the optimal number of components was selected through cross-validation [23]. For ridge regression, the regularization coefficient α (search range: 10⁻⁶ to 10⁶) and the solver were optimized using random search. All hyperparameters were tuned within a 5-fold cross-validation framework, with the goal of balancing fitting capability and generalization performance, and R², RMSE, and RPD were used consistently as evaluation metrics to ensure objective and comparable model comparisons.
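The search spaces described above can be enumerated as follows. The RF ranges and step sizes come from the text; for SVR the text gives only the bounds, so the one-value-per-decade spacing used here is an assumption, as are the variable names.

```python
from itertools import product

# Coarse first-stage RF grid, using the ranges and step sizes stated in the text
n_estimators_grid = list(range(100, 1001, 100))   # 100, 200, ..., 1000
max_depth_grid = list(range(5, 51, 5))            # 5, 10, ..., 50
rf_coarse_grid = [
    {"n_estimators": n, "max_depth": d}
    for n, d in product(n_estimators_grid, max_depth_grid)
]

# Log-spaced SVR grid; one value per decade is an assumption (only bounds are given)
c_grid = [10.0 ** e for e in range(-5, 6)]        # 1e-5 ... 1e5
gamma_grid = [10.0 ** e for e in range(-7, -2)]   # 1e-7 ... 1e-3
svr_grid = [{"C": c, "gamma": g} for c, g in product(c_grid, gamma_grid)]
```

With these settings the coarse RF stage evaluates 100 candidate settings, each under 5-fold cross-validation, which is one practical reason for refining only near the best coarse result in the second stage.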

2.5.5. Stacked Integration Learning

Stacked ensemble learning (SEL) is an effective ensemble learning strategy [24]. This study employs random forest (RF), support vector regression (SVR), partial least squares regression (PLS), and ridge regression (RR) as the base models, with RR designated as the meta-model of the SEL framework. To enhance generalization and mitigate overfitting caused by data leakage, five-fold cross-validation is used to produce an out-of-sample prediction matrix (OSPM) that serves as the input to the meta-model. The OSPM is generated as follows. First, the original training set is divided into five mutually exclusive subsets. Next, each subset is designated in turn as the validation set, while the remaining four subsets are used to train the base model, which then generates out-of-sample predictions on the validation set. These predictions are then concatenated to form the complete OSPM. During model construction, the four base models were trained for each growth period, and the corresponding OSPM was generated. Subsequently, the OSPM was used as the input feature set for the RR meta-model, which was trained to combine the base-model predictions into an estimate of walnut yield. In this way, stacked ensemble learning draws on the strengths of multiple base models through the meta-model, with the aim of improving prediction performance.
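The fold-by-fold construction of the OSPM can be sketched in plain Python. The two toy base models below (a mean predictor and a single-feature least-squares line) stand in for RF, SVR, PLS, and RR purely to keep the example self-contained; the function names and the data are hypothetical.

```python
import random


def kfold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


def fit_mean(x_train, y_train):
    """Toy base model 1: always predict the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean


def fit_line(x_train, y_train):
    """Toy base model 2: ordinary least squares on a single feature."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((x - mx) ** 2 for x in x_train)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_train, y_train))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def out_of_sample_matrix(x, y, fitters, n_folds=5, seed=0):
    """Return one row per sample and one column per base model, filled fold by fold."""
    ospm = [[None] * len(fitters) for _ in x]
    for held_out in kfold_indices(len(x), n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        for col, fit in enumerate(fitters):
            model = fit([x[i] for i in train], [y[i] for i in train])
            for i in held_out:
                # Prediction comes from a model that never saw sample i
                ospm[i][col] = model(x[i])
    return ospm


# Hypothetical single-feature data (illustrative only)
x = [float(i) for i in range(20)]
y = [2.0 * v + 1.0 for v in x]
ospm = out_of_sample_matrix(x, y, [fit_mean, fit_line])
```

Each row of `ospm` then becomes one training example for the meta-model, so the meta-model only ever sees predictions made on data the base models were not fitted to.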

2.5.6. Growth Stage Stacking Integration

This study introduces an innovative walnut yield estimation model, named GSSE, based on the GSS method proposed by Hassan et al. (2022) [25]. The model seeks to significantly improve the accuracy of walnut yield predictions by integrating the optimal machine learning algorithms for each growth stage with multi-stage spectral features. The specific calculation process comprises several steps. First, four algorithms—Random Forest (RF), Support Vector Regression (SVR), Partial Least Squares regression (PLS), and ridge regression (RR)—are employed to independently estimate walnut yield at three growth stages. Next, the optimal model for each stage is identified through five-fold cross-validation. Additionally, an out-of-sample prediction matrix (OSPM) is constructed to more effectively extract spectral information at each stage. Subsequently, the OSPM outputs from the three optimal models at different growth stages are combined in various ways to produce seven distinct feature combinations. By analyzing the temporal correlations among spectral features across stages, the model’s capacity to represent the dynamic process of yield formation is enhanced. Finally, these seven stacking results serve as input features, which are integrated into the framework utilizing RR as the secondary model to finalize the yield prediction. RR effectively mitigates the instability issues arising from the strong correlation and multicollinearity among the predictions of the base model by implementing L2 regularization constraints on the regression coefficients. This approach improves the model’s generalization performance. Furthermore, it successfully combines ensemble learning with spectral features from multiple growth periods, offering a robust method for attaining more precise estimates of walnut yield. The detailed process is illustrated in Figure 2.
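One reading of the seven feature combinations is that they are the non-empty subsets of the three growth stages. The sketch below enumerates them under that assumption and shows how the per-stage out-of-sample predictions could be assembled into meta-model inputs; the helper name `select_columns` and the prediction values are hypothetical.

```python
from itertools import combinations

stages = ["S1", "S2", "S3"]

# All non-empty subsets of the three stages, from single stages up to all three
stage_combinations = [
    combo
    for size in range(1, len(stages) + 1)
    for combo in combinations(stages, size)
]
labels = ["".join(combo) for combo in stage_combinations]


def select_columns(stage_predictions, combo):
    """Build meta-model input rows from the chosen stages' out-of-sample predictions."""
    return [list(row) for row in zip(*(stage_predictions[s] for s in combo))]


# Hypothetical out-of-sample predictions for two trees at each stage
stage_predictions = {"S1": [1.0, 2.0], "S2": [1.5, 2.5], "S3": [0.9, 2.1]}
s1s2_inputs = select_columns(stage_predictions, ("S1", "S2"))
```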

2.5.7. Validation and Analysis

To assess the model’s performance and stability, all 150 samples were randomly divided into a training set of 120 samples and a validation set of 30 samples (a 4:1 ratio). Three indicators were used for validation: the coefficient of determination (R²), root mean square error (RMSE), and residual predictive deviation (RPD) [26]. To address the issue of multiple comparisons in the correlation analysis (Section 3.1.1), the Benjamini–Hochberg false discovery rate (FDR) procedure was applied to control the expected proportion of false positives among the correlations declared significant. Specifically, raw p-values were adjusted to q-values, and only features with q < 0.05 were considered statistically significant. Raw p-values are reported for transparency, but all interpretations of statistical significance are based on FDR-corrected thresholds. A coefficient of determination closer to 1 and a smaller root mean square error indicate superior model performance. An RPD value between 1.4 and 2.0 suggests that the model quality is adequate for estimating the target variable, and a value exceeding 2 signifies excellent estimation capability [27]. The calculation formulas are presented in Equations (1)–(3).

R^{2} = \frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - \bar{y})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(1)

R M S E = \sqrt{\frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}{n}}

(2)

R P D = \frac{S D_{p}}{R M S E}

(3)

where y_i, {\hat{y}}_{i}, and \bar{y} are the measured, estimated, and mean values of walnut yield, respectively; SD_p is the standard deviation of the measured values of the validation set; and n is the sample size.
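The three metrics can be sketched in Python as below. Here R² is computed as the explained sum of squares divided by the total sum of squares about the measured mean; the 1 − SSE/SST form used by many libraries can give a different value on a validation set. Using the sample standard deviation (n − 1) for SD_p is an assumption, since the text does not state which form was used, and the data values are hypothetical.

```python
import math
import statistics


def r_squared(measured, estimated):
    """Explained sum of squares over total sum of squares, about the measured mean."""
    mean = sum(measured) / len(measured)
    explained = sum((e - mean) ** 2 for e in estimated)
    total = sum((m - mean) ** 2 for m in measured)
    return explained / total


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    n = len(measured)
    return math.sqrt(sum((e - m) ** 2 for m, e in zip(measured, estimated)) / n)


def rpd(measured, estimated):
    """Standard deviation of the measured values divided by the RMSE."""
    # Assumption: sample standard deviation (n - 1); the text does not specify
    return statistics.stdev(measured) / rmse(measured, estimated)


# Hypothetical validation values (illustrative only)
measured = [2.0, 3.0, 4.0, 5.0, 6.0]
estimated = [2.5, 2.5, 4.0, 5.5, 5.5]

r2_value = r_squared(measured, estimated)
rmse_value = rmse(measured, estimated)
rpd_value = rpd(measured, estimated)
```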

2.5.8. Analysis of SHAP Feature Contributions

The interpretability of machine learning models is often limited because these models function as black boxes, making it challenging for users to comprehend their internal decision-making processes. The foundational concept of SHAP is derived from the Shapley value in cooperative game theory, which aims to fairly allocate the benefits generated by multiple participants in collaboration. SHAP applies this principle to the interpretation of machine learning models, enabling the calculation of each feature’s contribution to the model’s predictive outcomes [28].
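The Shapley idea can be illustrated with an exact calculation on a toy model. The sketch below averages each feature's marginal contribution over all feature orderings, replacing absent features with a fixed baseline. This is a simplification for illustration: the SHAP library works with background data and approximations rather than this brute-force form, and the toy model, feature names, and values are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)            # start from the reference input
        previous = predict(current)
        for name in ordering:
            current[name] = instance[name]  # switch one feature to its actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(f):
    """Hypothetical model with an interaction term (not the study's model)."""
    return 2.0 * f["texture"] + 1.0 * f["ndvi"] + 0.5 * f["texture"] * f["ndvi"]


instance = {"texture": 2.0, "ndvi": 1.0}
baseline = {"texture": 0.0, "ndvi": 0.0}
phi = shapley_values(toy_model, instance, baseline)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes the values usable as a per-feature breakdown.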

2.6. Data-Processing Software

MATLAB R2023a and Python 3.12 were used to preprocess the spectral information, extract the characteristic bands, and establish the mathematical model between NIR spectra and chemical content. Chemical and spectral averages were computed using Excel 2016, and plotting was done using Origin 2021.

3. Results

3.1. Correlation of Vegetation Indices, Texture Characteristics with Yield and Selection of Model Input Variables

3.1.1. Correlation of Vegetation Indices, Texture Characteristics and Yield

This correlation analysis is strictly exploratory, as no specific a priori hypotheses about individual features were formulated. Its purpose is only to provide an initial descriptive overview, not to rank or select predictors for modeling. In the correlation analysis of 67 remote sensing and spectral features against fruit tree yield (Figure 3), texture features had higher correlation coefficients than traditional vegetation indices. Notably, the contrast (Redge-720-Contrast, r = 0.305, p < 0.001), variance (Redge-720-Variance, r = 0.305, p < 0.001), and dissimilarity (Redge-720-Dissimilarity, r = 0.300, p < 0.001) of the red edge 720 band showed somewhat higher values than the other tested features, and these correlations remained statistically significant after the Benjamini–Hochberg false discovery rate (FDR) correction (q < 0.05). However, all absolute correlation coefficients were below 0.4, indicating weak-to-moderate linear associations at best, so no single variable can explain a substantial portion of yield variability in a linear sense. Statistical significance, even after FDR correction, does not imply practical or predictive importance given such small effect sizes, and it is not a meaningful guide for feature selection or importance ranking. The practical value of any feature for yield prediction must be evaluated through predictive modeling (e.g., machine learning performance and SHAP importance), not through correlation coefficients. These texture features capture the spatial heterogeneity of canopy structure, a manifestation of differential water and fertilizer effects, that cannot be adequately represented by spectral indices alone. Furthermore, texture features from the green band and the near-infrared band, such as G-Contrast and Nir-Dissimilarity, showed positive correlations (p < 0.05). 
The homogeneity characteristics of multiple bands, such as Redge-720-Homogeneity (r = −0.219, p = 0.007), showed a negative correlation with yield. All statistically significant correlations had absolute values |r| < 0.4.
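The Benjamini–Hochberg adjustment applied to these correlations can be sketched as follows. The p-values in the example are hypothetical and chosen only to show that a raw p-value below 0.05 does not guarantee a q-value below 0.05.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical raw p-values for five features (illustrative only)
p_raw = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p_raw)
significant = [qi < 0.05 for qi in q]
```

In this example the third and fourth features have raw p-values below 0.05 but are no longer significant after adjustment.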

3.1.2. Selection of Model Variables

Consistent with the exploratory nature of the correlation analysis, no variable was selected or retained solely based on its correlation coefficient. Instead, all 67 features were initially considered, and feature importance was subsequently determined by the machine learning models themselves (e.g., via SHAP values) and cross-validated predictive performance (Figure 4). Among the 67 features, three showed the highest correlation coefficients with yield: Redge-720 Variance (r = 0.305, p < 0.001), Redge-720 Contrast (r = 0.305, p < 0.001), and Redge-720 Dissimilarity (r = 0.300, p < 0.001). All three are texture features derived from the red-edge band (720 nm). These weak linear associations do not justify any claim of predictive strength. Their selection as input variables for subsequent modeling was based on a comparative statistical screening within this specific dataset—acknowledging the exploratory context—not on any assumption that higher correlation implies better prediction. The limited individual explanatory power of these features necessitates their integration with other variables in a machine learning framework to ensure model stability and to capture potential nonlinear interactions that correlation analysis alone cannot reveal. Ultimately, the importance of each feature for yield estimation was established through SHAP analysis and model performance metrics (Section 3.2 and Section 3.3), not through correlation coefficients. No direct relationship between effect size (r value) and predictive importance is claimed or implied.

3.2. Results of Yield Estimation and Accuracy Analysis

3.2.1. Estimation Results and Accuracy Analysis Using a Single Machine Learning Model

The predictive performances of four models—Partial Least Squares Regression (PLSR), Support Vector Machine (SVM), Random Forest (RF), and Ridge Regression (RR)—were systematically evaluated (Table 4 and Figure 5). Under the present dataset, the Random Forest (RF) model showed higher numerical values for R² (0.688) and RPD (1.813) in the S1 task compared to the other models in that task. Ridge Regression (RR) achieved an R² of 0.798 and an RPD of 2.256 in the S2 sample set. The PLSR and SVM models had lower R² and RPD values than RF and RR in most scenarios.
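
For readers who wish to reproduce the accuracy metrics, the following is a minimal sketch of how R², RMSE, and RPD are commonly computed. It assumes that RPD is the sample standard deviation of the observed values divided by the RMSE; the exact convention used for Table 4 is not restated in this section, and the function name and data values are hypothetical.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # Assumption: RPD uses the sample standard deviation (n - 1 denominator).
    rpd = statistics.stdev(observed) / rmse
    return r2, rmse, rpd


# Hypothetical per-tree yields (illustrative values, not the study data).
obs = [2.0, 3.0, 4.0, 5.0, 6.0]
pred = [2.5, 2.5, 4.0, 5.5, 5.5]
r2, rmse, rpd = regression_metrics(obs, pred)
```

Because RPD depends on the spread of the observed yields, values from sample sets with different yield variability are not directly comparable.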

3.2.2. SEL Estimation Results and Accuracy Analysis

In the performance evaluation of the validation set, the SEL method achieved a validation R² of 0.681 for the S1 sample set, compared with 0.670 for RF. For the S2 and S3 sample sets, SEL’s R² values were 0.6163 and 0.5871, respectively, which were slightly higher than those of some individual models. Therefore, the results suggest that SEL and RF exhibited comparable predictive performance on this dataset, with no substantial differences observed. A one-way analysis of variance (ANOVA) was performed on the estimation results of the RF, SVM, PLSR, and RR models across three growth periods to assess the average differences between SEL and the individual models. Based on the ANOVA results (Figure 6D), SEL did not achieve statistically significantly higher prediction accuracy than the best individual models.

3.2.3. Estimation Results and Precision Analysis of GSSE

The GSSE model was employed to estimate walnut yield, with results illustrated in Figure 7. Under the present dataset, GSSE yield estimation achieved a higher R² and a lower RMSE than the corresponding single-growth-period estimation. For example, for S1S2 the R² was 0.7893, the RMSE 0.4940, and the RPD 2.2963. Compared with the optimal random forest (RF) model during the hard shell stage, the R² was numerically higher by 0.1193, the RMSE lower by 0.074, and the RPD higher by 0.4603. To assess the stability and accuracy of the GSSE model, the predictions of the optimal GSSE model were compared with those of the best individual machine learning model and the SEL model (Table 5). The maximum R² achieved was 0.7893, which was numerically 0.119 higher than that of the best single machine learning model and 0.108 higher than that of the best SEL model. The S1S2 combination had the highest R² (0.7893), with contributions of 20.4% from S1 and 79.6% from S2 as estimated by the meta-model coefficients. The S1S3 combination had an R² of 0.7171, an RMSE of 0.6945, and an RPD of 1.9818. The S1S2S3 combination (three growth periods) had an R² of 0.5061. To further assess the robustness of the GSSE model against potential data leakage due to repeated measurements per tree, an additional validation was performed by splitting the dataset at the tree level (i.e., all three growth-stage observations of a given tree were kept together either entirely in the training set or entirely in the validation set).
Under this more stringent split, the GSSE model still achieved a validation R² of 0.852, an RMSE of 0.560, and an RPD of 2.738, which remain numerically superior to those of the best single-stage model (RF) and the conventional SEL model and are consistent with the advantage of the proposed GSSE approach observed above.
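
The tree-level split described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout, the `tree_id` key, the 30% validation fraction, and the seed are all hypothetical.

```python
import random


def split_by_tree(records, validation_fraction=0.3, seed=0):
    """Split records so that every tree falls entirely in one subset."""
    tree_ids = sorted({rec["tree_id"] for rec in records})
    rng = random.Random(seed)
    rng.shuffle(tree_ids)
    # Choose whole trees, not individual observations, for validation.
    n_val = max(1, round(len(tree_ids) * validation_fraction))
    val_trees = set(tree_ids[:n_val])
    train = [rec for rec in records if rec["tree_id"] not in val_trees]
    val = [rec for rec in records if rec["tree_id"] in val_trees]
    return train, val


# Hypothetical records: 10 trees, each observed at stages S1, S2 and S3.
records = [
    {"tree_id": tree, "stage": stage}
    for tree in range(10)
    for stage in ("S1", "S2", "S3")
]
train_set, val_set = split_by_tree(records)
```

Since the split is made on tree identifiers, no tree contributes observations to both subsets, which is the leakage safeguard the paragraph describes.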

3.3. Analysis of the Impact of Information on GSSE by Reproductive Period

In the analysis of the coefficients of the secondary (meta) model RR in GSSE (Figure 8), across the two-stage combinations S3 usually contributed more to yield estimation than the early stage, whereas S2 contributed more than its partner both when combined with S1 and when combined with S3. The oil transformation stage (S2) therefore contributed more than the hard shell stage (S1) and the maturity stage (S3). S2 also retained the highest coefficients when all three growth stages were combined, suggesting a relatively larger contribution in the model under this dataset: on average, the oil transformation stage contributed the most (60%), followed by the hard shell stage (35%) and the maturity stage (5%).
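
This section does not spell out how the meta-model coefficients are converted into percentage contributions. One plausible reading, assumed here purely for illustration, is to normalize the absolute coefficients so that they sum to 100%. The function name and the coefficient values below are hypothetical and are not the fitted values.

```python
def stage_contributions(coefficients):
    """Convert meta-model coefficients into percentage contributions per stage."""
    # Assumption: contribution = |coefficient| / sum of |coefficients|.
    total = sum(abs(value) for value in coefficients.values())
    return {
        stage: 100.0 * abs(value) / total
        for stage, value in coefficients.items()
    }


# Hypothetical ridge coefficients for a two-stage combination.
shares = stage_contributions({"S1": 0.204, "S2": 0.796})
```

Under this assumption, coefficients in a ratio of roughly 1:4 would give shares close to the 20.4% and 79.6% reported for S1S2. Such shares are only comparable across stages if the stage-level predictions fed to the meta-model are on the same scale.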

3.4. Spatial Distribution Map of Fruit Yield

To apply the trained GSSE at the orchard scale, the aggregated spectral reflectance data of the two growth stages S1S2 were used as input features to predict the yield of each fruit tree in the orchard and to generate the final yield distribution map (Figure 9). The result suggests that the model can generalize to new data and can be applied to yield estimation at the individual-tree scale.

4. Discussion

4.1. Mechanisms for the Contribution of Remote Sensing Features to Yield Estimation

This study used correlation analysis and SHAP feature importance to explore the potential role of texture features (variance, contrast, and dissimilarity) in yield estimation, particularly within the red-edge band at 720 nm. Texture features showed higher correlation coefficients with yield than traditional vegetation indices, and they had higher SHAP importance values in the machine learning models. This suggests that texture features may be more sensitive than vegetation indices to the spatial heterogeneity of canopy structure induced by differential water and fertilizer management, and may reflect subtle variations in plant population structure and biomass distribution. This finding aligns with the conclusions of Marcone et al. [29], who highlighted a potential contribution of texture information in garlic yield monitoring, suggesting that structural features may contribute to yield estimation in complex backgrounds, although the correlation strengths observed in our study were weak to moderate. Mechanistically, water and fertilizer stress typically induce localized structural changes within the canopy (such as leaf angle alterations, wilting, and uneven biomass accumulation) rather than uniform spectral shifts across the entire canopy. Texture features, particularly those derived from the high-sensitivity red-edge band, may be suited to detecting these localized structural anomalies, whereas traditional vegetation indices aggregate spectral information across the canopy and consequently dilute such localized signals. This mechanistic consideration suggests that texture features may have advantages over spectral indices for yield estimation in tree crops with complex canopies.
This mechanistic advantage suggests that texture features could be considered a valuable analytical entry point in modern crop yield estimation research under similar conditions. The low correlation coefficients observed between traditional vegetation indices (e.g., NDVI, EVI, GNDVI) and walnut yield may be attributed to several factors. First, these indices primarily respond to green biomass and canopy cover, which tend to saturate at moderate to high leaf area index values—a condition commonly observed in mature walnut orchards. Second, such indices integrate spectral information across the entire canopy, thereby diluting subtle within-canopy structural variations induced by water and fertilizer coupling. Third, during yield formation stages (e.g., oil conversion), changes in canopy architecture—such as leaf angle and fruit distribution—may not be captured by simple band ratios, whereas texture features are more sensitive to such localized heterogeneity. These observations are consistent with previous studies reporting that vegetation indices alone may have limited predictive power for yield in tree crops with complex canopy structures. In the present dataset, texture-based models showed higher prediction accuracy than models using only vegetation indices, suggesting that texture features may better capture heterogeneous canopy responses under varying water and fertilizer regimes, as water and nutrient stress often manifest as localized structural changes rather than uniform spectral shifts. Notably, while previous studies have successfully applied vegetation indices such as NDVI and NDRE for yield prediction in cereal crops [30,31], our findings reveal that such indices lack sufficient sensitivity to capture the spatially heterogeneous canopy responses induced by differential water and fertilizer management in orchard systems. 
This distinction highlights the necessity of incorporating structural features when modeling yield in tree crops, where within-canopy variability is more pronounced than in uniform field crops. In agricultural production, the texture information acquired from drones may help identify areas with uneven canopy development, thereby facilitating variable fertilization and precise irrigation. This approach could enhance the relevance and effectiveness of water and fertilizer management under appropriate conditions. Furthermore, the texture features of the green light band exhibit predictive capabilities, suggesting that the visible light band may be useful in assessing the distribution of photosynthetically active radiation within the canopy. Despite these advantages, several limitations should be acknowledged. First, texture feature extraction is highly sensitive to image resolution, window size, and grayscale quantization levels, and the optimal parameter settings may vary across growth stages and orchard conditions. Second, the computational cost of generating texture features is substantially higher than that of vegetation indices, which may pose challenges for real-time or large-scale applications. Third, the performance of texture features may degrade under suboptimal image acquisition conditions, such as variable illumination or sensor noise. Future research should focus on developing adaptive parameter optimization strategies and exploring the integration of texture features with deep learning-based feature extraction methods to further enhance robustness and generalizability.
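
To make the dependence on quantization level and pixel offset concrete, the sketch below computes three texture statistics from a symmetric grey-level co-occurrence matrix (GLCM). It assumes that the texture features discussed here are of the GLCM type, which this section does not state explicitly; the function name, the offset, and the two toy images are hypothetical.

```python
def glcm_features(image, levels, dx=1, dy=0):
    """Compute contrast, dissimilarity and homogeneity from a symmetric GLCM."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                # Count each pair in both directions so the matrix is symmetric.
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(map(sum, counts))
    contrast = dissimilarity = homogeneity = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total  # normalized co-occurrence probability
            contrast += p * (i - j) ** 2
            dissimilarity += p * abs(i - j)
            homogeneity += p / (1 + (i - j) ** 2)
    return {
        "contrast": contrast,
        "dissimilarity": dissimilarity,
        "homogeneity": homogeneity,
    }


# Two toy 3 x 3 windows already quantized to 4 grey levels (0..3).
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
patchy = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]
f_uniform = glcm_features(uniform, levels=4)
f_patchy = glcm_features(patchy, levels=4)
```

The uniform window gives zero contrast and maximal homogeneity, while the patchy window gives high contrast and low homogeneity. This matches the opposite signs of the contrast and homogeneity correlations reported in Section 3.1.1. Changing `levels`, the window, or the offset changes these values, which is the parameter sensitivity noted above.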

4.2. Machine Learning Model Performance Comparison and Integration Strategy Advantages

Among the single models, Random Forest (RF) showed numerically higher R² and RPD values than the Partial Least Squares Regression (PLSR), Support Vector Machine (SVM), and Ridge Regression (RR) models in the validation set (Table 4). However, these observed differences were not subjected to formal statistical significance tests (e.g., paired t-tests or confidence intervals) due to the limited sample size and exploratory nature of the model comparison. The relatively higher performance of RF is consistent with its known robustness and resistance to overfitting when processing high-dimensional nonlinear remote sensing data [32]. In comparison, studies employing PLSR for yield prediction often rely on linear assumptions that may be insufficient for capturing the complex interactions between water–fertilizer coupling and canopy spectral responses [33], whereas the nonlinear nature of RF better accommodates such complexities. From the perspective of agricultural production, the RF model may serve as an effective tool for rapid field yield estimation, facilitating the prediction of yield trends at different growth stages and providing decision support for harvesting, storage, and sales strategies. The cross-stage generalization of the single models fluctuated, particularly at S3 (the maturity stage), when the performance of all models typically declined. This decline may be attributed to signal confusion arising from factors such as canopy spectral saturation and fruit harvesting at this stage. Similar phenomena have been documented in various remote sensing monitoring studies of fruit trees [34]. In practical applications, it is essential to select or develop yield estimation models that align with phenological stages to improve the accuracy of full-process monitoring.

Stacked ensemble learning (SEL) did not yield a statistically significant improvement in prediction accuracy compared with the best individual model (RF). As shown in Section 3.2.2, the R² differences between SEL and RF were minimal (≤0.011), and the one-way ANOVA reported there (Figure 6D) indicated no significant difference (p > 0.05). Therefore, SEL should not be considered more accurate than RF [30]. No statistical evidence supports any advantage of SEL over RF in terms of stability or any other metric; the two models performed comparably within the limits of this dataset. In the present dataset, the GSSE model in the S1S2 combination achieved an R² of 0.7893 and an RPD of 2.296, which were numerically higher than those observed for all individual models and SEL. This finding suggests that combining spectral information from multiple growth stages was associated with numerically higher yield estimation accuracy than using a single growth stage. Specifically, the joint modeling of the hard shell period and the oil transformation period may capture the physiological processes involved in yield formation. This aligns with the research conclusion that time series remote sensing features hold significant value in crop yield prediction [35]. Unlike previous ensemble approaches that simply combine predictions from multiple models without considering phenological dynamics, our GSSE framework explicitly accounts for the temporal sequence of growth stages, allowing the model to leverage stage-specific physiological sensitivities. This design represents a methodological advancement over conventional stacking strategies. This integration strategy offers a methodological framework for developing an agricultural condition monitoring system tailored to various growth stages, thereby facilitating the dynamic optimization and regulation of orchard production management.

It should be explicitly noted that, apart from the one-way ANOVA reported in Section 3.2.2, no formal statistical significance tests (e.g., paired t-tests) were conducted. All other model comparisons are descriptive, based solely on numerical values of R², RMSE, and RPD under the present dataset.

4.3. Interpretability Analysis of Water–Fertilizer Coupling Effects

Through SHAP analysis and the examination of secondary regression coefficients within the GSSE model, this study explores associations between water and fertilizer management practices and remote sensing responses within the current dataset. The coefficients of the secondary model in GSSE suggested that the oil transformation period (S2) had the highest contribution (estimated at 60%) among the three growth stages, followed by the hard shell stage (S1, 35%) and the maturity stage (S3, 5%). This observed pattern may reflect the sensitivity of walnut fruit development to water and fertilizer conditions during oil accumulation, consistent with previous studies [36] regarding the coupling regulation of oil synthesis by water and fertilizer. Compared with related studies that have focused primarily on the overall growing season without distinguishing stage-specific sensitivities, our approach provides a more nuanced understanding of when water and fertilizer interventions exert the greatest influence on yield formation. Consequently, based on the contributions observed in this dataset, farmers may be able to enhance production and efficiency through precise management during this period. Numerous studies have confirmed the synergistic effect of water and nutrients on the physiological metabolism and final yield of fruit trees, particularly during the period of fruit quality formation [37]. Furthermore, red-edge texture features ranked among the top indicators in SHAP analysis, suggesting that they may respond to changes in canopy microstructure related to nitrogen and water status. These features could be further explored for integrated water–fertilizer monitoring. Such findings may provide a theoretical foundation for developing remote sensing-based water and fertilizer diagnostic technologies and for promoting intelligent decision-making in the integration of water and fertilizer.

4.4. Prospects and Limitations of Model Application

The GSSE model developed in this study appears to have good spatial generalization capability within the tested orchard conditions. The orchard-scale yield distribution map illustrates the spatial heterogeneity of individual plant yields, thereby providing a foundation for informed decision-making regarding precise fertilization and zonal irrigation. This model could be integrated into an intelligent orchard management platform that links yield predictions with variable-rate operations, which may help minimize resource waste and enhance management efficiency.

In terms of model applicability, the GSSE framework is particularly well-suited for orchards with access to UAV-based multispectral imagery and where water and fertilizer management are important factors in yield variability. The model’s reliance on red-edge texture features makes it most effective in systems where canopy structural heterogeneity is pronounced, such as in mature walnut orchards with varying irrigation and fertilization regimes. However, the applicability of the model under different conditions warrants careful consideration. For orchards with uniform canopy structure or where yield is primarily limited by factors other than water and nutrients (e.g., pest pressure, genetic variation), the predictive advantage of texture features may be diminished. Furthermore, the current model was developed using data from a single variety (‘Wen 185’) under specific climatic and soil conditions in southern Xinjiang [38]; therefore, extrapolation to other varieties, regions, or management systems should be undertaken with caution and ideally preceded by local calibration.

Nonetheless, the model has several limitations. First, it depends on unmanned aerial vehicle (UAV) platforms and multispectral sensors, leading to elevated data acquisition costs. Consequently, its implementation in large-scale orchards continues to encounter economic and operational challenges. Second, the model has yet to incorporate on-site monitoring data, such as soil moisture and leaf nutrition. In the future, it could integrate Internet of Things (IoT) sensor data to enhance the model’s explanatory power regarding mechanisms. Multi-source data fusion is recognized as a crucial direction for improving the accuracy and mechanistic understanding of agricultural models [39]. Third, while the current dataset comprises 150 samples with rigorous experimental design, the sample size remains relatively limited for complex machine learning and deep learning applications. Future work should prioritize expanding the dataset across multiple growing seasons, geographical locations, and walnut varieties to enhance statistical power and model generalizability. Fourth, it is important to acknowledge the uncertainties associated with the metrics used to characterize water–fertilizer coupling. While our experimental design included precisely recorded irrigation and fertilization amounts, these represent planned rather than actual crop-available inputs, as spatial variability in soil moisture distribution, nutrient leaching, and root uptake efficiency introduce discrepancies between applied and absorbed resources. Additionally, the remote sensing features employed in this study—particularly texture features—serve as indirect proxies for canopy responses to water and nutrient availability rather than direct measurements of the coupling process itself. The extraction of texture features is also subject to uncertainties related to image acquisition conditions (e.g., illumination variation, sensor noise) and parameter selection (e.g., window size, grayscale quantization levels). 
These sources of uncertainty may influence the stability and interpretability of the derived water–fertilizer coupling metrics. Future research should aim to integrate ground-based sensor networks for real-time monitoring of soil moisture and nutrient dynamics, and to develop robust uncertainty quantification frameworks that can propagate these uncertainties through the modeling pipeline, thereby enhancing the reliability of yield predictions under varying water and fertilizer regimes. Fifth, although the machine learning algorithms employed in this study (RF, SVM, PLSR, RR) are well-established and provide strong interpretability through SHAP analysis, the incorporation of advanced deep learning methods represents a promising direction for further methodological advancement. Convolutional neural networks (CNNs) could be leveraged for automated extraction of hierarchical spectral and texture features directly from raw multispectral imagery, potentially capturing more complex spatial patterns than handcrafted texture features. Additionally, time-series deep learning architectures such as long short-term memory (LSTM) networks or Transformers could be employed to model the temporal dependencies across growth stages more effectively than the current stacking approach. Future research will explore the integration of these deep learning methods with the interpretable GSSE framework to further evaluate the benefit of multi-stage feature integration while maintaining mechanistic transparency. Sixth, the current model was developed and validated using data from a single growing season (2025), lacking cross-year validation. Inter-annual variability in weather conditions, initial tree nutrient status, and crop responses to water–fertilizer management may affect model transferability. The research team has arranged for subsequent graduate students to conduct an identical repeat experiment at the same site in 2026. The newly collected independent data will be used in future work to externally validate the proposed GSSE model and assess its inter-annual generalizability.
Finally, future research should aim to establish a more adaptable yield prediction system by integrating experimental data from various ecological zones, tree ages, and varieties, thereby facilitating the application of technological advancements in diverse production scenarios.

5. Conclusions

In this study, a high-precision and interpretable walnut yield estimation model was developed by integrating multi-source UAV remote sensing data with interpretable machine learning under varying water–fertilizer coupling conditions. Based on the present dataset and experimental conditions, the main findings are summarized as follows.

Red-edge texture features showed higher predictive importance than traditional vegetation indices in this dataset. Correlation analysis and SHAP evaluation revealed that texture features from the red-edge band (720 nm), particularly variance, contrast, and dissimilarity, showed higher correlation coefficients and SHAP importance values than traditional vegetation indices in predicting walnut yield, suggesting a potential ability to capture canopy structural heterogeneity induced by differential water and fertilizer management.

Among the four individual algorithms evaluated, random forest achieved the highest numerical values for validation R² (0.670) and RPD (1.836) in this dataset. The proposed growth stage stacking ensemble (GSSE) model using the optimal S1S2 combination achieved a validation R² of 0.789, an RMSE of 0.494, and an RPD of 2.296, which were numerically higher than those observed for single models and conventional stacked ensemble learning under the present dataset.

Under the conditions of this study, the oil conversion stage (S2) showed the highest contribution to yield estimation. It contributed on average 60% of the final yield prediction, followed by the hard shell stage (S1, 35%) and the maturity stage (S3, 5%), which may reflect a higher sensitivity of walnut yield to water–fertilizer management during oil accumulation.

The red-edge texture features consistently ranked as top contributors in SHAP analysis, suggesting a potential dual role as both structural indicators and indirect proxies of water-nitrogen stress.

Importantly, all performance comparisons are descriptive and based solely on numerical values observed in this dataset. No formal significance testing was conducted. The findings are exploratory and specific to the current dataset and experimental conditions. Generalization to other orchards, varieties, or management regimes requires further validation.

Author Contributions

Y.Y., Writing—Original draft, Validation, Investigation, Data curation; Q.X., Writing—Original draft, Methodology, Formal analysis, Conceptualization; L.L., Supervision, Resources; J.C., Visualization, Supervision, Investigation; J.Q., Software, Resources, Methodology; Z.G., Writing—review and editing, Validation, Investigation; C.Z., Writing—review and editing, Visualization, Supervision, Investigation; Y.Z., Writing—review and editing, Visualization, Supervision, Investigation, Funding acquisition; R.Z., Writing—review and editing, Visualization, Supervision, Investigation, Funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program (2022YFD1000102); Xinjiang Production and Construction Corps Southern Xinjiang Characteristic Forest and Fruit Technology Innovation Project; Tarim University President’s Fund Major Project Cultivation Project (TDZKZD202403); Autonomous Region Walnut Industry Technology System Project (XJLGCYJSTX01-09); Southern Xinjiang Key Industry Innovation (2022DB022).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors are grateful to the National and Local Joint Engineering Laboratory of High Efficiency and Superior-Quality Cultivation and Fruit Deep Processing Technology of Characteristic Fruit Trees in South Xinjiang at Tarim University for providing the experimental facilities and technical assistance. We also extend our thanks to all colleagues who assisted in the sample collection and data measurement.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Phillips, S. Precision agriculture: Supporting global food security. Better Crops Plant Food 2014.
Food and Agriculture Organization. State of food and agriculture. State Food Agric. 1996, 71, 110–113.
Mulla, D.J. Twenty five years of remote sensing in precision agriculture: Key advances and remaining knowledge gaps. Biosyst. Eng. 2013, 114, 358–371.
Yin, N.; Liu, R.; Zeng, B.; Liu, N. A review: UAV-based Remote Sensing. IOP Conf. Ser. Mater. Sci. Eng. 2019, 490, 062014.
Cui, D.; Li, M.; Zhu, Y.; Cao, W.; Zhang, X. Monitoring Crop Growth Status Based on Optical Sensor. In Computer and Computing Technologies in Agriculture, Volume II; Springer: Boston, MA, USA, 2008.
Weiss, M.; Jacob, F.; Duveiller, G. Remote sensing for agricultural applications: A meta-review. Remote Sens. Environ. 2020, 236, 19.
Ramesh, V.; Kumaresan, P. Advancements in Machine Learning and Deep Learning Techniques for Crop Yield Prediction: A Comprehensive Review. Nat. Environ. Pollut. Technol. 2024, 23, 2071–2086.
Lee, K.; Park, C.W.; Ahn, H.Y.; Hong, S.Y.; Jang, S.Y.; Na, S.; So, K.H. Estimation of Rice Leaf Nitrogen Content and Yield using UAV Image. Korean J. Soil Sci. Fertil. 2020, 53, 335–344.
Sun, X.; Yang, Z.; Su, P.; Wei, K.; Wang, Z.; Yang, C.; Wang, C.; Qin, M.; Xiao, L.; Yang, W.; et al. Non-destructive monitoring of maize LAI by fusing UAV spectral and textural features. Front. Plant Sci. 2023, 14, 1158837.
Condon, A.G.; Richards, R.A.; Rebetzke, G.J.; Farquhar, G. Improving Intrinsic Water-Use Efficiency and Crop Yield. Crop Sci. 2002, 42, 122–131.
Zhao, L.; Tang, Q.; Song, Z.; Yin, Y.; Wang, G.; Li, Y. Increasing the yield of drip-irrigated rice by improving photosynthetic performance and enhancing nitrogen metabolism through optimizing water and nitrogen management. Front. Plant Sci. 2023, 14, 1075625.
Hassan, F.A.S.; Ali, E.F.; Mahfouz, S.A. Comparison between different fertilization sources, irrigation frequency and their combinations on the growth and yield of coriander plant. Aust. J. Basic Appl. Sci. 2012, 6, 600–615.
Cutler, A.; Cutler, D.R.; Stevens, J.R. Random Forests. Mach. Learn. 2004, 45, 157–176.
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. arXiv 2016, arXiv:1603.02754.
Roscher, R.; Bohn, B.; Duarte, M.F.; Garcke, J. Explainable Machine Learning for Scientific Insights and Discoveries. Qual. Control. Trans. 2020, 8, 42200–42216.
Lundberg, S.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
Yeerhazi, Y.; Xia, Q.; Chen, T.; Ahmed, M.M.; Nuerdawulieti, W.; Zhang, R.; Yang, G.; Ding, Y.; Guo, Z. Optimization of Walnut Water-Fertilizer Coupling Schemes Based on Yield, Quality, and Water-Fertilizer Utilization Efficiency. J. Fruit Sci. 2026, 43, 424–438.
Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D. Monitoring Vegetation Systems in the Great Plains with Erts; NASA: Washington, DC, USA, 1974; Volume 351.
Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298.
Li, C.; Wang, Y.; Ma, C.; Ding, F.; Li, Y.; Chen, W.; Li, J.; Xiao, Z. Hyperspectral Estimation of Winter Wheat Leaf Area Index Based on Continuous Wavelet Transform and Fractional Order Differentiation. Sensors 2021, 21, 8497.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222.
Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
Jia, Z.; Sun, M.; Ou, C.; Sun, S.; Mao, C.; Hong, L.; Wang, J.; Li, M.; Jia, S.; Mao, P. Single Seed Identification in Three Medicago Species via Multispectral Imaging Combined with Stacking Ensemble Learning. Sensors 2022, 22, 7521. [Google Scholar] [CrossRef] [PubMed]
Hassan, M.A.; Fei, S.; Li, L.; Jin, Y.; Liu, P.; Rasheed, A.; Shawai, R.S.; Zhang, L.; Ma, A.; Xiao, Y.; et al. Stacking of canopy spectral reflectance from multiple growth stages improves grain yield prediction under full and limited irrigation in wheat. Remote Sens. 2022, 14, 4318. [Google Scholar] [CrossRef]
Li, F.; Miao, Y.; Feng, G.; Yuan, F.; Yue, S.; Gao, X.; Liu, Y.; Liu, B.; Ustin, S.L.; Chen, X. Improving estimation of summer maize nitrogen status with red edge-based spectral vegetation indices. Field Crops Res. 2014, 157, 111–123. [Google Scholar] [CrossRef]
Chang, C.W.; Laird, D.A.; Mausbach, M.J.; Hurburgh, C.R. Near-Infrared Reflectance Spectroscopy—Principal Components Regression Analyses of Soil Properties. Soil Sci. Soc. Am. J. 2001, 65, 480–490. [Google Scholar] [CrossRef]
Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
Marcone, A.; Impollonia, G.; Croci, M.; Blandinières, H.; Pellegrini, N.; Amaducci, S. Garlic yield monitoring using vegetation indices and texture features derived from UAV multispectral imagery. Smart Agric. Technol. 2024, 8, 100513.
Sun, L.; Gao, F.; Anderson, M.C.; Kustas, W.P.; Alsina, M.M.; Sanchez, L.; Sams, B.; McKee, L.; Dulaney, W.; White, W.A.; et al. Daily Mapping of 30 m LAI and NDVI for Grape Yield Prediction in California Vineyards. Remote Sens. 2017, 9, 317.
Jang, C.; Namoi, N.; Wolske, E.; Wasonga, D.; Behnke, G.; Bowman, N.D.; Lee, D.K. Integrating plant morphological traits with remote-sensed multispectral imageries for accurate corn grain yield prediction. PLoS ONE 2024, 19, e0297027.
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
Siegmann, B.; Jarmer, T.; Beyer, F.; Ehlers, M. The Potential of Pan-Sharpened EnMAP Data for the Assessment of Wheat LAI. Remote Sens. 2015, 7, 12737–12762.
Maimaitiyiming, M.; Ghulam, A.; Bozzolo, A.; Wilkins, J.L.; Kwasniewski, M.T. Early Detection of Plant Physiological Responses to Different Levels of Water Stress Using Reflectance Spectroscopy. Remote Sens. 2017, 9, 745.
Wolpert, D.H. Stacked Generalization. Neural Netw. 1992, 5, 241–259.
Zhou, G.; Chen, F.; Sun, S.; Lyu, W.; Piao, H.; Hao, J.; Zhang, S.; Chen, H. The effect of water fertilizer coupling on the photosynthetic characteristics, yield, and quality of walnuts. Xinjiang Agric. Sci. 2024, 61, 1151–1159.
Dordas, C. Role of nutrients in controlling plant diseases in sustainable agriculture. A review. Agron. Sustain. Dev. 2008, 28, 33–46.
Jin, X.; Liu, S.; Baret, F.; Hemerlé, M.; Comar, A. Estimates of plant density of wheat crops at emergence from very low altitude UAV imagery. Remote Sens. Environ. 2017, 198, 105–114.
Shi, X.; Han, W.; Zhao, T.; Tang, J. Decision Support System for Variable Rate Irrigation Based on UAV Multispectral Remote Sensing. Sensors 2019, 19, 2880.

Figure 1. Overview of the test site. (a) Location of the Xinjiang Uygur Autonomous Region in China, with the Alar City study site indicated; (b) Expanded view of the experimental site showing the layout of water and fertilizer treatments (W1F1, W1F2, W2F1, W2F2, CK) and UAV flight coverage.

Figure 2. Technology roadmap.

Figure 3. Correlation analysis of vegetation indices, texture features, and walnut yield. The left vertical axis lists the feature names (vegetation indices and texture features from different bands). The color bar represents the Pearson correlation coefficient (r): red indicates positive correlation and blue indicates negative correlation.

Figure 4. Feature importance at different growth stages: S1 (sclerotial stage), S2 (grease transformation stage), and S3 (maturity stage). Features are ranked from top to bottom by mean absolute SHAP value.

Figure 5. Performance of machine learning models. (A) Distribution of R² and RMSE across all models for the training set; (B) distribution of R² and RMSE across all models for the validation set.

Figure 6. Comparison of the estimation results of SEL with the base models (RF, SVM, PLSR, RR). (A–C) Validation-set distributions of R², RMSE, and RPD, respectively, at the sclerotial, grease transformation, and maturity stages; (D) average R² for each of the three growth stages and its variability. Different lowercase letters (a, b) indicate statistically significant differences between stages.

Figure 7. Stacked yield estimation results for different growth stages. (A) R² of the seven stage combinations; (B) RMSE of the seven combinations; (C) RPD of the seven combinations; (D) R², root mean square error (RMSE), and relative prediction deviation (RPD) of the optimal combination S1S2 on the validation set; (E) contribution of the optimal combination S1S2; (F) comparison of single-stage, two-stage, and three-stage combinations.

Figure 8. Walnut yield estimated by stacking different growth stages.

Figure 9. Spatial distribution of walnut production.

Table 1. Experimental design of walnut irrigation.

Irrigation Sequence	Irrigation Date	Irrigation Cycle	Irrigation Quota W1 (m³·667 m⁻²)	Irrigation Quota W2 (m³·667 m⁻²)
Spring irrigation	3.5–3.25		100	100
1	5.1–5.7	6	37.5	31.25
2	5.17–5.23	6	37.5	31.25
3	6.3–6.9	6	37.5	31.25
4	6.19–6.25	6	37.5	31.25
5	7.5–7.11	6	37.5	31.25
6	7.21–7.27	6	37.5	31.25
7	8.6–8.15	6	37.5	31.25
8	8.25–9.1	6	37.5	31.25
Winter irrigation	11.5–11.10		100	100
Total			500	450
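
The totals in Table 1 follow from simple arithmetic: spring irrigation, eight growing-season events, and winter irrigation. The sketch below reproduces them and converts the per-667 m² quota to a per-hectare figure. The variable and function names are illustrative assumptions, not part of the study's software.

```python
# Irrigation quotas from Table 1, in m³ per 667 m².
SPRING = 100.0
WINTER = 100.0
EVENTS = 8  # growing-season irrigation events 1 to 8
PER_EVENT = {"W1": 37.5, "W2": 31.25}

M2_PER_HA = 10_000
PLOT_M2 = 667  # area unit used in the table


def seasonal_total(per_event: float) -> float:
    """Spring + eight in-season events + winter irrigation."""
    return SPRING + EVENTS * per_event + WINTER


def per_hectare(quota_per_667m2: float) -> float:
    """Rescale a quota given per 667 m² to m³ per hectare."""
    return quota_per_667m2 * M2_PER_HA / PLOT_M2


totals = {name: seasonal_total(q) for name, q in PER_EVENT.items()}
for name, total in totals.items():
    print(f"{name}: {total:.1f} m³/667 m², {per_hectare(total):.0f} m³/ha")
```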

Table 2. Experimental design of walnut fertilization (kg·667 m⁻²).

Fertilization Time	F1 Urea	F1 Monoammonium	F1 Potassium Sulfate	F2 Urea	F2 Monoammonium	F2 Potassium Sulfate	F3 Urea	F3 Monoammonium	F3 Potassium Sulfate
5.1–5.7	6.7	4.7	0.67	8	5.7	0.8	6.7	4.7	0.67
5.17–5.23	6.7	4.7	0.67	8	5.7	0.8	6.7	4.7	0.67
6.3–6.9	6.7	4.7	0.67	8	5.7	0.8	6.7	4.7	0.67
6.19–6.25	6.7	4.7	0.67	8	5.7	0.8	6.7	4.7	0.67
7.5–7.11	5	7.5	3	5	7.5	3	5	7.5	3
7.21–7.27	5	7.5	3	5	7.5	3	5	7.5	3
8.6–8.15	5	7.5	3	5	7.5	3	5	7.5	3
8.25–9.1	5	7.5	3	5	7.5	3	5	7.5	3

Note: Drip fertilizer amounts were determined from the target yield with reference to “Xinjiang Walnut Cultivation and Management”.

Table 3. Vegetation indices and texture features associated with yield.

	Full Name	Abbreviation	Calculation Formula
Vegetation indices	Normalized difference vegetation index	NDVI	(NIR − R)/(NIR + R)
	Difference vegetation index	DVI	NIR − R
	Ratio vegetation index	RVI	NIR/R
	Green normalized difference vegetation index	GNDVI	(NIR − G)/(NIR + G)
	Blue normalized difference vegetation index	BNDVI	(NIR − B)/(NIR + B)
	Normalized difference red edge index	NDRE	(NIR − RE)/(NIR + RE)
	Chlorophyll index-red edge	CIrededge	NIR/RE − 1
	Chlorophyll index-green	CIgreen	NIR/G − 1
	Excess green index	ExG	2G − R − B
	Normalized green–red difference index	NGRDI	(G − R)/(G + R)
	Visible-band difference vegetation index	VDVI	(2G − R − B)/(2G + R + B)
	Enhanced vegetation index	EVI	2.5(NIR − R)/(NIR + 6R − 7.5B + 1)
	Soil-adjusted vegetation index	SAVI	(1 + L)(NIR − R)/(NIR + R + L)
Texture features	Mean	MEAN	$\mathrm{mean} = \sum_{i,j=0}^{N-1} i\,P_{i,j}$
	Homogeneity	HOM	$\mathrm{hom} = \sum_{i,j=0}^{N-1} \frac{P_{i,j}}{1 + (i - j)^{2}}$
	Entropy	ENT	$\mathrm{ent} = \sum_{i,j=0}^{N-1} P_{i,j}\,(-\ln P_{i,j})$
	Dissimilarity	DIS	$\mathrm{dis} = \sum_{i,j=0}^{N-1} P_{i,j}\,|i - j|$
	Second moment	SEC	$\mathrm{sm} = \sum_{i,j=0}^{N-1} P_{i,j}^{2}$
	Correlation	COR	$\mathrm{corr} = \sum_{i,j=0}^{N-1} P_{i,j} \left[\frac{(i - \mathrm{mean})(j - \mathrm{mean})}{\sqrt{\mathrm{var}_{i} \cdot \mathrm{var}_{j}}}\right]$
	Variance	VAR	$\mathrm{var} = \sum_{i,j=0}^{N-1} P_{i,j}\,(i - \mathrm{mean})^{2}$
	Contrast	CON	$\mathrm{con} = \sum_{i,j=0}^{N-1} P_{i,j}\,(i - j)^{2}$

Note: B, G, R, RE, and NIR represent the spectral reflectances of the blue, green, red, red edge, and near-infrared bands, respectively; L is the soil adjustment coefficient, which is set to 0.5.
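
As a minimal sketch of how the vegetation indices in Table 3 can be evaluated, the function below takes the five band reflectances for one pixel or one plot mean and returns all thirteen indices. The function name and the sample reflectance values are hypothetical. SAVI is written here in its standard form, with NIR + R + L in the denominator and L = 0.5.

```python
def vegetation_indices(b, g, r, re, nir, L=0.5):
    """Vegetation indices from band reflectances (blue, green, red, red edge, NIR)."""
    return {
        "NDVI": (nir - r) / (nir + r),
        "DVI": nir - r,
        "RVI": nir / r,
        "GNDVI": (nir - g) / (nir + g),
        "BNDVI": (nir - b) / (nir + b),
        "NDRE": (nir - re) / (nir + re),
        "CIrededge": nir / re - 1,
        "CIgreen": nir / g - 1,
        "ExG": 2 * g - r - b,
        "NGRDI": (g - r) / (g + r),
        "VDVI": (2 * g - r - b) / (2 * g + r + b),
        "EVI": 2.5 * (nir - r) / (nir + 6 * r - 7.5 * b + 1),
        # Standard SAVI form; L is the soil adjustment coefficient.
        "SAVI": (1 + L) * (nir - r) / (nir + r + L),
    }


# Hypothetical reflectances for a healthy canopy pixel (not measured data).
sample = vegetation_indices(b=0.04, g=0.08, r=0.05, re=0.20, nir=0.45)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

Because every formula is elementwise arithmetic, the same function should also work on array inputs of equal shape, provided the denominators are nonzero.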

In the texture formulas, $P_{i,j} = \frac{V_{i,j}}{\sum_{i,j=0}^{N-1} V_{i,j}}$, where $V_{i,j}$ is the brightness value of the pixel in the i-th row and j-th column, and N is the size of the moving window used to calculate the texture measures.
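
The texture measures can be illustrated with a small pure-Python sketch. It assumes the usual gray-level co-occurrence matrix (GLCM) reading, in which V_{i,j} counts how often gray levels i and j occur as neighbors inside the window, and it uses the standard GLCM definitions of the eight measures. The function names, the single horizontal offset, and the symmetric counting are assumptions made for illustration, not settings reported by the study.

```python
import math
from collections import Counter


def glcm_probabilities(window, dx=1, dy=0):
    """Symmetric, normalized co-occurrence probabilities P[(i, j)] for one offset."""
    counts = Counter()
    rows, cols = len(window), len(window[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                i, j = window[y][x], window[y2][x2]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count both directions so P is symmetric
    total = sum(counts.values())
    return {pair: v / total for pair, v in counts.items()}


def glcm_features(p):
    """Eight texture measures computed from GLCM probabilities."""
    mean = sum(i * pij for (i, j), pij in p.items())
    var = sum(pij * (i - mean) ** 2 for (i, j), pij in p.items())
    cov = sum(pij * (i - mean) * (j - mean) for (i, j), pij in p.items())
    return {
        "MEAN": mean,
        "VAR": var,
        "HOM": sum(pij / (1 + (i - j) ** 2) for (i, j), pij in p.items()),
        "CON": sum(pij * (i - j) ** 2 for (i, j), pij in p.items()),
        "DIS": sum(pij * abs(i - j) for (i, j), pij in p.items()),
        "ENT": sum(-pij * math.log(pij) for pij in p.values()),
        "SEC": sum(pij ** 2 for pij in p.values()),
        # For a symmetric GLCM, var_i equals var_j, so sqrt(var_i * var_j) = var.
        "COR": cov / var if var > 0 else float("nan"),
    }


# Toy 3 x 3 window with two gray levels (hypothetical values).
window = [[0, 0, 1],
          [0, 1, 1],
          [1, 1, 1]]
probs = glcm_probabilities(window)
features = glcm_features(probs)
```

In practice the window is usually quantized to a small number of gray levels first, and the measures are averaged over several offsets; both choices are left out here to keep the sketch short.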

Table 4. Yield prediction performance of four machine learning models at different growth stages.

Stage	Model	Training R²	Training RMSE	Training RPD	Validation R²	Validation RMSE	Validation RPD
S1	PLSR	0.554	0.782	1.517	0.286	1.020	1.248
	SVM	0.197	1.052	1.130	0.193	1.021	1.174
	RF	0.688	0.679	1.813	0.670	0.568	1.836
	RR	0.300	1.011	1.211	0.157	0.964	1.148
S2	PLSR	0.696	0.638	1.837	0.561	0.833	1.592
	SVM	0.507	0.782	1.442	0.438	1.024	1.406
	RF	0.683	0.665	1.798	0.607	0.705	1.681
	RR	0.798	0.535	2.256	0.525	0.774	1.529
S3	PLSR	0.074	1.155	1.053	0.105	1.117	1.003
	SVM	0.453	0.770	1.369	0.238	1.354	1.207
	RF	0.676	0.702	1.780	0.583	0.578	1.633
	RR	0.192	1.004	1.127	0.254	1.559	1.941

Note: S1 (sclerotial stage), S2 (grease transformation stage), S3 (maturity stage). The definitions are based on standard walnut phenology: S1 corresponds to endocarp hardening (late May–mid-June), S2 corresponds to kernel oil accumulation (mid-July–mid-August), and S3 corresponds to fruit ripening (late August–mid-September).
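
For readers who want to compute the three accuracy metrics in Table 4 on their own data, the sketch below shows one common set of definitions. It assumes that R² is the coefficient of determination (1 − SS_res/SS_tot) and that RPD is the sample standard deviation of the observed values divided by the RMSE; the exact formulas used in the study may differ. The function name and the example values are hypothetical.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """R², RMSE, and RPD for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": rmse,
        # Assumed definition: sample SD of observations over RMSE.
        "RPD": statistics.stdev(observed) / rmse,
    }


# Hypothetical yields, for illustration only.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under these definitions a higher RPD means the prediction error is small relative to the natural spread of the observations.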

Table 5. Comparison of the best GSSE model with the best SEL model and the best single machine learning model.

Model	R²	RMSE	RPD
GSSE	0.789	0.494	2.296
SEL	0.681	0.283	1.177
RF	0.670	0.568	1.836

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
