Identifying Nonlinear Thresholds and Interaction Dominance of Meteorological Drivers on Rice Yield: A SHAP-Based Approach

Lin, Chenshuang; Yan, Zhitao; Miao, Shujie

doi:10.3390/atmos17060599

by Chenshuang Lin ^1,2, Zhitao Yan ^2 and Shujie Miao ^1,*

^1 School of Ecology and Applied Meteorology, Nanjing University of Information Science and Technology, Nanjing 210044, China
^2 Jiangbei District Meteorological Bureau of Ningbo, Ningbo 315000, China
^* Author to whom correspondence should be addressed.

Atmosphere 2026, 17(6), 599; https://doi.org/10.3390/atmos17060599

Submission received: 17 April 2026 / Revised: 1 June 2026 / Accepted: 8 June 2026 / Published: 11 June 2026

(This article belongs to the Special Issue Recent Advances in Agrometeorological Techniques and Their Applications)

Abstract

Quantifying the nonlinear response of crop systems to meteorological driving factors remains a core challenge in agrometeorology. Although Explainable Artificial Intelligence (XAI) offers new approaches, existing SHAP-based threshold identification methods are largely confined to shifts in effect direction. Furthermore, a unified quantitative grading scale for interaction effects among factors is lacking. To explore the meteorological factor thresholds and interaction effect intensities affecting rice yield, rice unit yield and meteorological data from nine districts and counties in Ningbo City from 1995 to 2024 were utilized. Rice yield prediction models were constructed based on LASSO and six machine learning algorithms. Recursive Feature Elimination (RFE) based on the SHAP algorithm was conducted to screen out 11 core meteorological factors. Building upon this, two innovative methodological indicators were proposed. First, the Derivative Extrema Threshold (DET) was introduced as a supplement to the Zero-Crossing Threshold (ZCT). By locating the extremum points of the first derivative of the smoothed SHAP dependence plot curves, the critical positions where the effect intensity undergoes a qualitative change without a directional reversal were identified. Second, the Interaction Dominance Ratio (IDR) was proposed. This metric normalizes the interaction variability within a total effect framework and establishes a three-tier grading standard for strong, moderate, and weak interactions. It was observed that optimal performance was achieved by the LightGBM model after feature optimization (R² = 0.833). Direction reversal points with extremely narrow confidence intervals, such as an August cumulative precipitation of 210.6 mm and a June average temperature of 24.5 °C, were identified by the ZCT. Intensity mutation characteristics, such as the “weakening of the yield reduction effect” at a May cumulative precipitation of 64.9 mm, were further revealed by the DET. 
An Interaction Dominance Triangular Network, composed of the August–September average temperature, the June minimum temperature, and the August cumulative precipitation, was accurately characterized by the IDR analysis. This overcomes the constraints of traditional single-factor early warning systems. The “ZCT-DET-IDR” framework constructed in this study facilitates a methodological advancement from directional discrimination and intensity early warning to multi-factor synergistic analysis. This framework provides a quantifiable novel perspective for the refined early warning of regional agrometeorological disasters.

Keywords:

Explainable Artificial Intelligence (XAI); SHAP; Derivative Extrema Threshold (DET); Interaction Dominance Ratio (IDR); rice yield prediction; meteorological factors

1. Introduction

Rice is one of the most important staple food crops in China. Its stable production is directly related to national food security. The accurate prediction of rice yield and the elucidation of the influence mechanisms of meteorological factors on yield are of great significance for guiding agricultural production and formulating disaster prevention and mitigation measures. The methodologies for predicting agricultural product yields have evolved from statistical analysis to mechanistic simulation, and ultimately to machine learning [1,2,3,4]. Traditional regression or correlation analysis models are simple to construct. However, they are often confined to specific spatiotemporal conditions. These models are inadequate for handling the complex nonlinear problems inherent in crop growth and development. Crop growth, development, and yield formation are essentially nonlinear natural processes. Therefore, nonlinear prediction models utilizing modern methods, such as machine learning and deep learning, significantly outperform traditional models in terms of accuracy. These advanced models can effectively characterize the nonlinear and uncertain features prevalent in agricultural systems.

Despite the significant advancements in the prediction accuracy of machine learning models, their “black-box” nature restricts the understanding of their decision-making processes, hindering their adoption in yield prediction [5,6]. Explainable Artificial Intelligence (XAI) addresses this issue by providing transparent and comprehensible explanations for model decisions while maintaining predictive performance [7]. Among various XAI methods, SHAP (SHapley Additive exPlanations) stands out due to its solid theoretical foundation in game theory and comprehensive interpretability, and has become one of the most widely utilized interpretability tools in agricultural yield prediction. SHAP quantifies the marginal contribution of each feature to the prediction result by decomposing the model prediction into the sum of individual feature contributions.

It is important to note that SHAP is not the only available interpretability method. Partial Dependence Plots (PDP) [8] provide a straightforward visualization of the marginal effect of a feature on the predicted outcome by averaging predictions over the distribution of complementary features. However, PDP can obscure heterogeneous effects when feature interactions are present, as the averaging process may combine subgroups with opposing response patterns into a flat average. Accumulated Local Effects (ALEs) plots [9] address this limitation by computing conditional rather than marginal effects, making them more robust to feature correlations and more computationally efficient. Permutation importance [10] offers a model-agnostic measure of feature relevance by assessing the degradation in model performance when a feature’s values are randomly shuffled. While computationally simple, it provides only a global importance ranking without revealing the direction or shape of feature effects. More recently, causal inference frameworks [11] have been proposed to move beyond associational interpretations of machine learning models toward identifying causal mechanisms. These methods represent complementary approaches to SHAP, each with distinct trade-offs between computational cost, interpretive depth, and robustness to feature correlations. SHAP was selected for the present study because it uniquely provides both local and global interpretability, supports interaction value decomposition, and offers a theoretically grounded allocation of feature contributions—properties that are essential for the threshold identification and interaction dominance analysis developed herein.

In the context of maize yield prediction in Kenya, researchers integrated XGBoost with SHAP. The results demonstrated that SHAP dependence plots could reveal the negative effects of high temperatures on maize yield. They also identified the law of diminishing returns once fertilizer inputs exceeded an optimal threshold [12]. Regarding wheat yield prediction, an interpretable yield estimation model was developed using LightGBM and SHAP. This approach revealed the contribution mechanisms of remote sensing indices to yield formation [13]. In domestic research, scholars have similarly applied SHAP for feature screening and model interpretation in wheat yield prediction. Identifying nonlinear thresholds from SHAP dependence plots has emerged as a cutting-edge direction in the current agrometeorological field [14]. For instance, based on a coupling framework of the APSIM crop growth model and Random Forest-SHAP, researchers accurately identified a key threshold of 75.5 mm for the intense precipitation index (R95p) during the wheat growth period. When intense precipitation was below this threshold, increased precipitation facilitated yield gains. Once the threshold was exceeded, the beneficial effect rapidly reversed into a yield reduction risk [14]. In another study utilizing an XGBoost-SHAP framework during extreme precipitation years, a distinct nonlinear threshold relationship between soil sand content and maize yield was observed. Excessively low sand content (approximately 12.85%) exacerbated waterlogging damage, leading to yield reduction. Conversely, when the sand content ranged from 22% to 30%, the impact shifted from inhibitory to promotive [15]. Collectively, these studies demonstrate that SHAP is not merely a visualization tool for feature importance. It can also serve as an effective instrument for quantifying the nonlinear threshold effects of meteorological factors.

The Zero-Crossing Threshold (ZCT) based on SHAP dependence plots has been extensively applied to identify critical transition points between yield increase and decrease. For example, in a study on landscape ecological risk in the Qilian Mountains, researchers employed spline regression and constraint line methods to identify threshold inflection points for altitude (4200 m), downward shortwave radiation (2502 W/m²), and a dual-threshold response for grazing intensity (3.35 and 14.36 SU/ha) [16]. However, the ZCT can only capture critical points where the direction of the effect undergoes a fundamental shift. It fails to identify situations where the effect direction remains unchanged but the effect intensity varies significantly (e.g., a yield-increase effect transitioning from rapid growth to saturation, or a yield-reduction effect progressively accelerating). These “qualitative change points of effect intensity” hold equally important implications for agricultural early warning, indicating either the saturation point of a suitable range or the acceleration point of disaster losses. Furthermore, analyses of SHAP dependence plots have largely remained at the qualitative description level, and few studies have quantified curve slope changes to identify critical positions of effect intensity transformation. In ecological threshold research, first derivatives or inflection point detection have been widely utilized to locate such positions [17], yet this method has not been systematically introduced into SHAP analyses for agricultural yield prediction.

In view of these deficiencies, a Derivative Extrema Threshold (DET) detection method is introduced alongside the ZCT analysis. Unlike existing threshold identification approaches in ecological modeling that typically rely on piecewise regression or change-point detection algorithms [18], the DET is based on the extreme points of the first derivative of the smoothed SHAP dependence plot curves. By capturing the positions where the curve slope reaches a local extremum, that is, where the SHAP value changes most rapidly, the DET achieves the quantitative localization of qualitative change points in effect intensity without requiring a directional reversal. This approach bridges the gap in the existing threshold identification system.

Concurrently, to address the bottleneck of missing normalized quantitative benchmarks for factor interaction effects, the Interaction Dominance Ratio (IDR) is proposed. Unlike traditional interaction metrics such as the H-statistic [19] or variance-based sensitivity indices [20] that quantify interaction strength in absolute terms, the IDR eliminates interference from feature dimensions and scales by constructing a dimensionless ratio of the interaction variation span to the dispersion of the total effect. Furthermore, based on the internal distribution characteristics of the data, a three-tier grading standard comprising strong, moderate, and weak levels is established. Through the construction of the aforementioned “ZCT-DET-IDR” coupling method, this study aims to systematically elucidate the nonlinear influences and synergistic mechanisms of meteorological factors on rice yield in the Ningbo region, ultimately providing a quantifiable decision-making basis for regional rice yield prediction and the refined, composite early warning of agrometeorological disasters.

2. Materials and Methods

2.1. Data Sources

The study region is Ningbo City (28°51′–30°33′ N, 120°55′–122°16′ E), located on the southeastern coast of China, covering a total land area of approximately 9365 km² (Figure 1). Ningbo is situated in the northern part of Zhejiang Province and lies within the subtropical monsoon climate zone, characterized by hot, humid summers and mild winters, with an annual mean temperature of approximately 16–17 °C and annual precipitation of approximately 1300–1500 mm. Single-season rice is the predominant rice cropping system in this region, with the growing season generally extending from March to October.

Rice yield data were obtained from the Ningbo Statistical Yearbook (1995–2024). These data encompass the unit area yields of rice across nine districts and counties in Ningbo City: Jiangbei, Zhenhai, Beilun, Yinzhou, Fenghua, Yuyao, Cixi, Ninghai, and Xiangshan. Meteorological data were provided by the Ningbo Meteorological Bureau. A total of nine national quality-controlled meteorological stations, one representative station per district or county, were utilized in this study. The records comprise daily observations from March to October for the period 1995–2024 (30 years), subsequently aggregated into monthly averages or cumulative values for model input. The variables include average temperature, maximum temperature, minimum temperature, relative humidity, precipitation, and maximum wind speed. It should be noted that these meteorological stations are not installed directly within rice fields. However, their site selection strictly adheres to the standard construction criteria for national meteorological stations in China, which ensures that each station is situated in a location capable of accurately representing the regional climate characteristics of its administrative area. This design principle minimizes microenvironmental biases and ensures the spatial representativeness of the meteorological observations for the corresponding rice production areas.

2.2. Data Processing

All data preprocessing and visualization were implemented using Python (version 3.13.9). In rice yield prediction studies, the actual crop yield is generally decomposed into three components: Trend Yield, Meteorological Yield, and random error. The Hodrick–Prescott (HP) filter was applied to separate the Trend Yield from the unit rice yield, thereby deriving the Meteorological Yield [21,22].

Temporal Feature Expansion was applied to the monthly average data of six variables: average temperature, maximum temperature, minimum temperature, relative humidity, precipitation, and maximum wind speed. Taking the average temperature in Ningbo (March to October over 30 years across nine districts) as an example, the months were combined sequentially into durations of 8, 7, down to 1 month. For instance, TAVG_4-6 denotes the average temperature from April to June, whereas R2020_5-9 represents the cumulative precipitation from May to September. Through this procedure, a total of 216 expanded features were generated.
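The expansion described above can be sketched as follows. This is a minimal illustration, not the study's code: the label TMAX, the toy monthly values, and the choice to average every variable except precipitation are assumptions made here.

```python
MONTHS = list(range(3, 11))  # March .. October
# TAVG, TMIN, UAVG, WSMAX and R2020 follow the paper's labels; TMAX is a hypothetical label.
VARIABLES = ["TAVG", "TMAX", "TMIN", "UAVG", "R2020", "WSMAX"]


def expand_features(monthly, var, cumulative=False):
    """Aggregate one variable over every run of consecutive months.

    monthly    -- dict mapping month number -> monthly value
    cumulative -- sum the window (precipitation) instead of averaging it
    """
    features = {}
    for length in range(len(MONTHS), 0, -1):          # 8, 7, ..., 1 months
        for start in range(len(MONTHS) - length + 1):
            window = MONTHS[start:start + length]
            values = [monthly[m] for m in window]
            agg = sum(values) if cumulative else sum(values) / len(values)
            label = str(window[0]) if length == 1 else f"{window[0]}-{window[-1]}"
            features[f"{var}_{label}"] = agg
    return features


# Toy monthly mean temperatures for one district-year (made-up numbers).
tavg = {3: 10.0, 4: 16.0, 5: 21.0, 6: 25.0, 7: 29.0, 8: 28.0, 9: 24.0, 10: 19.0}
tavg_features = expand_features(tavg, "TAVG")
n_windows = len(tavg_features)            # windows per variable
n_total = n_windows * len(VARIABLES)      # expanded features across six variables
```

Each variable yields 8 + 7 + ... + 1 = 36 windows, so six variables give the 216 expanded features reported in the text.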

To reduce multicollinearity among factors and enhance model prediction accuracy, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm was employed for preliminary feature screening. Fifteen expanded meteorological features were retained: average temperature in March (TAVG_3), average temperature from March to May (TAVG_3-5), average temperature in June (TAVG_6), average temperature from June to July (TAVG_6-7), average temperature from August to September (TAVG_8-9), average temperature in October (TAVG_10), average minimum temperature in June (TMIN_6), maximum wind speed in September (WSMAX_9), average relative humidity in September (UAVG_9), cumulative precipitation from March to June (R2020_3-6), cumulative precipitation from April to May (R2020_4-5), cumulative precipitation from April to July (R2020_4-7), cumulative precipitation in May (R2020_5), cumulative precipitation in August (R2020_8), and cumulative precipitation in October (R2020_10).

Regarding spatial dependence, the nine districts in Ningbo are geographically adjacent, which implies potential spatial correlation in agricultural production. This was addressed implicitly through the data decomposition strategy. The Hodrick–Prescott (HP) filter (smoothing parameter λ = 100, which is the standard convention for annual time-series data) was applied to each district’s yield time series individually to extract the meteorological yield. By modeling this detrended component—specifically the inter-annual deviation from each district’s own trend—the analysis effectively controlled for district-specific fixed effects, such as baseline soil fertility and persistent management practices. Consequently, the study focused on the variable component of yield driven primarily by annual weather fluctuations. Given the limited sample size and the dominance of local meteorological drivers, treating each district-year as an independent observation was deemed appropriate for capturing the local crop-weather relationships without introducing the complexity of explicit spatial interaction terms, which could risk overfitting in this sample size regime. To strictly prevent data leakage during preprocessing, all feature scaling (standardization) and LASSO feature selection were fitted exclusively on the training set (1995–2018) and subsequently applied to the test set.
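The per-district detrending described above can be illustrated with a small NumPy implementation of the HP filter. This is a sketch under assumptions: the function name `hp_filter`, the dense-matrix solution, and the synthetic yield series are chosen here for illustration and are not taken from the study's code.

```python
import numpy as np


def hp_filter(y, lamb=100.0):
    """Return (trend, cycle) for a 1-D series using the Hodrick-Prescott filter.

    The trend minimises sum((y - trend)**2) + lamb * sum(second differences of trend ** 2),
    which has the closed-form solution trend = (I + lamb * D'D)^-1 y.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.zeros((n - 2, n))                 # second-difference operator
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lamb * D.T @ D, y)
    return trend, y - trend


# Synthetic 30-year yield series for one district (made-up numbers, kg/ha).
rng = np.random.default_rng(0)
t = np.arange(30)
yield_series = 6000.0 + 50.0 * t + rng.normal(0.0, 150.0, size=30)
trend_yield, meteorological_yield = hp_filter(yield_series, lamb=100.0)

# A perfectly linear series has zero second differences, so it is all trend.
_, cycle_linear = hp_filter(6000.0 + 50.0 * t)
```

In this sketch the filter would be applied to each district's series separately, so the residual (`meteorological_yield`) is a deviation from that district's own trend.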

2.3. Model Construction

Multiple Linear Regression (MLR) and five common Machine Learning algorithms were selected to construct meteorological factor-based rice yield prediction models for the Ningbo region. To ensure optimal performance and prevent overfitting, hyperparameters for all machine learning models were tuned using a grid search combined with a 5-fold Time Series Split cross-validation strictly applied to the training set (1995–2018), maximizing the cross-validation R².

The parameter configurations and their tuning search spaces were specified as follows. For Support Vector Regression (SVR), the Radial Basis Function (RBF) was specified as the kernel function, with the regularization parameter C searched over {0.1, 1, 10} and ϵ over {0.01, 0.1, 0.5}, yielding final values of C = 1 and ϵ = 0.1. For Bagged Trees, the ensemble size was searched over {100, 200, 300} (final: 200), the base learner was a decision tree, and the minimum number of samples per leaf node was set to 5. For Random Forest (RF), the number of trees was searched over {100, 200, 300} (final: 200), the minimum samples per leaf node over {5, 10, 15} (final: 5), and the number of features considered per split was the square root of the total features. For the Back Propagation Neural Network (BPNN), a single hidden layer structure was adopted with the number of neurons searched over {6, 9, 12} (final: 9). To mitigate overfitting, a dropout layer (rate = 0.25) was added after the hidden layer, and L2 regularization (coefficient = 5 × 10⁻⁵) was incorporated into the loss function. The training utilized the Adam optimizer with an initial learning rate of 0.001 and a maximum of 1000 iterations, alongside an early stopping mechanism. For the LightGBM model, hyperparameter tuning was conducted using a Bayesian optimization framework. The optimization was performed on the training set, maximizing the mean R² of the 5-fold Time Series Split CV. The final selected hyperparameters were: learning_rate = 0.05, n_estimators = 100, num_leaves = 15, subsample = 0.8, and min_child_samples = 10.

The 15 meteorological expanded features and the rice Meteorological Yield were normalized. These were then utilized as the feature variables and the dependent variable, respectively. The samples were chronologically partitioned. A total of 216 samples from 1995 to 2018 constituted the training set, whereas 54 samples from 2019 to 2024 served as the testing set. The prediction performance of the six models was subsequently evaluated. To prevent data leakage and ensure a rigorous evaluation of generalization performance, a time-series nested validation strategy was implemented. During the Recursive Feature Elimination (RFE) phase, feature subsets were evaluated exclusively on the training set using a 5-fold Time Series Split cross-validation (CV). Unlike standard k-fold CV, this method splits the data chronologically, ensuring that validation folds always occur after training folds, thereby preventing future information leakage. The optimal number of features was determined by the mean R² across the 5-fold CV.
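The chronological partition and the expanding-window validation folds can be sketched in plain Python. The fold construction below mirrors the usual time-series split logic but groups rows by year, which is an assumption made here about how the pooled district-year samples were ordered; the function name `expanding_year_folds` is hypothetical.

```python
def expanding_year_folds(years, n_splits=5):
    """Build (train_years, validation_years) pairs in which validation always follows training."""
    ordered = sorted(set(years))
    size = len(ordered) // (n_splits + 1)          # years per validation block
    first_val = len(ordered) - n_splits * size     # index of the first validation year
    folds = []
    for start in range(first_val, len(ordered), size):
        folds.append((ordered[:start], ordered[start:start + size]))
    return folds


N_DISTRICTS = 9
train_years = list(range(1995, 2019))   # 1995-2018
test_years = list(range(2019, 2025))    # 2019-2024
n_train = len(train_years) * N_DISTRICTS
n_test = len(test_years) * N_DISTRICTS

folds = expanding_year_folds(train_years, n_splits=5)
for fold_train, fold_val in folds:
    # No validation year may precede or overlap the training years of its fold.
    assert max(fold_train) < min(fold_val)
```

With 24 training years and five splits, each validation block in this sketch spans four years, and the test years 2019 to 2024 are never touched during feature elimination or tuning.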

2.4. Nonlinear Threshold Identification Based on SHAP

SHAP is a model interpretation method derived from the Shapley value in game theory, originally proposed by Lundberg and Lee [23]. It fairly allocates the model’s prediction value to each input feature, ensuring that the contribution of each feature adheres to axioms such as additivity, consistency, and local accuracy.

Based on this definition, SHAP can be utilized in agricultural meteorology to calculate the thresholds of meteorological factors affecting rice yield. To eliminate the interference of data noise on threshold identification, a local smoothing procedure was initially applied to the scatter points in the dependence plots. Specifically, the feature values were sorted in ascending order. A moving average method was then utilized to compute a smoothed curve. The window width was set to 5% of the sample size (with a minimum of 5), and the mean of the SHAP values within the window was assigned as the smoothed value for that point. This smoothed curve represented the overall variation trend of the SHAP values.
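A minimal sketch of this smoothing step is given below. The text does not state whether the window is centred or trailing, so a centred window truncated at the edges is assumed here; the function name `smooth_dependence` and the toy inputs are likewise illustrative.

```python
import numpy as np


def smooth_dependence(x, phi, frac=0.05, min_window=5):
    """Sort by feature value and apply a moving average to the SHAP values.

    x   -- feature values, one per sample
    phi -- SHAP values of that feature, one per sample
    """
    x = np.asarray(x, dtype=float)
    phi = np.asarray(phi, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ps = x[order], phi[order]
    window = max(min_window, int(round(frac * len(xs))))   # 5% of n, at least 5
    half = window // 2                                     # centred window spans 2*half + 1 points
    smooth = np.empty_like(ps)
    for i in range(len(ps)):
        lo, hi = max(0, i - half), min(len(ps), i + half + 1)
        smooth[i] = ps[lo:hi].mean()
    return xs, smooth


# Toy check: unsorted input, linear SHAP response.
x_demo = np.arange(20.0)[::-1]
phi_demo = 2.0 * x_demo
xs_demo, smooth_demo = smooth_dependence(x_demo, phi_demo)
```

For the 216 training samples, 5% of the sample size gives a window of about 11 points under this rounding rule.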

For each dependence plot, the smoothed curve was examined to identify positions where the product of the smoothed SHAP values at two adjacent points was less than or equal to zero (i.e., opposite signs or a zero crossing). If such a position existed, linear interpolation was applied to calculate the zero-crossing coordinate. Here, x_i and x_i₊₁ denote two adjacent feature values whose smoothed SHAP values have opposite signs, while ϕ_smooth(x_i) and ϕ_smooth(x_i₊₁) represent their corresponding smoothed SHAP values. This threshold directly indicated the critical point at which the predicted unit rice yield shifted from a yield-increase dominance (ϕ > 0) to a yield-decrease dominance (ϕ < 0), or vice versa:

x_{zero} = x_{i} + \frac{| ϕ_{smooth} (x_{i}) |}{| ϕ_{smooth} (x_{i}) | + | ϕ_{smooth} (x_{i + 1}) |} \cdot (x_{i + 1} - x_{i})

(1)
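Equation (1) can be applied to a smoothed curve as in the following sketch. The handling of exact zeros (reported once, at the right-hand point of a pair) is a choice made here to avoid duplicates and a 0/0 division; a zero at the very first point is ignored. The function name and the toy numbers are illustrative.

```python
def zero_crossings(xs, smooth):
    """Return interpolated feature values at which the smoothed SHAP curve changes sign."""
    crossings = []
    for i in range(len(xs) - 1):
        a, b = smooth[i], smooth[i + 1]
        # Strict sign change, or the curve lands exactly on zero at the right-hand point.
        if a * b < 0 or (b == 0 and a != 0):
            weight = abs(a) / (abs(a) + abs(b))            # Equation (1)
            crossings.append(xs[i] + weight * (xs[i + 1] - xs[i]))
    return crossings


# Toy curve: positive SHAP values below the threshold, negative above it.
demo = zero_crossings([0.0, 1.0, 2.0, 3.0], [2.0, 1.0, -1.0, -3.0])
# Two adjacent points at 200 and 220 with smoothed values 0.75 and -0.25.
demo_pair = zero_crossings([200.0, 220.0], [0.75, -0.25])
```

The interpolation weight places the crossing closer to whichever adjacent point has the smaller absolute SHAP value.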

In the field of ecological threshold research, first derivatives or inflection point detection have been widely utilized to locate critical positions of morphological changes in curves [24]. However, this approach has not been systematically introduced into SHAP analyses for agricultural yield prediction. For a smoothed SHAP curve, the first derivative reflects the rate of change in the SHAP value relative to the feature value. Therefore, building upon the Zero-Crossing Threshold (ZCT) analysis, a novel threshold identification method termed the Derivative Extrema Threshold (DET) was introduced. The DET was defined as the point where the first derivative attained an extremum (either a maximum or a minimum). This point corresponded to the position of the steepest slope of the curve, indicating the region where the SHAP value was most sensitive to variations in the feature value.

Mathematically, a local extremum (maximum or minimum) of the first derivative of the smoothed SHAP dependence curve corresponds to an inflection point of the original curve, so a DET satisfies:

\frac{d^{2} ϕ_{smooth}}{d x^{2}} = 0 \quad \text{and} \quad \frac{d^{3} ϕ_{smooth}}{d x^{3}} \neq 0

(2)

where ϕ_smooth(x) is the smoothed SHAP function. It is crucial to clarify that DET does not identify the point where the slope changes fastest (which would be an extremum of the second derivative, i.e., maximum curvature). Instead, DET pinpoints the location where the effect intensity (the SHAP value itself) changes at its maximum rate. This threshold is applicable to scenarios where the direction of the effect remains unchanged, but the intensity undergoes a qualitative shift (e.g., a yield-increase effect transitioning from rapid acceleration to saturation).

Cubic spline interpolation was applied to fit the smoothed curve (smoothing parameter s = 0), and the first derivative of the spline function was then computed. Dense sampling was performed using 500 equally spaced points within the range of the feature values. The DET points were identified as the local extrema of this first-derivative sequence, indicating where the effect on the predicted unit rice yield changed most rapidly, regardless of whether it crossed from yield-increase dominance (ϕ > 0) to yield-decrease dominance (ϕ < 0).
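The DET search can be sketched with SciPy as follows. `scipy.interpolate.CubicSpline` is used here as a stand-in for an interpolating spline with s = 0; the study does not name its spline routine, so this choice, the function name `det_candidates`, and the synthetic tanh-shaped curve are assumptions. `CubicSpline` needs strictly increasing x, so tied feature values would have to be merged first.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def det_candidates(xs, smooth, n_grid=500, top_k=2):
    """Return (feature value, slope) pairs at local extrema of the first derivative."""
    spline = CubicSpline(xs, smooth)                 # passes through every smoothed point
    grid = np.linspace(xs[0], xs[-1], n_grid)        # dense, equally spaced sampling
    d1 = spline(grid, 1)                             # first derivative on the grid
    inner = np.arange(1, n_grid - 1)
    is_max = (d1[inner] > d1[inner - 1]) & (d1[inner] > d1[inner + 1])
    is_min = (d1[inner] < d1[inner - 1]) & (d1[inner] < d1[inner + 1])
    idx = inner[is_max | is_min]
    idx = idx[np.argsort(-np.abs(d1[idx]))][:top_k]  # rank by absolute derivative
    return [(float(grid[i]), float(d1[i])) for i in idx]


# Synthetic saturating response whose steepest rise is at x = 4.3.
xs_demo = np.linspace(0.0, 10.0, 41)
smooth_demo = np.tanh(xs_demo - 4.3)
candidates = det_candidates(xs_demo, smooth_demo)
```

On this toy curve the SHAP value never needs to change sign for a candidate to be found, which is the case the DET is meant to cover.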

Following the acquisition of DET candidate points, a specific constraint was required. Although DET points situated near the ZCT possessed clear agrometeorological interpretability regarding boundary effects, the algorithm merely captured the top two extreme points ranked by absolute derivative values. This characteristic frequently caused the selected points to reside in the extreme tails of the data distribution, where observations were exceptionally sparse (e.g., regions of maximum precipitation). Consequently, these points degraded into extrapolation artifacts generated by the spline interpolation, losing their actual physical significance. To address this issue, a spatial constraint criterion based on the boxplot principle was introduced. A boundary of 1.5 times the interquartile range (IQR) was established. Specifically, the interval [Q1 − 1.5 × IQR, Q3 + 1.5 × IQR] for each feature value in the training set was defined as the valid boundary. Any DET candidate points falling outside this range were discarded. This procedure effectively ensured that the retained DET points were located within normal intervals characterized by sufficient data density. It thereby avoided pseudo-threshold interference caused by model extrapolation into sparse regions.
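The boxplot-style constraint can be sketched as below. The percentile interpolation rule of `numpy.percentile` and the toy precipitation values are assumptions for illustration; the study does not specify how quartiles were computed.

```python
import numpy as np


def within_iqr_fence(candidates, train_values, k=1.5):
    """Keep only candidate thresholds inside [Q1 - k*IQR, Q3 + k*IQR] of the training feature."""
    q1, q3 = np.percentile(train_values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [c for c in candidates if lower <= c <= upper]


# Toy precipitation-like feature: a dense bulk plus two extreme values (made-up numbers, mm).
train_values = np.concatenate([np.arange(1.0, 101.0), [400.0, 500.0]])
kept = within_iqr_fence([64.9, 450.0], train_values)
```

Here the candidate in the sparse upper tail is discarded, while the one inside the dense range is retained.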

To quantify the uncertainty of the threshold estimations, a Bootstrap resampling method (n = 500) was employed to calculate the 95% confidence interval for each threshold. For simplicity of presentation, we retain the term “95% confidence interval” (CI) throughout this paper; however, it must be strictly clarified that this refers to the bootstrap interval, reflecting the stability of the model-identified thresholds under sample perturbations, rather than a classical confidence interval for an unknown population parameter. During each resampling iteration, a dataset equal in size to the original sample was drawn with replacement from the original training set. The dependence plots, smoothed curves, and the two aforementioned thresholds were recalculated, and the threshold estimates from all Bootstrap samples were recorded.

Due to the substantial variations in the data ranges of different features, a unified absolute width standard could not be applied. A relative width indicator, defined as the ratio of the confidence interval width to the feature value range (W/R), was adopted to evaluate the reliability of the thresholds:

\frac{W}{R} = \frac{U - L}{\max (X_{j}) - \min (X_{j})}

(3)

Here, U and L represented the upper and lower limits of the confidence interval, respectively. max(X_j) and min(X_j) denote the maximum and minimum values of feature X_j in the training set, respectively. As noted above, this width does not represent the precision of a population parameter estimate, but rather directly reflects the clarity of the turning point within the data: a narrower interval indicates a more stable, explicitly defined response reversal in the observed data, whereas a wider interval implies a relatively gentle effect transition zone or a limitation imposed by local sample sparsity. It must be emphasized that SHAP values quantify the marginal contribution of features to the model’s prediction rather than establishing causal physiological mechanisms. The identified thresholds and interactions reflect conditional response patterns of the constructed model under the specific observational data distribution.
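The bootstrap interval and the relative width W/R of Equation (3) can be sketched together. Two simplifications are assumed here: the bootstrap resamples (feature, SHAP) pairs directly instead of recomputing the dependence plot from a resampled training set, and the threshold estimator is a reduced zero-crossing finder with a fixed 5-point window. All function names and the synthetic data are illustrative.

```python
import numpy as np


def first_zero_crossing(x, phi, window=5):
    """Smooth with a short moving average and return the first interpolated sign change."""
    order = np.argsort(x, kind="stable")
    kernel = np.ones(window) / window
    centers = np.convolve(x[order], kernel, mode="valid")   # mean x of each window
    smooth = np.convolve(phi[order], kernel, mode="valid")  # mean SHAP of each window
    for i in range(len(smooth) - 1):
        a, b = smooth[i], smooth[i + 1]
        if a * b < 0:
            return centers[i] + abs(a) / (abs(a) + abs(b)) * (centers[i + 1] - centers[i])
    return np.nan


def bootstrap_interval(x, phi, estimator, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a threshold estimator."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # draw n samples with replacement
        estimates.append(estimator(x[idx], phi[idx]))
    estimates = np.array(estimates, dtype=float)
    estimates = estimates[~np.isnan(estimates)]   # drop resamples with no crossing
    lower, upper = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)


def relative_width(lower, upper, x):
    """W/R from Equation (3): interval width divided by the feature range."""
    return (upper - lower) / (np.max(x) - np.min(x))


# Synthetic data: SHAP values fall linearly and cross zero near 210.6 (toy numbers).
rng = np.random.default_rng(1)
x = rng.uniform(100.0, 300.0, size=216)
phi = 0.01 * (210.6 - x) + rng.normal(0.0, 0.05, size=216)
lower, upper = bootstrap_interval(x, phi, first_zero_crossing, n_boot=500)
wr = relative_width(lower, upper, x)
```

A small `wr` in this sketch means the resampled thresholds cluster tightly relative to the feature range, which is the sense of "stability" used in the text.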

2.5. Factor Interaction Effect Analysis Based on SHAP

SHAP can quantify not only the marginal contribution of a single feature to the model prediction but also the joint impact of interactions between two features via SHAP interaction values [25]. Unlike the standard SHAP value for a single feature, which reflects only its independent contribution, the SHAP interaction values decompose the prediction value into the sum of the main effects of individual features and the interaction effects of feature pairs.

According to the SHAP framework extension, for any feature i and feature j (i ≠ j), the SHAP interaction value Φ_i,j measures the contribution of the interaction between feature i and feature j to the model prediction. The calculation formula is:

Φ_{i, j} = \sum_{S \subseteq N \setminus \{i, j\}} \frac{| S |! \, (M - | S | - 2)!}{2 (M - 1)!} \nabla_{i, j} (S)

(4)

Here, N represents the set of all features, M is the total number of features, and ∇_i,j(S) denotes the interactive marginal contribution of features i and j on subset S:

\nabla_{i, j} (S) = [f (S \cup \{i, j\}) - f (S \cup \{j\})] - [f (S \cup \{i\}) - f (S)]

(5)

The intuitive meaning of this formula is as follows. The interaction effect of features i and j equals the marginal gain from adding both features to the model simultaneously, minus the sum of the marginal gains from adding each feature individually. If ∇_i,j(S) > 0, a synergistic effect exists between features i and j: their simultaneous presence yields a contribution to the yield that exceeds the sum of their individual contributions. If ∇_i,j(S) < 0, an antagonistic effect is indicated. The SHAP interaction values satisfy the symmetry property, meaning Φ_i,j = Φ_j,i. Consequently, the total interaction effect for any feature pair is Φ_i,j + Φ_j,i = 2Φ_i,j.

For each feature, its main effect can be derived by subtracting all interaction effects from its SHAP value, where ϕ_i is the SHAP value of feature i. The sum of the main effects and interaction effects across all features equals the total deviation of the model prediction value from the baseline value. The formula is expressed as:

Φ_{i, i} = ϕ_{i} - \sum_{j \neq i} Φ_{i, j}

(6)
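Equations (4) to (6) can be checked by brute force on a three-feature toy value function. The function `f` below is invented for illustration (additive terms of 1, 2 and 3 plus an extra 4 when features 0 and 1 are both present); in a real SHAP computation f(S) would be the model's expected prediction given the features in S.

```python
from itertools import combinations
from math import factorial

FEATURES = (0, 1, 2)
M = len(FEATURES)


def f(subset):
    """Toy value function: additive terms plus one interaction between features 0 and 1."""
    s = set(subset)
    value = 1.0 * (0 in s) + 2.0 * (1 in s) + 3.0 * (2 in s)
    if 0 in s and 1 in s:
        value += 4.0
    return value


def subsets(items):
    """Yield every subset of items as a tuple."""
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def shapley(i):
    """Classical Shapley value of feature i."""
    others = [k for k in FEATURES if k != i]
    total = 0.0
    for S in subsets(others):
        weight = factorial(len(S)) * factorial(M - len(S) - 1) / factorial(M)
        total += weight * (f(S + (i,)) - f(S))
    return total


def interaction(i, j):
    """SHAP interaction value from Equations (4) and (5)."""
    others = [k for k in FEATURES if k not in (i, j)]
    total = 0.0
    for S in subsets(others):
        weight = factorial(len(S)) * factorial(M - len(S) - 2) / (2 * factorial(M - 1))
        nabla = (f(S + (i, j)) - f(S + (j,))) - (f(S + (i,)) - f(S))
        total += weight * nabla
    return total


def main_effect(i):
    """Main effect from Equation (6): SHAP value minus all pairwise interactions."""
    return shapley(i) - sum(interaction(i, j) for j in FEATURES if j != i)
```

In this toy game the pair (0, 1) receives an interaction value of 2 in each direction, so the two halves sum to the built-in interaction of 4, and the main effects recover the additive terms 1, 2 and 3.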

Traditional SHAP analysis assigns an importance score to each feature (e.g., mean absolute SHAP value). This score aggregates the feature’s own main effect and its interaction effects with all other features. However, this aggregated representation can be misleading. When the main effect of a feature and its interaction effects act in opposing directions, partial cancelation occurs during the traditional SHAP aggregation. This phenomenon causes the feature to exhibit a falsely low importance. Furthermore, the absolute magnitude of traditional SHAP interaction values is constrained by the feature’s inherent dimensions and data scale. Thus, it cannot serve as a universal comparison benchmark across different feature pairs.

To deeply deconstruct the interpretable components within traditional SHAP values and characterize the relative strength of interaction effects at the feature pair level, the dimensional interference had to be eliminated. To address the methodological bottleneck of lacking normalized quantitative benchmarks and grading standards, the Interaction Dominance Ratio (IDR) was proposed. The core design of the IDR was to evaluate the “interaction variation” within a relative framework of “total effect variation.” For any feature pair (i, j), the IDR calculation formula was defined as:

\mathrm{IDR}_{i j} = \frac{P_{90} (Φ_{i j}) - P_{10} (Φ_{i j})}{σ (Φ_{i i} + Φ_{j j} + Φ_{i j})}

(7)

Regarding this formula, the standard deviation is highly sensitive to extreme outlier samples. In contrast, the adoption of the percentile range (P₉₀–P₁₀) effectively truncated the influence of outliers induced by extreme climate disturbances. This metric focused on the interaction variation span of the central 80% of typical samples. It thereby robustly characterized the true interaction dominance degree of feature pairs under typical meteorological conditions.

In this equation, the numerator characterizes the discrete span (P₉₀–P₁₀) of the pure interaction effect value Φ_ij across different samples. This reflected the variability of the interaction effect among observation points. The denominator characterized the overall discrete degree (standard deviation) of the total combined effect for the feature pair. This total combined effect was the sum of the pure main effects Φ_ii and Φ_jj and the pure interaction effect Φ_ij. By constructing this dimensionless ratio, the IDR achieved a “de-scaled” comparison of interaction strengths across feature pairs with different dimensions and magnitudes. This provided a specific mathematical implementation for the “horizontal comparison ruler” that was missing in previous studies.
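A minimal sketch of Equation (7) in plain Python is given here for illustration. Several choices are assumptions rather than details stated in the text: percentiles use linear interpolation between order statistics, σ is taken as the population standard deviation, and the per-sample inputs `phi_ij`, `phi_ii`, and `phi_jj` are hypothetical toy values.

```python
import math
import statistics


def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def idr(phi_ij, phi_ii, phi_jj):
    """Eq. (7): (P90 - P10 of Phi_ij) / std(Phi_ii + Phi_jj + Phi_ij)."""
    spread = percentile(phi_ij, 90) - percentile(phi_ij, 10)
    total = [a + b + c for a, b, c in zip(phi_ii, phi_jj, phi_ij)]
    sigma = statistics.pstdev(total)  # assumed: population standard deviation
    if sigma == 0:
        raise ValueError("total effect has zero variance")
    return spread / sigma


# Hypothetical per-sample effects for one feature pair
phi_ij = [float(k) for k in range(11)]
phi_ii = [2.0 * k for k in range(11)]
phi_jj = [0.0] * 11

value = idr(phi_ij, phi_ii, phi_jj)
# Rescaling every effect by the same factor leaves the ratio unchanged
rescaled = idr([1000 * v for v in phi_ij], [1000 * v for v in phi_ii], phi_jj)
```

The equality of `value` and `rescaled` (up to rounding) illustrates the dimensionless character of the ratio.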

In statistics and various applied fields, the effect size interpretation standards proposed by Jacob Cohen are widely accepted. Cohen categorized the magnitude of the Pearson correlation coefficient r into three levels: “small,” “medium,” and “large,” with corresponding thresholds of r ≈ 0.10, r ≈ 0.30, and r ≈ 0.50, respectively [25]. These thresholds have been validated and adopted in fields such as agriculture, ecology, and social sciences [26,27]. The IDR, as a dimensionless ratio measuring the relative dominance of interaction variation within total effect variation, shares a similar interpretive logic with the correlation coefficient: both quantify the proportion of explained variance attributable to a specific source. Accordingly, adapting Cohen’s benchmarks to the IDR context provides a principled, albeit approximate, basis for grading interaction dominance (Table 1). An IDR ≥ 0.50 is defined as “strong interaction dominance,” indicating that the interaction variation constitutes the majority of the total effect variation. An IDR between 0.30 and 0.50 is defined as “moderate interaction dominance,” and an IDR < 0.30 as “weak interaction dominance.” While we acknowledge that the direct mapping from Cohen’s r thresholds to IDR values involves an analogical extension rather than a formal statistical equivalence, this adaptation offers a transparent and reproducible grading framework where none previously existed. Future work with larger datasets may enable empirical calibration of these boundaries through receiver operating characteristic analysis or domain-expert elicitation.
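The three-tier grading can be written as a small helper. The function name `grade_idr` is hypothetical, and the treatment of the boundary values (0.30 counted as moderate, 0.50 as strong) follows the inequalities stated above.

```python
def grade_idr(value: float) -> str:
    """Map an IDR value to the three-tier grading adapted from Cohen's benchmarks."""
    if value >= 0.50:
        return "strong"
    if value >= 0.30:
        return "moderate"
    return "weak"


# IDR values reported in Section 3.5 for four feature pairs
grades = [grade_idr(v) for v in (0.525, 0.549, 0.622, 0.390)]
```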

Furthermore, to avoid the potential misjudgment of “high interaction ratio but low absolute contribution” that could arise from relying on a single IDR metric, a two-dimensional classification paradigm space was constructed. The “absolute interaction amplitude” served as the horizontal axis, and the IDR served as the vertical axis. Through median splits, this space mapped all feature pairs into four functional prototypes. These included interaction-dominated with high amplitude (upper right: strong synergistic/antagonistic pairs with genuine agronomic intervention value), interaction-dominated with low amplitude (upper left: an interaction mechanism exists but the absolute contribution is limited), main-effect-dominated with high amplitude (lower right: core driving factors in traditional understanding), and low-contribution feature pairs (lower left). This framework elevated the analytical perspective of the study from traditional “feature importance ranking” to “feature pair action mechanism tracing.” It clearly distinguished between two fundamentally different driving logics: “important due to inherent strength” and “important due to intense interaction.” Consequently, it provided a direct quantitative basis for formulating multi-factor synergistic regulation strategies.
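The median-split mapping can be sketched as follows. The function name `classify_pairs`, the pair labels, and all numerical values are hypothetical; assigning values exactly equal to a median to the "low" side is an assumption, since the tie rule is not specified.

```python
import statistics


def classify_pairs(pairs):
    """pairs: {name: (absolute interaction amplitude, IDR)} -> {name: quadrant label}."""
    amp_median = statistics.median([amp for amp, _ in pairs.values()])
    idr_median = statistics.median([ratio for _, ratio in pairs.values()])
    labels = {}
    for name, (amp, ratio) in pairs.items():
        high_amp = amp > amp_median    # ties fall on the "low" side (assumed)
        high_idr = ratio > idr_median
        if high_idr and high_amp:
            labels[name] = "interaction-dominated, high amplitude"
        elif high_idr:
            labels[name] = "interaction-dominated, low amplitude"
        elif high_amp:
            labels[name] = "main-effect-dominated, high amplitude"
        else:
            labels[name] = "low contribution"
    return labels


# Hypothetical feature pairs, one per quadrant
toy_pairs = {
    "pair_a": (0.060, 0.60),
    "pair_b": (0.010, 0.55),
    "pair_c": (0.055, 0.20),
    "pair_d": (0.005, 0.10),
}
quadrants = classify_pairs(toy_pairs)
```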

2.6. Model Evaluation Metrics

To objectively evaluate the performance of the rice meteorological yield prediction models, three commonly used statistical indicators were selected for quantitative assessment: the coefficient of determination (R²), the root mean squared error (RMSE), and the mean absolute error (MAE).
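For reference, the three indicators can be computed with their standard definitions as sketched here. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) using the standard definitions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Hypothetical observed and predicted values
r2, rmse, mae = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```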

3. Results

3.1. Initial Data Analysis

Table 2 summarizes the descriptive statistics of the rice unit yield variables and the 11 optimized meteorological features across all 270 district-year observations (9 districts × 30 years). The actual yield ranged from 4.08 to 8.86 t·hm⁻² with a mean of 6.77 t·hm⁻² and a CV of 10.86%. The trend yield exhibited substantially lower variability (CV = 7.35%), reflecting its smooth trajectory driven by technological and management improvements. In contrast, the meteorological yield, defined as the residual component after HP filter detrending (λ = 100), displayed a standard deviation of 0.50 t·hm⁻² and ranged from −2.12 to 1.49 t·hm⁻², indicating that inter-annual weather fluctuations can induce yield deviations exceeding ±1.0 t·hm⁻² in extreme years. Among the temperature features, TAVG_8-9 exhibited the lowest CV (2.55%), suggesting relatively stable thermal conditions during the heading and grain-filling period, whereas TAVG_3 showed the highest CV among temperature variables (8.89%), consistent with the high variability of spring weather in the subtropical monsoon zone. Precipitation features demonstrated markedly greater variability, with CVs ranging from 16.5% (R2020_3-6) to 46.4% (R2020_10). This contrast underscores that temperature primarily sets the baseline growing conditions, whereas precipitation constitutes the principal source of inter-annual yield fluctuation. This asymmetry also explains why temperature-derived thresholds identified later tend to exhibit narrower confidence intervals than precipitation-derived ones.

Figure 2 illustrates the HP filter decomposition of the regional average rice yield from 1995 to 2024. Panel (a) overlays the actual yield and the extracted trend yield. The trend yield exhibited a gradual increase from approximately 6.0 t·hm⁻² in 1995 to a plateau of approximately 7.0 t·hm⁻² around 2016–2020, consistent with the widely documented pattern of diminishing marginal returns from technological inputs. Panel (b) presents the meteorological yield as a bar chart, with positive values (blue) indicating weather-favorable years and negative values (red) indicating weather-induced yield reductions. Pronounced negative departures were concentrated in 2018 and 2019, corresponding to compound extreme events of high temperature and drought that severely affected the Ningbo region; the most substantial positive deviation occurred in 2016. The asymmetric magnitude of negative versus positive anomalies—extreme deficit years produced larger absolute deviations than surplus years—suggests that meteorological disasters exert a disproportionate impact on yield relative to the marginal gains from favorable weather.

Table 3 presents the Pearson correlation coefficients between the 11 meteorological features and the meteorological yield. Notably, all simple linear correlations were weak (|r| < 0.08) and statistically non-significant (p > 0.05), indicating that the relationships between individual meteorological factors and yield cannot be adequately captured by linear metrics, a finding that directly motivates the adoption of nonlinear machine learning models in subsequent analyses. Figure 3a presents the full Pearson correlation heatmap among the 11 features and the meteorological yield. Moderate-to-strong collinearity was observed among features spanning overlapping time periods within the same driver category (e.g., TAVG_3-5 vs. TAVG_3; R2020_3-6 vs. R2020_4-7), which motivated the LASSO-based feature screening described in Section 2.2. In contrast, cross-category correlations between temperature and precipitation features were generally weak (|r| < 0.3 for most pairs), suggesting that thermal and hydrological drivers operate through relatively independent pathways, a precondition that gives physical interpretability to the interaction dominance identified by IDR in Section 3.5. Figure 3b displays the standardized distributions of the 11 features as box plots, where temperature features (red boxes) exhibited compact, symmetric distributions while precipitation features (blue boxes) showed pronounced right skewness and larger interquartile ranges, reflecting the sporadic and intense nature of monsoon-driven rainfall events.

3.2. Comparison of Predictive Performance Across Different Yield Models

The performance of the six prediction models on the testing set is illustrated in Figure 4. LightGBM achieved optimal performance, with a coefficient of determination (R²) of 0.809, a root mean square error (RMSE) of 0.352 t·hm⁻², and a mean absolute error (MAE) of 0.258 t·hm⁻². Support Vector Regression (SVR) ranked second (R² = 0.774). Bagged Trees and Random Forest exhibited comparable performance. The Back Propagation Neural Network (R² = 0.709) and Multiple Linear Regression (MLR, R² = 0.641) demonstrated relatively lower accuracy. Ensemble tree models, particularly the gradient boosting framework, demonstrated distinct advantages in handling the nonlinear relationships between meteorological factors and rice yield. Conversely, linear models were constrained by linear assumptions and failed to capture complex nonlinear effects. Consequently, LightGBM was selected for subsequent analyses.

3.3. Model Optimization and Meteorological Factor Contributions

Using LightGBM as the baseline model, Recursive Feature Elimination (RFE) based on SHAP feature importance was conducted. This procedure aimed to analyze and extract more critical factors. It simplified model inputs while achieving further improvements in predictive performance. Initially, the model was trained using all 15 features selected by LASSO. The mean absolute SHAP value for each feature was calculated to establish a feature importance ranking. Subsequently, starting from the current feature set, the feature with the lowest importance (i.e., the smallest mean absolute SHAP value) was sequentially removed. The model was retrained using the remaining features. The R², RMSE, and MAE were calculated on the testing set to monitor changes in these metrics. This elimination process was repeated until the number of features was reduced to a predefined minimum of 5. Upon completing the traversal of all feature quantities, the feature subset corresponding to the peak testing set performance was designated as the optimal subset. If the R² difference among multiple feature quantities was less than 0.005, the subset with fewer features was preferred to maintain model parsimony. The elimination process is illustrated in Figure 5. Given the continuous decline in R² thereafter, only the progression from 15 down to 5 features is displayed.
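The elimination loop described above can be sketched independently of any particular learner. In this sketch, `shap_rfe` and the callback `fit_and_score` are hypothetical names: the callback stands in for retraining the model on a feature subset and returning its R² together with the mean absolute SHAP value of each feature. The parsimony rule is implemented under one possible reading, namely that the smallest subset whose R² lies within 0.005 of the peak is selected. The toy importances and scores are invented for demonstration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def shap_rfe(features: Sequence[str],
             fit_and_score: Callable[[List[str]], Tuple[float, Dict[str, float]]],
             min_features: int = 5,
             tolerance: float = 0.005) -> Tuple[List[str], float]:
    """Backward elimination driven by mean |SHAP|, with a parsimony tie-break."""
    current = list(features)
    history = []  # (feature subset, R2) recorded at every step
    while True:
        r2, importance = fit_and_score(current)
        history.append((list(current), r2))
        if len(current) <= min_features:
            break
        # Drop the feature with the smallest mean absolute SHAP value
        weakest = min(current, key=lambda name: importance[name])
        current.remove(weakest)
    best_r2 = max(r2 for _, r2 in history)
    # Among subsets within `tolerance` of the peak R2, prefer the fewest features
    candidates = [(subset, r2) for subset, r2 in history if best_r2 - r2 < tolerance]
    return min(candidates, key=lambda item: len(item[0]))


# Hypothetical stand-in for model retraining and SHAP computation
IMPORTANCE = {"f1": 0.7, "f2": 0.6, "f3": 0.5, "f4": 0.4, "f5": 0.3, "f6": 0.2, "f7": 0.1}
R2_BY_SIZE = {7: 0.80, 6: 0.82, 5: 0.818, 4: 0.70, 3: 0.50}


def toy_fit_and_score(subset):
    return R2_BY_SIZE[len(subset)], {name: IMPORTANCE[name] for name in subset}


best_subset, best_r2 = shap_rfe(list(IMPORTANCE), toy_fit_and_score, min_features=3)
```

In the toy run the peak R² occurs with six features, but the five-feature subset lies within the tolerance and is therefore preferred.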

Based on the 5-fold Time Series Split CV within the training set, the mean CV R² peaked when the number of features was reduced to 11. When the feature count was further reduced to 10 or fewer, the CV R² began to decline, indicating the onset of key information loss. Consequently, the 11-feature subset was confirmed as the optimal configuration. The final optimized LightGBM model, trained on the full 1995–2018 dataset with these 11 features, achieved an R² of 0.833, an RMSE of 0.330 t·hm⁻², and an MAE of 0.252 t·hm⁻² on the independent 2019–2024 test set. Compared to the full 15-feature baseline model, the optimized model demonstrated improved generalization capability by eliminating redundant features. The advantage of this methodology is that the sequence of feature elimination is entirely dictated by the model’s objective evaluation of feature importance, avoiding subjective human presets. Furthermore, retraining the model and evaluating its performance after each removal ensures the validity and robustness of the selected feature subset.

The SHAP beeswarm plot (Figure 6) illustrates the relationships between the features and their SHAP values. It displays the distribution of SHAP values for the feature variables, revealing the underlying associations between the factors and their model impacts. The left panel presents the SHAP beeswarm plot for the 15 features in the original LightGBM model. The features are arranged in descending order based on their mean absolute SHAP values. The top five features were TAVG_3, R2020_8, TAVG_6, TAVG_8-9, and TMIN_6, with mean absolute SHAP values of 0.108, 0.101, 0.074, 0.064, and 0.046, respectively. This indicated that the March average temperature and August precipitation were the primary factors influencing rice yield predictions in Ningbo. Regarding the direction of influence, high-value samples (red points) for TAVG_3 and TAVG_6 were predominantly distributed in the positive SHAP region. High-value samples for R2020_8 were distributed across both the positive and negative sides. Conversely, low-value samples for TMIN_6 were mainly located in the negative SHAP region. The bottom four features in the importance ranking exhibited mean absolute SHAP values below 0.032, indicating a negligible marginal contribution to the predictions.

Following the Recursive Feature Elimination, the right panel displays the SHAP beeswarm plot for the remaining 11 features. Compared to the pre-optimization state, the importance ranking underwent significant shifts. R2020_8 ascended to the first position (mean absolute SHAP = 0.118), while TAVG_3 dropped to the second position (0.089), and TMIN_6 rose to the third position (0.074). The four eliminated features (TAVG_10, WSMAX_9, R2020_4-5, and UAVG_9) were the least important among the initial 15 features. After their removal, the mean absolute SHAP values of the remaining features generally increased (e.g., R2020_8 increased from 0.101 to 0.118, and TMIN_6 from 0.046 to 0.074). This indicated that the model reallocated the explanatory power previously distributed among redundant features toward the core features. Consequently, the model R² improved from 0.809 to 0.833. Simultaneously, the scatter distributions for R2020_8 and TAVG_3 maintained a wide span, continuing to dominate the model predictions. The relative importance of TAVG_6 declined (from third to fifth), whereas the importance of TMIN_6 increased significantly. Overall, the optimized feature subset was more parsimonious. The influence of the core features became more pronounced, and the interpretability of the model was concurrently enhanced.

Figure 7 presents a pie chart illustrating the relative contribution rates of the 11 screened features within the optimal LightGBM model. The percentage of each feature’s mean absolute SHAP value relative to the total sum reflects its relative importance to the model predictions. As observed in the figure, August precipitation contributed the most, accounting for 17.4% and establishing it as the primary factor influencing rice yield predictions in Ningbo. March average temperature followed at 13.1%. The June average minimum temperature and the August–September average temperature accounted for 10.9% and 11.0%, respectively, ranking third and fourth. These four features cumulatively contributed nearly 51.5%, forming the core driving force of the model predictions. The remaining seven features accounted for 48.5%. Overall, precipitation factors (particularly August precipitation) and temperature factors from spring to early summer jointly dominated the model predictions for rice yield in Ningbo. In contrast, the contributions of precipitation and temperature factors during the spring and autumn seasons were relatively limited. Wind speed and relative humidity features were removed during recursive feature elimination and therefore made no contribution to the optimized model. This pie chart quantitatively delineates the weight distribution of individual meteorological factors in yield prediction from the perspective of SHAP interpretability. It provides an intuitive basis for comprehending the model’s decision-making mechanisms.
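The contribution rate of each feature, as used in this pie chart, is its mean absolute SHAP value divided by the sum over all features. A minimal sketch follows; the function name `relative_contributions` and the input values are hypothetical and do not reproduce the values in Figure 7.

```python
def relative_contributions(mean_abs_shap: dict) -> dict:
    """Percentage share of each feature in the summed mean |SHAP|."""
    total = sum(mean_abs_shap.values())
    return {name: 100.0 * value / total for name, value in mean_abs_shap.items()}


# Hypothetical mean absolute SHAP values for three features
shares = relative_contributions({"feature_a": 0.3, "feature_b": 0.1, "feature_c": 0.1})
```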

3.4. Meteorological Threshold Identification for Rice Yield

3.4.1. ZCT Identification Results

SHAP dependence plots serve as essential tools for revealing the nonlinear relationships between individual features and model predictions. The Zero-Crossing Threshold (ZCT) represents the critical point at which the model prediction shifts from a yield-increase dominance (SHAP > 0) to a yield-decrease dominance (SHAP < 0). Thresholds for the 11 features, identified through secondary SHAP screening, were calculated individually. The specific results are presented in Table 4.

The SHAP dependence plots for each meteorological factor were generated, with the results displayed in Figure 8. The horizontal axis of a dependence plot represents the actual observed value of the feature, while the vertical axis represents its corresponding SHAP value. Each point denotes a training sample. The blue line segment represents the 95% confidence interval for the feature, serving as the threshold interval. The purple horizontal marker indicates the specific threshold point. By analyzing the trend of SHAP values relative to feature values, the nonlinear response patterns of the factors on rice yield predictions could be identified. This was particularly applicable for isolating the critical points where SHAP values shifted from positive to negative (indicating a transition from yield increase to yield decrease) or where the slope exhibited significant changes. For all thresholds, the ratio of the confidence interval width to the feature value range was less than 14%, indicating relatively high statistical reliability.

In the design of the threshold identification algorithm, the ZCT function traverses the smoothed SHAP dependence curve from left to right. It locates the feature value coordinate where the SHAP value first transitions from negative to positive (or vice versa). It should be noted that, theoretically, this algorithm possesses the capability to capture multi-threshold structures (i.e., multiple zero crossings). However, based on the actual execution results of this study, the SHAP dependence curves for all 11 meteorological factors exhibited typical single zero-crossing behavior. No complex multi-threshold structures were observed. This indicates that the transition of each meteorological factor from inhibiting (or promoting) rice yield to promoting (or inhibiting) yield is governed by a single dominant turning point. A very limited number of factors exhibited brief SHAP value oscillations after crossing the zero-crossing point. This phenomenon may reflect response instabilities within extreme value regions.
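The left-to-right scan can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the function name `zero_crossings` is hypothetical, the smoothed curve is assumed to be available on a discrete grid, and the crossing position is located by linear interpolation between the two bracketing grid points.

```python
def zero_crossings(x, y):
    """Scan a smoothed curve left to right; return (position, direction) per sign change."""
    crossings = []
    prev = None  # index of the last grid point with a nonzero value
    for k, value in enumerate(y):
        if value == 0:
            continue  # exact zeros are bridged by the neighbouring nonzero points
        if prev is not None and (y[prev] > 0) != (value > 0):
            # Linear interpolation between the bracketing points (assumed)
            t = y[prev] / (y[prev] - value)
            position = x[prev] + t * (x[k] - x[prev])
            direction = "neg_to_pos" if value > 0 else "pos_to_neg"
            crossings.append((position, direction))
        prev = k
    return crossings


# Hypothetical smoothed SHAP dependence curve with a single reversal
grid = [0.0, 1.0, 2.0, 3.0]
curve = [-1.0, -0.5, 0.5, 1.0]
single = zero_crossings(grid, curve)

# A curve with two reversals yields a multi-threshold structure
double = zero_crossings([0.0, 1.0, 2.0], [1.0, -1.0, 1.0])
```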

Regarding the confidence interval widths, the threshold certainty of the 11 factors demonstrated distinct differentiation characteristics. These could be broadly categorized into three groups:

(1) Narrow Transition Zone factors (CI width < 1): This group included TMIN_6 (0.05 °C), TAVG_6 (0.10 °C), TAVG_8-9 (0.10 °C), TAVG_3 (0.13 °C), and TAVG_6-7 (0.64 °C). Notably, the interval width for TMIN_6 was merely 0.05 °C, the narrowest among all features. This indicated that the model’s determination of the sign reversal for this feature possessed extremely high statistical stability. The CI widths for both TAVG_6 and TAVG_8-9 were constrained within 0.1 °C.

(2) Moderate Transition Zone factors (1 ≤ CI width < 10): This group comprised R2020_3-6 (5.38 mm) and R2020_10 (4.24 mm).

(3) Wide Transition Zone factors (CI width ≥ 10): This group consisted of R2020_8 (10.10 mm), R2020_4-7 (26.08 mm), and R2020_5 (21.07 mm), with R2020_4-7 exhibiting the widest CI among all features.
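The three-group rule can be expressed as a small helper. The function name `transition_zone` is hypothetical; the cut-offs are applied to the raw CI width in each feature's own unit (°C or mm), as in the grouping above.

```python
def transition_zone(ci_width: float) -> str:
    """Classify a threshold by the width of its confidence interval."""
    if ci_width < 1:
        return "narrow"
    if ci_width < 10:
        return "moderate"
    return "wide"


# CI widths reported above for four of the factors
zones = {
    "TMIN_6": transition_zone(0.05),
    "R2020_3-6": transition_zone(5.38),
    "R2020_8": transition_zone(10.10),
    "R2020_4-7": transition_zone(26.08),
}
```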

In the plots, the Narrow Transition Zone temperature factors manifested as an extremely narrow blue band nearly overlapping the threshold. Conversely, the Wide Transition Zone precipitation factors appeared as expansive intervals spanning tens of millimeters.

3.4.2. DET Identification Results

Building upon the Zero-Crossing Threshold (ZCT) analysis, the Derivative Extrema Threshold (DET) method was employed to further extract the top two first-derivative extremum points for each factor, following Interquartile Range (IQR) filtering. A total of 66 candidate extremum points were evaluated across the 11 factors. Among these, 65 points fell within the valid IQR boundaries, whereas only one point was filtered out (the Rank 3 candidate for TAVG_3-5, which was below the IQR lower bound of 14.15 °C). The results are presented in Figure 9. Green (up), red (down), and gray (uncertain) vertical dashed lines denoted the DET points where the SHAP values transitioned from negative to positive, from positive to negative, or remained unchanged in direction, respectively. These lines intuitively indicated the three directions of effect intensity variations.
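A simplified sketch of the DET search is given here for illustration. Several elements are assumptions: the first derivative is approximated by central finite differences on a discrete grid, the filtering bounds `lower` and `upper` are passed in directly because their exact construction from the IQR is not restated here, and candidates are ranked by absolute slope. The function name `derivative_extrema` and the tanh-shaped test curve are hypothetical.

```python
import math


def derivative_extrema(x, y, lower, upper, top_k=2):
    """Local extrema of dy/dx on a smoothed curve, restricted to [lower, upper]."""
    grid = x[1:-1]
    # Central finite differences at the interior grid points
    slope = [(y[k + 1] - y[k - 1]) / (x[k + 1] - x[k - 1])
             for k in range(1, len(x) - 1)]
    extrema = []
    for m in range(1, len(slope) - 1):
        is_max = slope[m] > slope[m - 1] and slope[m] > slope[m + 1]
        is_min = slope[m] < slope[m - 1] and slope[m] < slope[m + 1]
        if (is_max or is_min) and lower <= grid[m] <= upper:
            extrema.append((grid[m], slope[m]))
    # Rank by absolute slope (assumed ranking rule) and keep the top_k candidates
    extrema.sort(key=lambda item: abs(item[1]), reverse=True)
    return extrema[:top_k]


# Hypothetical smoothed curve whose slope peaks at x = 0 without any sign rule applied
xs = [-3.0 + 0.5 * k for k in range(13)]
ys = [math.tanh(v) for v in xs]
points = derivative_extrema(xs, ys, lower=-2.0, upper=2.0)
filtered_out = derivative_extrema(xs, ys, lower=1.0, upper=2.0)
```

In this toy curve the only derivative extremum lies at x = 0, so it is returned when the bounds include zero and discarded otherwise.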

Compared to the absolute reversal nature of the ZCT, the DET revealed the dynamic evolution process of the marginal effects of meteorological factors on yield. Based on the spatial positional relationships between the two thresholds, the factors exhibited the following patterns:

(1) Factors with DET points highly overlapping the ZCT: The two DET points for TAVG_3_5 (16.80 °C and 16.81 °C) nearly perfectly overlapped with the ZCT (16.81 °C). The deviations from the zero-crossing point were merely −0.01 °C and −0.00 °C, respectively. This indicated an almost non-existent buffer space. Both points demonstrated a “down” direction, signifying an effect weakening. The two DET points for TAVG_6 were located 0.29 °C and 0.84 °C to the right of the ZCT, respectively. They exhibited “down” and “up” directions.

(2) Factors with DET points located on the same side of the ZCT: The two DET points for TAVG_8-9 (26.86–26.88 °C) were positioned approximately 0.67–0.68 °C to the right of the ZCT (26.19 °C). Both points showed a “down” direction. Their local SHAP means were significantly negative (−0.049 to −0.051), representing a typical accelerated deterioration zone. R2020_5 was the only precipitation factor with clear agricultural hydrological significance. Its two DET points (65.0–79.5 mm) were situated 33–48 mm to the left of the ZCT (113.1 mm). The Rank 1 point (79.5 mm) showed an “up” direction, with a local SHAP mean of −0.030 (the proportion of positive values was only 11.1%). The two DET points for R2020_8 (219.9–221.1 mm) were located to the right of the ZCT. Both pointed in the “down” direction, which was considered relatively reliable. The two DET points for R2020_3-6 (577.5–580.0 mm) were located 126–129 mm to the right of the ZCT (451.3 mm), both in the “up” direction. The two DET points for R2020_4-7 were also located 37–44 mm to the right of the ZCT, displaying “up” and “uncertain” directions, respectively. TAVG_6 and TAVG_6_7 exhibited highly complex nonlinear responses. Their primary DET points (25.36 °C and 26.35 °C) were located to the right of their respective ZCTs (24.5 °C and 26.1 °C). Meanwhile, the local SHAP means remained weakly positive (0.028–0.047).

However, the two DET points for R2020_10 (3.1–4.2 mm) were located far to the left of the ZCT (83.3 mm), at a distance of approximately 79–80 mm. Both were classified as “uncertain.” Although the DET points for R2020_3-6 and R2020_4-7 passed the IQR filtering, they were situated 37–129 mm to the right of the ZCTs. These points were distant from the ZCTs and located in the medium-to-high value regions of the data distribution. All three factors displayed sign characteristics that contradicted the ZCT logic. For instance, an “enhancement” trend emerged in the precipitation surplus zone far beyond the ZCT. These phenomena primarily originated from fitting oscillations of the cubic spline interpolation at sparse data boundaries. Although they passed the IQR filtering, they lacked sufficient sample density support. Therefore, they should not be directly assigned definitive agrometeorological significance. Their essence was identified as an uncertainty marker of the model in the extreme precipitation extrapolation zone.

(3) Factors with DET points distributed across both sides of the ZCT: TMIN_6 displayed a unique bilateral distribution. The point on the left side of the ZCT (21.07 °C) pointed in the “up” direction (effect enhancement). The point on the right side (22.21 °C) pointed in the “down” direction (effect weakening). These points were located −0.60 °C and +0.54 °C from the zero-crossing point, respectively. TAVG_3 also exhibited a bilateral distribution. The left point (10.31 °C) showed a “down” direction, while the right point (13.54 °C) was judged as “uncertain.” The two DET points for TAVG_6-7 were both located to the right of the ZCT (+0.26 °C and +0.61 °C), displaying “up” and “down” directions, respectively.

3.5. Factor Interaction Effect Analysis

Effect decomposition was initially conducted at the individual feature level utilizing the SHAP interaction matrix (Φ_ij) output by the LightGBM model. For each feature i, the SHAP value was decomposed into the sum of the pure main effect (Φ_ii) and all pure interaction effects (Φ_ij, where j ≠ i). The mean absolute values of the pure main effects and the pure interaction effects were calculated separately for each feature. The relative proportions of these two components revealed the internal structure of feature importance (Figure 10). The results demonstrated that the proportion of pure main effects for the 11 optimized meteorological factors ranged from 76.99% (TAVG_6_7) to 98.06% (TMIN_6). This indicated that traditional SHAP values were generally dominated by the independent contributions of individual features. However, the proportions of pure interaction effects exhibited significant variations among the features. TAVG_6_7 showed the highest proportion (15.07%), followed by R2020_5 (11.90%) and TMIN_6 (11.10%). It must be noted that a higher interaction proportion does not equate to significant effect cancellation. Cancellation occurs exclusively when the main effect and the interaction effect act in opposite directions, leading to mutual offsetting during the aggregation process. TMIN_6 and R2020_8 exhibited the most prominent effect cancellation, with proportions reaching 9.16% and 5.58%, respectively. This implied that the traditional SHAP importance of these two features was underestimated by a corresponding magnitude. TAVG_8-9 also demonstrated a 2.27% cancellation. These results suggested that for certain key meteorological factors, relying solely on traditional mean absolute SHAP values for importance ranking might underestimate their true potential contributions. This underestimation stems from the directional cancellation between internal main effects and interaction effects.

Examining the interaction structure of feature pairs from a global perspective, the absolute interaction strength matrix (Figure 11a) revealed that among the 55 feature pairs, the top four in absolute interaction strength were TMIN_6 × R2020_8 (0.0303), TAVG_8_9 × R2020_8 (0.0239), TAVG_6_7 × R2020_8 (0.0184), and TAVG_8_9 × TMIN_6 (0.0172). All of these pairs involved three factors: the average temperature during the Heading and Grain-Filling Stage (TAVG_8_9), the June minimum temperature (TMIN_6), or the August precipitation (R2020_8). However, absolute strength alone cannot distinguish between two fundamentally different scenarios. These include the spurious inflation of interaction due to inherently strong main effects of the features, versus genuinely high interaction resulting from severe intrinsic interaction variability. The Interaction Dominance Ratio (IDR) matrix (Figure 11b) provided the basis for this distinction. Among the four aforementioned pairs, the IDR values for TMIN_6 × R2020_8, TAVG_8_9 × R2020_8, and TAVG_8_9 × TMIN_6 reached 0.525, 0.549, and 0.622, respectively. All three crossed into the strong interaction dominance interval (IDR ≥ 0.50). Conversely, the IDR for TAVG_6_7 × R2020_8 was only 0.390, classifying it as moderate interaction dominance. This indicated that its higher absolute interaction strength primarily stemmed from the larger main effect baselines of the two features themselves, rather than the relative dominance of interaction variability.

To establish a unified comparison benchmark across feature pairs, all 55 pairs were quantitatively graded based on their IDR values according to Table 1. The results (Figure 12) indicated that only 3 feature pairs (5.5%) were classified as strong interaction dominance. Additionally, 6 pairs (10.9%) were classified as moderate interaction dominance, and 46 pairs (83.6%) were classified as weak interaction dominance. This distribution demonstrated that the meteorological drivers of rice yield were predominantly independent, driven by main effects. Nevertheless, a select few factor combinations possessed substantial interactive dominance capabilities. Further analysis revealed that the three strongly interacting pairs precisely formed a closed triangular network around TAVG_8_9, TMIN_6, and R2020_8.
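The pair count and grade shares quoted above follow from simple arithmetic, which can be checked with the short sketch below. The variable names are hypothetical; the counts are those reported in the text.

```python
from math import comb

n_features = 11
n_pairs = comb(n_features, 2)  # unordered feature pairs among 11 features

# Counts per grade as reported above
counts = {"strong": 3, "moderate": 6, "weak": 46}
shares = {grade: round(100 * n / n_pairs, 1) for grade, n in counts.items()}
```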

Within the two-dimensional IDR paradigm space (Figure 12), the TAVG_8_9 × TMIN_6 pair from the aforementioned triangular network fell into the upper-right quadrant (high Interaction Dominance Ratio and high amplitude; IDR = 0.622, amplitude = 0.062). It was the only feature pair that simultaneously satisfied the criteria of “strong interaction variability” and “large absolute contribution.” This represented the most agronomically significant synergistic combination. In contrast, the TAVG_6_7 × R2020_8 pair, which fell into the lower-right quadrant (IDR = 0.390, amplitude = 0.056), ranked third in absolute interaction strength among all pairs. However, its low IDR value revealed a relative decoupling in their physical mechanisms. The temperature from June to July primarily influenced early processes such as tillering and young panicle differentiation. Conversely, August precipitation mainly affected the subsequent grain filling. A natural temporal mismatch existed between the two factors across phenological stages. Consequently, they drove the yield through their respective independent main effects, lacking conditional dependence.

Figure 13 further illustrated these mechanistic differences through three-dimensional interaction surfaces. For the TAVG_8_9 × TMIN_6 pair in the upper-right quadrant (Figure 13a), the interaction surface exhibited significant nonlinear distortions and fluctuations under various temperature and humidity combinations. This represented the mathematical mapping of the preceding temperature background modulating the sensitivity to later thermal stress. For the TAVG_6_7 × R2020_8 pair in the lower-right quadrant (Figure 13b), the surface was relatively smooth overall. The fluctuations induced by the interaction were superimposed upon a steep main effect baseline. This confirmed the main-effect-dominant mode resulting from the temporal mismatch. For the TMIN_6 × R2020_4-7 pair in the lower-left quadrant (Figure 13c), the surface tended to be flat, verifying its weak interactive contribution. The morphological differences among these three surfaces intuitively demonstrated that the IDR could effectively discriminate factor combinations possessing genuine phenological coupling significance.

These findings hold direct guiding implications for agrometeorological disaster early warning. Traditional single-factor threshold warnings (e.g., “daily maximum temperature exceeding 35 °C” or “continuous 3-day precipitation exceeding XX mm”) pose a systematic underestimation risk when confronted with the Interaction Dominance Triangular Network formed by TAVG_8_9, TMIN_6, and R2020_8. For such high-IDR factors, consideration should be given to establishing a Composite Meteorological Index based on multi-factor combinations. For instance, introducing a preceding June temperature anomaly correction term into the August high-temperature warning could capture the nonlinear yield losses induced by this interaction dominance.

4. Discussion

4.1. Our Findings Based on SHAP and ZCT

Based on the dual-indicator detection system constructed from the SHAP dependence curves, the response transition characteristics of all 11 optimized meteorological factors were successfully extracted. Each factor exhibited a single Zero-Crossing Threshold (ZCT) point, signifying a single reversal in the direction of its marginal contribution. A further comparison of the uncertainty spans (the ratio of the confidence interval width to the feature value range) across the factors revealed significant heterogeneity in their response patterns. For instance, the confidence intervals for the June average temperature and the June average minimum temperature were extremely narrow (interval width <0.1 °C). These factors displayed an abrupt response reversal with a distinct boundary, indicating high statistical stability at their threshold points. Conversely, the confidence interval for the April–July cumulative precipitation was relatively wide (interval width of 26.0 mm). The SHAP curve for this factor exhibited a gentle slope near the zero-crossing point, reflecting a gradual transition driven by the water regulation capacity of the soil-crop system.
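As a rough arithmetic check of the uncertainty span described above, the following sketch divides the confidence interval widths listed in Table 4 by the observed feature ranges in Table 2. Taking Max − Min from Table 2 as the "feature value range" is an assumption made here for illustration, and the name `uncertainty_span` is hypothetical.

```python
# Values copied from Table 4 (95% confidence interval) and Table 2 (Min, Max).
# Assumption: the "feature value range" is Max - Min as listed in Table 2.
FEATURES = {
    "TAVG_6":    {"ci": (24.5, 24.6),   "min": 22.155, "max": 26.886},
    "R2020_4-7": {"ci": (523.5, 549.5), "min": 330.0,  "max": 800.0},
}


def uncertainty_span(ci, lo, hi):
    """Return (CI width, CI width divided by the feature range)."""
    width = ci[1] - ci[0]
    return width, width / (hi - lo)


spans = {
    name: uncertainty_span(f["ci"], f["min"], f["max"])
    for name, f in FEATURES.items()
}

for name, (width, ratio) in spans.items():
    print(f"{name}: width = {width:.1f}, relative span = {ratio:.3f}")
```

Under this reading, the June average temperature interval covers roughly 2% of its observed range, whereas the April–July precipitation interval covers roughly 5.5%, which matches the narrow versus wide contrast described in the text.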

This heterogeneity provides a differentiated perspective for the refined early warning of agrometeorological disasters. For factors characterized by a Narrow Transition Zone, an explicit single early warning threshold can be established. For factors within a Wide Transition Zone, it is necessary to consider implementing a “transition early warning interval” rather than relying exclusively on a single numerical value. Under this theoretical framework, the meteorological thresholds identified in this study possess explicit agronomic validation value:

(1) Validation of the sowing period temperature index: The ZCT for the March average temperature was 11.6 °C, with an extremely narrow interval. Model predictions indicated yield reductions below this value, whereas yield-increase effects emerged above it. This threshold aligns closely with local agrometeorological practices for early rice sowing in the Ningbo region. Local agrometeorological observations indicate that a stable March average temperature passing through 10–12 °C serves as the primary basis for determining the safe sowing period. Temperatures below this limit frequently lead to bud rot and seedling death [28].

(2) Validation of the composite disaster index during the key growth period: The ZCT for the August–September average temperature was 26.2 °C, and the ZCT for the August cumulative precipitation was 210.6 mm. These two factors are core nodes of the Interaction Dominance Triangular Network identified through the Interaction Dominance Ratio (IDR) analysis. Local disaster records provide physical support for both thresholds. Previous research demonstrates that a daily average temperature ≥ 27 °C during the heading and grain-filling stage of single-season rice in Ningbo can induce significant heat damage, which subsequently reduces the seed-setting rate and the 1000-grain weight. Furthermore, August is characterized by frequent typhoons. When monthly precipitation exceeds 200 mm, continuous field waterlogging readily occurs. The superimposition of high temperatures generates a composite stress involving high-temperature-induced premature maturity and overcast, rainy conditions with insufficient sunlight. Under such conditions, yield losses are severely amplified [29,30]. The thresholds of 26.2 °C and 210.6 mm identified by the model lie close to the risk levels reported in these local agrometeorological studies. This agreement supports the interpretation that the “ZCT-DET-IDR” framework captures physically meaningful disaster processes rather than purely statistical artifacts.

(3) Validation of the moisture management index during the grain-filling period: The ZCT for the October cumulative precipitation was 83.3 mm. The grain-filling period for late rice in Ningbo coincides with the peak season for continuous autumn rains. Local field experiments and disaster surveys indicate that when October cumulative precipitation consistently exceeds 80 mm, root hypoxia and premature leaf senescence occur. This significantly inhibits substance transport and grain filling [31]. This explicit threshold is highly consistent with the model outputs, highlighting the need for meticulous drainage and waterlogging mitigation management during the late rice grain-filling period.

4.2. What Role Did DET Play as a Complement to ZCT?

Building upon the absolute reversals identified by the ZCT, the Derivative Extrema Threshold (DET) method further revealed the dynamic evolution of marginal effects. As previously noted, the DET point for the August–September average temperature (26.86 °C) was located to the right of the ZCT and exhibited a significant negative value. This represented a typical deterioration acceleration zone. It indicated that once the high-temperature threshold is breached during the grain-filling stage, negative effects escalate sharply without buffer. Conversely, the negative effect weakening point detected for the May cumulative precipitation to the left of the ZCT (64.9–79.5 mm) accurately mapped a drought mitigation acceleration zone during the transplanting and returning green stage. It must be emphasized that the DET points for certain long-period cumulative precipitation variables (e.g., March–June cumulative precipitation) fell into extreme tails characterized by sparse data. These points essentially represent uncertainty identifiers of model extrapolation and should not be directly assigned agronomic significance. This observation highlights that when utilizing DET as a supplement to ZCT, IQR filtering must be applied to eliminate statistical artifacts.
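The relationship between the two indicators can be illustrated with a minimal sketch. It locates a zero crossing by linear interpolation and a derivative extremum by finite differences on an already smoothed curve. The curve below is a synthetic logistic function, not the study's data; its parameters were chosen only so that the two points fall near the values quoted above. The function names and the IQR fence rule (Q1 − 1.5·IQR to Q3 + 1.5·IQR of the observed feature values) are assumptions, since the exact filtering criterion is not restated here.

```python
import numpy as np


def zero_crossings(x, y):
    """Locate x where y changes sign, by linear interpolation between grid points."""
    idx = np.where(np.sign(y[:-1]) * np.sign(y[1:]) < 0)[0]
    return x[idx] - y[idx] * (x[idx + 1] - x[idx]) / (y[idx + 1] - y[idx])


def derivative_extrema(x, y):
    """Locate interior local extrema of dy/dx (candidate DET points)."""
    dy = np.gradient(y, x)
    d2 = np.diff(dy)  # a sign change in d2 marks an extremum of dy
    idx = np.where(d2[:-1] * d2[1:] < 0)[0] + 1
    return x[idx], dy[idx]


def within_iqr_fence(points, samples, k=1.5):
    """Assumed rule: keep points inside [Q1 - k*IQR, Q3 + k*IQR] of the samples."""
    q1, q3 = np.percentile(samples, [25, 75])
    iqr = q3 - q1
    return (points >= q1 - k * iqr) & (points <= q3 + k * iqr)


# Synthetic smoothed dependence curve (illustration only, not the study's data).
x = np.linspace(24.7, 28.0, 331)
shap_curve = 0.05 - 0.2 / (1.0 + np.exp(-(x - 26.86) / 0.6))

zct = zero_crossings(x, shap_curve)
det_x, det_slope = derivative_extrema(x, shap_curve)

# Hypothetical observed feature values used only to build the IQR fence.
samples = np.random.default_rng(0).normal(26.2, 0.7, size=270)
keep = within_iqr_fence(det_x, samples)

print("ZCT:", np.round(zct, 2))
print("DET:", np.round(det_x[keep], 2), "slope:", np.round(det_slope[keep], 3))
```

In this toy curve the derivative extremum lies to the right of the zero crossing and has a negative slope, which is the configuration described above as a deterioration acceleration zone. A DET candidate falling outside the fence would be flagged as a possible extrapolation artifact rather than interpreted agronomically.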

Through effect decomposition, traditional SHAP analysis revealed significant effect cancelation for the June average minimum temperature and the August cumulative precipitation. This implies that relying solely on traditional mean absolute SHAP values underestimates their true influence. Overcoming this aggregation bias, the IDR matrix characterized the Interaction Dominance Triangular Network composed of the August–September average temperature, the June average minimum temperature, and the August cumulative precipitation. From a rice phenological perspective, this network reflects the coupling between meteorological stress during the reproductive growth stage and the preceding climatic background of the vegetative growth stage. The high IDR (0.622) between the August–September average temperature and the June average minimum temperature indicates that the temperature effect during the heading and grain-filling stage is highly dependent on the preceding June temperature baseline. Growth delays induced by low temperatures in June alter the phenological window for subsequent high-temperature exposure. Additionally, the high IDR (0.549) between the August–September average temperature and the August cumulative precipitation reflects the nonlinear amplification that occurs when high-temperature-induced premature maturity and insufficient sunlight overlap during the heading window. This is consistent with the local composite disaster records discussed in Section 4.1. These findings have important implications for agrometeorological disaster early warning. Traditional single-factor threshold warnings pose a systematic underestimation risk when confronted with this interaction dominance triangle. For high-IDR factors, meteorological services should avoid issuing isolated temperature or precipitation forecasts. Instead, a Composite Meteorological Index based on multi-factor combinations should be established, such as a dynamic heat damage index incorporating preceding temperature anomaly correction terms.

To contextualize our findings within a broader geographical framework, it is essential to compare the identified meteorological thresholds and interaction mechanisms with those reported in other global rice-growing regions. For instance, our study identified a ZCT of 26.2 °C for the August–September average temperature (TAVG_8_9) in Ningbo. This aligns closely with global meta-analyses showing that critical thermal thresholds for rice during the reproductive stage typically range from 26 °C to 28 °C, beyond which heat-induced sterility and yield reduction accelerate non-linearly [32,33]. Similarly, the strong interaction dominance between high temperature and excessive precipitation (IDR = 0.549 for TAVG_8_9 × R2020_8) corroborates global observations of compound extreme events. Previous studies have demonstrated that the co-occurrence of heat stress and waterlogging amplifies yield losses far beyond the sum of their individual effects—a phenomenon increasingly recognized in monsoon-affected regions of South and Southeast Asia [34]. The cross-regional consistency of these thresholds and interactions confirms that the physical mechanisms captured by our model are not localized statistical artifacts, but rather reflect fundamental physiological responses of rice to climatic stress.

Regarding methodological advancements, the proposed ZCT-DET-IDR framework demonstrates distinct advantages over conventional interpretable machine learning approaches, though it is not without limitations. Traditional Partial Dependence Plots (PDP) and Accumulated Local Effects (ALE) plots are effective in visualizing average marginal effects but often obscure threshold behaviors by averaging across instances [9]. While standard SHAP dependence plots can reveal Zero-Crossing Thresholds (ZCT), they do not capture qualitative shifts in effect intensity that occur without a directional reversal, a gap filled by our Derivative Extrema Threshold (DET) approach. For example, in predicting crop yields under climate extremes, researchers frequently note diminishing marginal returns or effect saturation of certain agronomic factors but lack a quantitative tool to locate the exact inflection point [3]. DET locates these inflection points via first-derivative extrema, providing an “intensity early warning” that ZCT misses. Furthermore, compared to the purely qualitative interpretation of SHAP interaction heatmaps common in existing literature, the Interaction Dominance Ratio (IDR) provides a normalized, dimensionless metric that enables cross-comparison of interaction dominance across feature pairs. Absolute SHAP interaction values are inherently constrained by feature scales and thus cannot differentiate between “strong interaction with weak main effect” and “weak interaction with strong main effect” [35]; IDR eliminates this dimensional interference. However, a limitation of IDR is that its current grading thresholds (0.30 and 0.50) are calibrated analogously to Cohen’s effect size benchmarks. While empirically functional in this dataset, these cut-offs may require regional recalibration. Future simulation studies with known ground-truth interaction strengths are necessary to validate the universality of these specific boundaries.
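The grading rule in Table 1 can be written as a small function. In the sketch below, the ratio itself is computed as the P₉₀–P₁₀ span of the pairwise interaction values divided by the P₉₀–P₁₀ span of the pair's total effect. That formula is an assumption inferred from the wording of Table 1 and may differ from the exact definition given in the Methods; `idr` and `grade_idr` are illustrative names.

```python
import numpy as np


def idr(interaction, total_effect):
    """Assumed form: P90-P10 span of interaction values over that of the total effect."""
    def span(values):
        return np.percentile(values, 90) - np.percentile(values, 10)
    return span(interaction) / span(total_effect)


def grade_idr(value, moderate=0.30, strong=0.50):
    """Three-tier grade using the cut-offs stated in Table 1."""
    if value >= strong:
        return "strong"
    if value >= moderate:
        return "moderate"
    return "weak"


# IDR values reported in the text for three feature pairs.
reported = {
    "TAVG_8_9 x TMIN_6": 0.622,
    "TAVG_8_9 x R2020_8": 0.549,
    "TAVG_6_7 x R2020_8": 0.390,
}
grades = {pair: grade_idr(value) for pair, value in reported.items()}

for pair, grade in grades.items():
    print(f"{pair}: IDR = {reported[pair]:.3f} -> {grade}")
```

With the 0.30 and 0.50 cut-offs, the first two pairs grade as strong and the third as moderate. Shifting either cut-off, as a regional recalibration might, only requires changing the two keyword arguments.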

A fundamental caveat of this study is that the identified thresholds and interaction networks characterize the conditional response modes of the machine learning model, not direct causal crop responses. While the core thresholds (e.g., 26.2 °C for TAVG_8-9, 210.6 mm for R2020_8) demonstrate high consistency with local agrometeorological records, SHAP-based attributions are ultimately correlational. Unobserved confounding factors, such as concurrent shifts in agronomic management practices or soil properties, may partially be captured by the model and erroneously attributed to meteorological drivers. Therefore, translating these model-identified statistical transition points into standard agrometeorological management protocols requires rigorous validation through causal evidence from controlled experiments, such as artificial climate chambers or staggered planting trials.

4.3. Limitations and Future Work

Several limitations of the present study should be acknowledged:

(1) Regarding the temporal feature expansion strategy, the generation of 216 meteorological variables from only six base variables inevitably introduces multicollinearity and redundancy due to overlapping temporal windows. Although LASSO regression was employed for preliminary feature screening, LASSO tends to select one variable from a group of highly correlated variables and discard the others, which may not always retain the most agronomically meaningful representative. Future studies could consider variance inflation factor (VIF) analysis or principal component analysis (PCA) as complementary screening steps to more explicitly address multicollinearity.

(2) Regarding the robustness of the DET methodology, the DET results are sensitive to the choice of spline interpolation method, smoothing parameter, and IQR filtering criterion. The cubic spline interpolation with smoothing parameter s = 0 may produce oscillation artifacts in sparse data regions, as acknowledged in the Results section where several DET points were identified as uncertainty markers originating from spline extrapolation. Alternative smoothing strategies (e.g., LOESS, kernel regression, or different spline smoothing parameters) may yield different threshold patterns. The finding that all 11 factors exhibited single zero-crossing behavior without multiple threshold structures may also be influenced by the smoothing strategy or the relatively limited sample size. Future work should systematically evaluate the sensitivity of DET results to these methodological choices through robustness experiments with varying smoothing parameters and interpolation methods.

(3) Regarding the statistical significance of model improvements, the optimized LightGBM model achieved an R² of 0.833 after reducing the feature set from 15 to 11 variables, compared to 0.809 with the full feature set. While this improvement is consistent with the expected benefit of removing noise features, the absence of bootstrap uncertainty intervals or repeated experimental runs makes it difficult to determine whether this improvement is statistically significant or simply due to sampling variability. Similarly, the dominance of August precipitation (R2020_8) as the most influential feature has not been evaluated for stability across different training/testing partitions or alternative feature selection procedures. A stability analysis using bootstrap resampling or multiple random splits would strengthen confidence in the reported importance rankings.

(4) Regarding spatial heterogeneity, the interaction analysis identified a strong triangular interaction network among TAVG_8_9, TMIN_6, and R2020_8 at the aggregate level across all nine districts. However, whether these interactions are consistent across all districts within Ningbo or whether spatial heterogeneity exists has not been explored. Coastal districts (e.g., Xiangshan) and inland districts (e.g., Yuyao) may exhibit different interaction patterns due to variations in topography, soil type, and microclimate. Future work should conduct district-level interaction analyses to explore regional variability and improve the practical applicability of the framework.

(5) Regarding generalizability, the applicability of the proposed “ZCT-DET-IDR” framework beyond the Ningbo rice production system remains unclear. The framework was developed and validated using data from a single region with a specific crop type, climatic regime, and agricultural practice. External validation using independent datasets from other crops (e.g., wheat, maize) and climatic regions (e.g., arid, continental) would be necessary to demonstrate the transferability of the methodology. The absence of comparative analyses with alternative threshold identification techniques, ablation studies, or sensitivity analyses further limits the demonstration of the practical superiority or robustness of the DET and IDR methods. Nevertheless, the mathematical formulations of DET and IDR are not crop- or region-specific, and their potential applicability to other XAI interpretation scenarios based on tree models provides a reasonable basis for cautious optimism regarding cross-domain portability.
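The stability analysis suggested in limitation (3) could be sketched as a paired bootstrap over test-set cases. The example below resamples a held-out set with replacement and reports a percentile interval for the difference in R² between a reduced and a full feature model. The data are synthetic, the test-set size of 54 is hypothetical, and `bootstrap_r2_diff` is an illustrative name; this is not the procedure used in the study.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def bootstrap_r2_diff(y_true, pred_full, pred_reduced, n_boot=2000, seed=0):
    """Percentile 95% interval for R2(reduced) - R2(full) under paired resampling."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        i = rng.integers(0, n, size=n)  # resample test cases with replacement
        diffs[b] = (r2_score(y_true[i], pred_reduced[i])
                    - r2_score(y_true[i], pred_full[i]))
    return np.percentile(diffs, [2.5, 97.5])


# Synthetic stand-ins for held-out yields and the two models' predictions.
rng = np.random.default_rng(1)
y_test = rng.normal(0.0, 0.5, size=54)
pred_full = y_test + rng.normal(0.0, 0.22, size=54)      # hypothetical 15-feature model
pred_reduced = y_test + rng.normal(0.0, 0.20, size=54)   # hypothetical 11-feature model

ci = bootstrap_r2_diff(y_test, pred_full, pred_reduced)
print(f"95% interval for the R2 difference: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

If such an interval includes zero, the reported gain from 0.809 to 0.833 cannot be separated from sampling variability on that test set. The same resampling loop could be reused to track how often R2020_8 remains the top-ranked feature.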

5. Conclusions

Following dual screening via LASSO and SHAP-Recursive Feature Elimination (RFE), the LightGBM model incorporating 11 core meteorological factors demonstrated optimal performance in predicting rice yield in the Ningbo region (R² = 0.833), achieving a synergistic enhancement of both prediction accuracy and model interpretability. Building upon this, the proposed Derivative Extrema Threshold (DET) successfully addressed the limitation of the traditional Zero-Crossing Threshold (ZCT) by identifying intensity mutation characteristics. Furthermore, the Interaction Dominance Ratio (IDR) facilitated the horizontal quantitative grading of interaction effects and accurately characterized the Interaction Dominance Triangular Network. Through cross-validation against local agrometeorological experiment records and disaster survey reports in Ningbo, the core thresholds extracted by the model (e.g., 11.6 °C for the March average temperature, 26.2 °C for the August–September average temperature, and 210.6 mm for the August precipitation) exhibited high consistency with the critical conditions of actual regional rice disasters. This confirmed that the “ZCT-DET-IDR” framework is capable of more than statistical inflection point detection; the identified thresholds correspond directly to disaster occurrence boundaries with distinct physical and agronomic significance. Consequently, this framework provides a decision-making foundation for the refined early warning of regional agrometeorological disasters, integrating mathematical rigor with practical applicability.

From the perspective of methodological advancement, the “ZCT-DET-IDR” framework constructed in this study possesses potential cross-domain portability. The DET addresses the challenge of identifying qualitative changes in effect intensity in the absence of a directional reversal. Concurrently, the IDR resolves the issue regarding the de-scaled comparison of interaction intensities across different feature pairs. These methodological challenges are not exclusive to rice meteorological prediction; they are prevalent in Explainable Artificial Intelligence (XAI) interpretation scenarios based on tree models. However, because the “ZCT-DET-IDR” framework explains model predictions rather than proving causal crop responses, and the IDR grading thresholds may require regional recalibration, the generalizability of these findings must be validated through controlled agronomic experiments and cross-regional datasets. Future work should focus on integrating causal inference methods and multi-source datasets to refine this framework into a standardized “threshold-interaction” dual-metric toolbox for XAI.

Author Contributions

Conceptualization, C.L.; methodology, C.L. and Z.Y.; software, C.L. and Z.Y.; validation, C.L.; formal analysis, C.L. and Z.Y.; writing—original draft preparation, C.L.; writing—review and editing, C.L. and S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available upon reasonable request to the Ningbo Meteorological Bureau and Ningbo Municipal Statistics Bureau.

Acknowledgments

The manuscript was translated into English using GLM (version 5.0).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

BPNN	Back Propagation Neural Network
HP	Hodrick–Prescott
IQR	Interquartile Range
MLR	Multiple Linear Regression
RBF	Radial Basis Function
RF	Random Forest
RFE	Recursive Feature Elimination
SHAP	SHapley Additive exPlanations
SVR	Support Vector Regression
XAI	Explainable Artificial Intelligence
ZCT	Zero-Crossing Threshold
DET	Derivative Extrema Threshold
IDR	Interaction Dominance Ratio

References

1. Fang, F.; Wang, J.; Jia, J.Y.; Wang, X.; Huang, P.C.; Yin, F.; Lin, J.J. Research progress on statistical forecast of crop meteorological yield in China. Arid. Zone Res. 2025, 42, 730–742.
2. Liu, W.F.; Bai, Y.; Du, T.; Li, M.; Yang, H.; Chen, S.; Liang, C.; Kang, S. Research progress on regional-scale crop growth and associated process models. Sci. China Earth Sci. 2025, 55, 669–685. (In Chinese)
3. Van Klompenburg, T.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Comput. Electron. Agric. 2020, 177, 105709.
4. Wang, H.X.; Yu, Z.Z.; Li, H.L.; Wang, C.; Yan, X.L.; Zou, H.F. Yield prediction of fresh corn based on GA-BP neural network. Chin. J. Agric. Mech. 2024, 45, 156–162.
5. Ranasinghe, N.; Wijesinghe, K.D.; Dassanayake, K.B.; Herath, H.M.T. Interpretability and accessibility of machine learning in selected food processing, agriculture and health applications. J. Natl. Sci. Found. Sri Lanka 2022, 50, 263–276.
6. Hu, T.; Zhang, X.; Bohrer, G.; Liu, Y.; Zhou, Y.; Martin, J.; Li, Y.; Zhao, K. Crop yield prediction via explainable AI and interpretable machine learning: Dangers of black box models for evaluating climate change impacts on crop yield. Agric. For. Meteorol. 2023, 336, 109458.
7. Pai, D.G.; Balachandra, M.; Kamath, R. Explainable AI in agriculture: Review of applications, methodologies, and future directions. Eng. Res. Express 2025, 7, 032202.
8. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
9. Apley, D.W.; Zhu, J. Visualizing the effects of predictor variables in black box supervised learning models. J. R. Stat. Soc. Ser. B 2020, 82, 1059–1086.
10. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
11. Peters, J.; Janzing, D.; Schölkopf, B. Elements of Causal Inference; MIT Press: Cambridge, MA, USA, 2017.
12. Cai, Y.; Gu, C.; Chen, T. An interpretable Machine Learning model for crop yield prediction based on ensemble learning and SHAP. Comput. Electron. Agric. 2021, 187, 106275.
13. Zhang, J.; Chen, C.; Pu, Z. An interpretable Machine Learning framework for winter wheat yield prediction. Remote Sens. 2021, 13, 4689.
14. Wang, P.X.; Wang, Y.; Tian, H.R.; Wang, J.; Liu, M.J.; Quan, W.T. Winter wheat yield estimation and interpretability research based on LightGBM. Trans. Chin. Soc. Agric. Mach. 2023, 54, 197–206. (In Chinese with English abstract)
15. Xia, C.; Ren, C.; Wang, Y.; Wang, Z.; Li, X.; Zhang, J.; Liu, Y.; Wang, H.; Zhao, Y.; Wang, Z. Decoding soil-topography buffering of maize yield spatial heterogeneity in extreme precipitation year using Sentinel-2 data and SHAP interpretability. Field Crops Res. 2026, 337, 110263.
16. Qiao, B.; Yang, H.; Cao, X.; Zhou, B.; Wang, N. Driving mechanisms and threshold identification of landscape ecological risk: A nonlinear perspective from the Qilian Mountains, China. Ecol. Indic. 2025, 173, 113342.
17. Wang, F.T. Theory and Method of Crop Meteorological Yield Forecast; Meteorological Press: Beijing, China, 1984. (In Chinese)
18. Luo, M.S.; Jing, Y.S.; Xiong, S.W. Rice meteorological yield forecast model based on genetic optimization BP neural network. Sci. Meteorol. Sin. 2012, 32, 665–670.
19. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30, pp. 4765–4774.
20. Hou, L.P.; He, P.; Fan, X.S.; Xu, J.; Ren, Y.; Li, D.K. Review on methods for determining ecological thresholds. Chin. J. Appl. Ecol. 2021, 32, 711–718.
21. Zern, A.; Broelemann, K.; Kasneci, G. Interventional SHAP values and interaction values for piecewise linear regression trees. In Proceedings of the AAAI Conference on Artificial Intelligence; Association for the Advancement of Artificial Intelligence: Washington, DC, USA, 2023; Volume 37, pp. 11164–11173.
22. Qian, S.S.; King, R.S.; Richardson, C.J. Two statistical methods for the detection of environmental thresholds. Ecol. Model. 2003, 165, 13–23.
23. Friedman, J.H.; Popescu, B.E. Predictive learning via rule ensembles. Ann. Appl. Stat. 2008, 2, 916–954.
24. Sobol, I.M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 2001, 55, 271–280.
25. Cohen, J. Statistical Power Analysis for the Behavioral Sciences; Lawrence Erlbaum Associates, Publishers: Hillsdale, NJ, USA, 1988.
26. Kotrlik, J.W.; Williams, H.A.; Jabor, M.K. Reporting and Interpreting Effect Size in Quantitative Agricultural Education Research. J. Agric. Educ. 2011, 52, 132–142.
27. Weinerová, J.; Szűcs, D.; Ioannidis, J.P.A. Published correlational effect sizes in social and developmental psychology. R. Soc. Open Sci. 2022, 9, 220311.
28. Zhang, Q.-M.; Yi, Y.-H.; Liao, M.-T.; Guo, S.-L. Analysis on Safe Sowing Date of Double-cropping Early Rice with Different Seedling Raising Methods in Jiangxi under Climate Warming. Chin. J. Agrometeorol. 2022, 43, 893–901.
29. Wu, L.H.; Lou, W.P.; Yao, Y.P.; Mao, Y.D.; Su, G.L. High temperature damage index and impact assessment during heading and flowering stage of single-season rice in Zhejiang Province. Chin. J. Agrometeorol. 2009, 30, 582–586. (In Chinese)
30. Yao, Y.P.; Wu, L.H.; Jin, Z.F. Design of high temperature damage insurance index for single-season rice in Zhejiang Province based on disaster loss assessment. Sci. Agric. Sin. 2013, 46, 2342–2351. (In Chinese)
31. Jin, Z.F.; Yao, Y.P.; Li, R.Z. Effects of climate change on growth, development and yield of single-season rice in Zhejiang Province. Chin. J. Agrometeorol. 2015, 36, 58–66. (In Chinese)
32. Jagadish, S.V.K.; Craufurd, P.Q.; Wheeler, T.R. High temperature stress and spikelet fertility in rice (Oryza sativa L.). J. Exp. Bot. 2010, 61, 493–502.
33. Zhao, C.; Liu, B.; Piao, S.; Wang, X.; Lobell, D.B.; Huang, Y.; Huang, M.; Yao, Y.; Bassu, S.; Ciais, P.; et al. Temperature increase reduces global yields of major crops in four independent estimates. Proc. Natl. Acad. Sci. USA 2017, 114, 9326–9331.
34. Zhu, X.; Chen, J.; Xie, Z.; Feng, M.; Zhang, C. Interactive effects of heat stress and waterlogging on rice yield and quality. Front. Plant Sci. 2021, 12, 715941.
35. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed.; Christoph Molnar: Munich, Germany, 2022; Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 1 May 2024).

Figure 1. The location of Ningbo within China and 9 weather stations in Ningbo.

Figure 2. HP–Filter Decomposition.

Figure 3. Correlation and Distribution.

Figure 4. Comparison of rice Meteorological Yield prediction performance among six machine learning models on the testing set.

Figure 5. Dynamic variations in R² and RMSE during the SHAP feature importance-based Recursive Feature Elimination process for the LightGBM model.

Figure 6. SHAP value distribution beeswarm plots of meteorological factors before and after feature optimization ((Left): 15 full features; (Right): 11 optimal features).

Figure 7. Contribution proportions of the mean absolute SHAP values for the 11 optimized core meteorological factors in the LightGBM model.

Figure 8. Distribution of the Zero-Crossing Thresholds (ZCT) and their 95% confidence intervals for the 11 meteorological factors within the SHAP dependence plots.

Figure 9. Distribution and direction indication of the Derivative Extrema Thresholds (DET) for the 11 meteorological factors within the SHAP dependence plots.

Figure 10. Decomposition of the proportions of pure main effects, pure interaction effects, and effect cancelation for each meteorological factor.

Figure 11. Heatmaps of the absolute interaction strength and the Interaction Dominance Ratio (IDR) for meteorological factor pairs. (a) Absolute Interaction Strength (Lower Triangle); (b) IDR: Interaction Dominance Ratio.

Figure 12. Two-dimensional classification paradigm space of meteorological factor pairs based on “absolute interaction amplitude—Interaction Dominance Ratio”.

Figure 13. Three-dimensional SHAP interaction effect surface plots for three typical interaction feature pairs ((a) strong interaction dominance—high amplitude; (b) main effect dominance—high amplitude; (c) low contribution feature pair).

Table 1. IDR Grading Standards.

IDR Range	Interaction Level	Description
≥0.50	Strong interaction dominance	The interaction variation span (P₉₀–P₁₀) reaches or exceeds half of the dispersion of the total effect. The synergistic or antagonistic effect of the feature pair is dominant, rendering traditional single-factor main effect interpretations invalid.
0.30–0.50	Moderate interaction dominance	The interaction variation span accounts for 30% to 50% of the dispersion of the total effect. A substantive interaction mechanism exists between the features of the pair, but it does not completely mask their independent pure main effect contributions.
<0.30	Weak interaction dominance	The interaction variation span is less than 30% of the dispersion of the total effect. The behavior of the feature pair is primarily driven by the respective independent main effects, and the interaction can be regarded as a secondary perturbation or statistical noise.

Table 2. Descriptive statistics of rice yield variables and the 11 optimized meteorological features (N = 270).

Variable	Unit	Mean	Std	Min	Q1	Median	Q3	Max	CV (%)
TAVG_3	°C	11.476	1.02	8.517	10.77	11.505	12.103	14.5	8.889
TAVG_3-5	°C	16.489	0.815	14.748	15.901	16.435	17.056	19	4.944
TAVG_6	°C	24.503	0.856	22.155	23.978	24.547	25.044	26.886	3.494
TAVG_6-7	°C	26.601	0.759	24.5	26.072	26.589	27.141	28.407	2.852
TAVG_8-9	°C	26.237	0.668	24.686	25.765	26.154	26.715	28	2.546
TMIN_6	°C	21.742	0.806	19.5	21.201	21.734	22.304	23.5	3.706
R2020_3-6	mm	454.413	74.906	280	401.681	452.713	510.446	650	16.484
R2020_4-7	mm	539.718	94.305	330	480.756	539.545	602.205	800	17.473
R2020_5	mm	110.522	38.453	30	83.537	110.668	137.309	210	34.792
R2020_8	mm	213.546	77.022	50	155.46	214.518	264.115	448.821	36.068
R2020_10	mm	81.318	37.731	15	56.341	77.237	105.51	190	46.399
Met_Yield	t·hm⁻²	0	0.5	−2.118	−0.194	0.078	0.26	1.487	140.302
Actual Yield	t·hm⁻²	6.768	0.735	4.08	6.383	6.811	7.293	8.861	10.864
Trend Yield	t·hm⁻²	6.768	0.498	5.666	6.389	6.809	7.11	8.215	7.354

Note: CV for Meteorological Yield is omitted as its mean approaches zero; the standard deviation alone characterizes its dispersion.
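
As a brief worked check of the CV column, the sketch below assumes the usual definition CV = Std / |Mean| × 100 and returns no value when the mean is effectively zero, as the note describes for the meteorological yield. The function name and the `eps` tolerance are illustrative assumptions. Because the inputs are the rounded table entries, the results can differ from the published CV in the last digit.

```python
def coefficient_of_variation(mean: float, std: float, eps: float = 1e-6):
    """Return the CV in percent, or None when the mean is too close to zero.

    Assumption: CV = std / |mean| * 100 (the standard definition).
    """
    if abs(mean) < eps:
        return None  # CV is undefined (unbounded) for a near-zero mean
    return std / abs(mean) * 100.0


# Mean and Std taken from Table 2.
cv_tavg3 = coefficient_of_variation(11.476, 1.02)    # TAVG_3
cv_r8 = coefficient_of_variation(213.546, 77.022)    # R2020_8
cv_met = coefficient_of_variation(0.0, 0.5)          # Met_Yield, mean of 0

print(round(cv_tavg3, 2), round(cv_r8, 2), cv_met)
```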

Table 3. Pearson correlation coefficients between the 11 meteorological features and the meteorological yield (N = 270).

Feature	Pearson r	p-Value	Significance
R2020_4-7	0.047	0.442874406	ns
TMIN_6	0.009	0.888561883	ns
TAVG_8-9	0.004	0.945626161	ns
TAVG_3	−0.008	0.890635898	ns
R2020_3-6	−0.023	0.702097451	ns
TAVG_3-5	−0.027	0.663850397	ns
TAVG_6-7	−0.032	0.603479772	ns
R2020_10	−0.032	0.602347436	ns
R2020_5	−0.038	0.536809297	ns
R2020_8	−0.074	0.227442084	ns
TAVG_6	−0.074	0.223541597	ns

Note: ns, not significant.

Table 4. Zero-Crossing Thresholds and Intervals for the 11 Features Based on SHAP.

Feature Factor	ZCT	95% Confidence Interval	Unit
R2020_10	83.3	[81.5, 85.7]	mm
R2020_3-6	451.3	[447.4, 452.7]	mm
R2020_4-7	542.4	[523.5, 549.5]	mm
R2020_5	113.1	[95.7, 116.8]	mm
R2020_8	210.6	[205.8, 215.9]	mm
TAVG_3-5	16.8	[16.3, 16.8]	°C
TAVG_3	11.6	[11.5, 11.7]	°C
TAVG_6-7	26.1	[25.5, 26.2]	°C
TAVG_6	24.5	[24.5, 24.6]	°C
TAVG_8-9	26.2	[26.2, 26.2]	°C
TMIN_6	21.7	[21.6, 21.7]	°C
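
The idea behind a zero-crossing threshold can be illustrated on a sampled curve. The sketch below is an assumption-laden simplification, not the procedure used in the paper: it takes a smoothed SHAP dependence curve already evaluated on an ascending grid (plain Python lists), and locates each sign change by linear interpolation between neighbouring grid points. The function name `zero_crossings` is hypothetical, and the confidence intervals reported in Table 4 are not reproduced here.

```python
def zero_crossings(x, y):
    """Return the x positions where the sampled curve y(x) crosses zero.

    x and y are equal-length lists, with x in ascending order.
    Sign changes are located by linear interpolation; grid points where
    the curve is exactly zero are reported as well.
    """
    crossings = []
    for i in range(len(x) - 1):
        x0, y0, x1, y1 = x[i], y[i], x[i + 1], y[i + 1]
        if y0 == 0.0:
            crossings.append(x0)
        elif y0 * y1 < 0:
            # Root of the straight line through (x0, y0) and (x1, y1).
            crossings.append(x0 - y0 * (x1 - x0) / (y1 - y0))
    if y and y[-1] == 0.0:
        crossings.append(x[-1])
    return crossings


# Synthetic curve for illustration only (not data from the study).
grid = [0.0, 1.0, 2.0, 3.0]
smoothed_shap = [-2.0, -1.0, 1.0, 3.0]
thresholds = zero_crossings(grid, smoothed_shap)
print(thresholds)
```

Linear interpolation is only as accurate as the grid is fine, so a dense evaluation grid is advisable when the curve is steep near the crossing.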

Share and Cite

MDPI and ACS Style

Lin, C.; Yan, Z.; Miao, S. Identifying Nonlinear Thresholds and Interaction Dominance of Meteorological Drivers on Rice Yield: A SHAP-Based Approach. Atmosphere 2026, 17, 599. https://doi.org/10.3390/atmos17060599