Dynamic Evaluation of Tillage–Residue Management Systems and Maize Yield Prediction via Multi-Source Data Fusion and Mixed-Effects Modeling

Zhang, Zhenzi; Gan, Miao; Li, Na; Dong, Jun; Liu, Yang; Hou, Zhiyan; Yue, Xingyu; Dong, Zhi

doi:10.3390/agronomy16050584


by Zhenzi Zhang ¹,², Miao Gan ¹, Na Li ³, Jun Dong ¹,², Yang Liu ¹,², Zhiyan Hou ¹,², Xingyu Yue ⁴ and Zhi Dong ¹,²,*

¹ Liaoning Academy of Agricultural Science, Shenyang 110161, China

² Liaoning Key Laboratory of Conservation Tillage in Dry Land, Shenyang 110161, China

³ College of Land and Environment, Shenyang Agricultural University, Shenyang 110866, China

⁴ Fuxin Modern Agricultural Development Service Center, Fuxin 123000, China

* Author to whom correspondence should be addressed.

Agronomy 2026, 16(5), 584; https://doi.org/10.3390/agronomy16050584

Submission received: 2 February 2026 / Revised: 21 February 2026 / Accepted: 23 February 2026 / Published: 8 March 2026

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Tillage–residue management is a controllable lever for improving maize yield and system resilience under climate variability. Here we propose a mixed-effects spatiotemporal learning framework (ME-LSTM) that integrates multi-source observations to enable robust yield prediction and management system evaluation across heterogeneous sites and years. First, we construct multi-year sliding-window inputs to represent legacy effects and cumulative influences of past management and environment. Second, a deep temporal encoder learns nonlinear dependencies from climate–soil–remote-sensing sequences to enhance interannual extrapolation. Third, a mixed-effects module explicitly separates management fixed effects from hierarchical random effects (e.g., source/study, site, year, and plot), absorbing source-specific biases and unobserved heterogeneity while improving interpretability. Finally, we parameterize management × climate/soil interactions to quantify system-specific sensitivities to environmental drivers and to support scenario-based comparison and recommendation of management options. Across multi-ecological maize datasets, ME-LSTM achieved an R² of 0.8989 with an RMSE of 309.83 kg ha⁻¹ on the test set. Ablation analyses show that removing remote-sensing features or ground-based temporal information substantially degrades performance, confirming the complementary value of multi-source fusion. Benchmarking against strong temporal baselines (LSTM, GRU, BiGRU, and Transformer) further demonstrates consistent accuracy gains of ME-LSTM, highlighting its suitability for small-sample, noisy, and hierarchically structured agricultural data. Overall, ME-LSTM provides an interpretable and scalable tool for climate-adaptive optimization of tillage–residue management and supports robust, actionable decision-making across diverse agro-ecological conditions.

Keywords:

conservation tillage; residue management; maize yield; multi-source data fusion; mixed-effects modeling; LSTM

1. Introduction

Food security is a cornerstone of national security, and accurate yield forecasting is essential for safeguarding food supply, stabilizing markets, and supporting agricultural policy-making [1]. However, the intensifying climate crisis and the increasing frequency of extreme weather events have made crop yield responses to meteorological variability highly stochastic and nonlinear [2,3]. Within cropping systems, management practices represent one of the few anthropogenic drivers capable of mitigating these climatic risks. In maize-based systems, no-till with residue mulch (NTS) and plow tillage with residue incorporation (PTS) are widely implemented strategies that can alter yield outcomes by regulating soil moisture buffering and nutrient supply. Evidence from syntheses and meta-analyses indicates that residue return and reduced tillage influence both soil organic carbon (SOC) and crop yields. However, the magnitude and direction of these effects vary substantially across agro-ecological contexts [4]. Therefore, yield models must go beyond static predictions; they must be capable of distinguishing between tillage–residue management systems and explicitly quantifying their dynamic interactions with environmental conditions to support actionable, climate-smart decision-making [5,6]. Positioning such models within the broader agronomic and ecological literature is particularly important because management effects are often context-dependent and may shift under changing precipitation regimes and climate extremes.

A broad range of approaches has been explored for crop yield prediction. Early work relied on time-series and classical statistical models (e.g., ARIMA and multiple regression), which are computationally efficient but often insufficient for capturing nonlinear and nonstationary yield responses [7,8]. With advances in machine learning, methods such as random forests, support vector machines, and neural networks have been increasingly adopted [9,10]. Shook et al. [11], for instance, developed an attention-enhanced LSTM model integrating pedigree information and meteorological time series, improving both accuracy and interpretability for key phenological stages. Beyond closely related studies, sequence models such as LSTM/Bi-LSTM and CNN–RNN hybrids have been widely applied in agriculture to learn temporal dependencies from weather and remote-sensing time series for yield estimation and related decision-support tasks [12,13,14]. To better represent spatiotemporal crop dynamics, remote sensing has been incorporated into yield forecasting pipelines. Zhuo et al. [12] assimilated MODIS LAI into the WOFOST model using four-dimensional variational methods and parameter optimization, improving regional winter wheat yield prediction and reducing uncertainty. Dhakar et al. [13] integrated crop modeling, remote sensing, and weather forecasts via LAI assimilation and an ensemble Kalman filter, enhancing predictions of wheat phenology and yield while reducing reliance on intensive field management inputs. More recently, transformer-based architectures have been introduced to better capture long-range temporal dependencies in remotely sensed time series, showing improved yield estimation performance in satellite time series [14], and transfer-learning strategies have been explored to address data scarcity [15]. 
Notably, recent work has emphasized multimodal learning and transfer across regions (e.g., deep transfer learning for yield prediction with remote sensing [16]) and attention/Transformer-based multimodal data fusion for yield forecasting [17], reflecting rapid progress in this fast-moving field.

Despite these methodological advances, critical modeling gaps remain. In particular, common deep-learning training and evaluation practices implicitly treat samples as independent, whereas agricultural data are hierarchical and correlated (e.g., observations nested within studies, sites, years, and plots). Most existing deep learning studies treat management as a static background factor rather than an explicit decision variable. Furthermore, they often fail to decouple the “fixed effects” of management practices from the “random effects” arising from site-specific soil heterogeneity and interannual climate variability [18,19]. Foundational and widely cited references in ecology and applied statistics have established best practices for mixed-effects modeling, including guidance on specifying random effects, avoiding common pitfalls, and interpreting fixed vs. random components [20,21]. Crucially, when yield observations are nested within plots/fields and years, standard LSTM or Transformer models lack an explicit mechanism to disentangle whether an observed yield difference is driven by management or by unobserved site- or year-level effects (e.g., soil heterogeneity and year-specific weather shocks). As a result, these models may attribute site-specific noise or year-specific anomalies to a management label instead of learning the generalized response of yield to management–climate interactions, leading to confounding and limited transferability across locations and years. Although meta-analyses suggest that residue return increases water-use efficiency on average, the magnitude and direction of these benefits are strictly context-dependent [22,23]. 
As a result, a central challenge persists: how to integrate multi-source observations into a unified framework that not only captures nonlinear temporal dynamics but also statistically accounts for hierarchical heterogeneity, thereby enabling scenario-specific recommendations under variable climates. This challenge motivates the use of a mixed-effects formulation that can separate management fixed effects from hierarchical random effects while explicitly modeling management–environment interactions. Recent discussions on the reliability of mixed-effects specifications (e.g., treating grouping factors as fixed vs. random) further highlight the need for transparent hierarchical modeling choices when integrating heterogeneous datasets [24].

To address these gaps, we propose a mixed-effects spatiotemporal learning framework (ME-LSTM) for dynamic evaluation of tillage–residue management systems and maize yield prediction. Using a pooled dataset consisting of long-term multi-site experiments and harmonized literature records, we evaluate three representative systems (RT, NTS, and PTS). The core originality is an end-to-end coupling of a temporal deep encoder with a hierarchical mixed-effects layer (study/source–site–year–plot), which explicitly separates management fixed effects from multi-source bias and unobserved heterogeneity. This design enables the model to learn multi-year legacy effects while retaining statistical interpretability and transferability across regions and years. Specifically, ME-LSTM (1) captures legacy effects via multi-year sliding-window reconstruction; (2) learns nonlinear temporal dependencies from climate–soil–remote sensing sequences; (3) quantifies system-specific sensitivities through explicit management × climate/soil interaction terms; and (4) outputs scenario-dependent, interpretable contrasts among RT/NTS/PTS that support climate-adaptive recommendations rather than static average comparisons. By embedding a widely adopted hierarchical modeling principle into modern sequence learning, the framework is positioned within both the ecological mixed-effects tradition and the recent wave of multimodal deep learning for agricultural forecasting. Because the hierarchical structure is explicit, the framework is readily extendable to additional management options, other crops, and emerging data streams (e.g., higher-frequency satellite products), providing a general blueprint for future climate-smart decision support.

2. Materials and Methods

2.1. Study Sites and Data Sources

Data used in this study were obtained from two sources: (i) a long-term field experiment conducted in Fuxin, Liaoning Province, China (42.13° N, 121.74° E) during 2017–2025; and (ii) a pooled literature dataset compiled via meta-analytic extraction from 54 eligible studies covering 30 cities with observations spanning 2006–2024.

We evaluated three representative tillage–residue management systems: conventional rotary tillage with straw removal (RT), no-till with straw mulch/retained residue (NTS), and plow tillage with straw incorporation (PTS). The pooled sites span major maize-producing provinces/regions across China (e.g., Liaoning, Jilin, Heilongjiang, Shaanxi, Gansu, Inner Mongolia, Shandong, Henan, Shanxi, and Hebei), covering the main maize agro-ecological types; where reported, original trials followed a randomized complete block design with three replicates (Figure 1).

After harmonization, quality control, and sliding-window reconstruction, the final supervised-learning dataset comprised 837 samples from 31 sites, covering nine years and three management systems (RT/NTS/PTS). Specifically, 81 samples were derived from our Fuxin field experiment and 756 samples were extracted from the literature dataset. To enable three-year window reconstruction, the literature records were retained only when each city (or study site) provided at least three consecutive years of maize cultivation with the required outcomes and covariates, allowing for the construction of consistent three-year windows. Across both sources, grain yield was standardized to kg ha⁻¹, meteorological variables were summarized over the maize growing season for each year, and soil variables were aligned to the 0–20 cm layer whenever feasible. Records were excluded if essential metadata (site/year/management description) were missing, if treatments could not be unambiguously mapped to RT/NTS/PTS, or if cross-source harmonization was not possible.

For model development and evaluation, we used a strictly time-ordered split to prevent temporal leakage. After all three-year windows were constructed, samples were ordered by the window-ending year (i.e., the year associated with the target yield) and then split into training/validation/test sets with an 8:1:1 ratio. The earliest 80% of samples were used for model fitting (training set), the subsequent 10% were used for hyperparameter tuning and early stopping (validation set), and the most recent 10% were reserved as an independent test set for final performance reporting. This protocol ensures that no future-year information is used during training or model selection.
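The time-ordered 8:1:1 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `end_year` field name, the toy sample list, and the use of integer truncation for the split sizes are assumptions.

```python
def time_ordered_split(samples, train_frac=0.8, val_frac=0.1):
    """Split window samples chronologically by their window-ending year."""
    # Stable sort: samples sharing an end year keep their original order.
    ordered = sorted(samples, key=lambda s: s["end_year"])
    n = len(ordered)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = ordered[:n_train]                      # earliest 80%
    val = ordered[n_train:n_train + n_val]         # next 10%
    test = ordered[n_train + n_val:]               # most recent 10%
    return train, val, test


# Toy data (hypothetical): 20 windows ending between 2015 and 2024.
samples = [{"id": k, "end_year": 2015 + k // 2} for k in range(20)]
train, val, test = time_ordered_split(samples)
```

With a purely proportional cut, samples from the same window-ending year can fall on both sides of a boundary; cutting on year boundaries instead would be a stricter variant.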

Uncertainty quantification and prediction intervals. To quantify predictive uncertainty, we adopted an ensemble-based approach. Specifically, we trained $B$ ME-LSTM models using the same training protocol but different random seeds and bootstrap resamples of the training set. For each input sample, this yields an empirical distribution of predictions $\{\hat{y}^{(b)}\}_{b = 1}^{B}$. The 95% prediction interval (PI) was computed as the 2.5th and 97.5th percentiles of this distribution, and the PI width was summarized to characterize uncertainty across years and management systems. This procedure captures both data-driven variability and model instability under limited samples.
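The percentile-based interval can be illustrated with a small sketch. The "model" here is a hypothetical stand-in (a mean predictor) that replaces a full ME-LSTM training run, and the yields, ensemble size, and interpolation rule for percentiles are all assumptions made for illustration.

```python
import random


def percentile(values, q):
    """Linearly interpolated percentile, with q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def fit_mean_model(train_y):
    """Hypothetical stand-in for training one ensemble member."""
    mu = sum(train_y) / len(train_y)
    return lambda x: mu  # ignores x; a real member would use the features


def ensemble_interval(train_y, x, n_members=200, seed=0):
    """Train members on bootstrap resamples and return the 95% interval."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_members):
        resample = [rng.choice(train_y) for _ in train_y]  # bootstrap resample
        preds.append(fit_mean_model(resample)(x))
    return percentile(preds, 2.5), percentile(preds, 97.5), preds


# Made-up yields in kg/ha, for illustration only.
train_y = [9800.0, 10250.0, 11020.0, 10640.0, 11790.0, 10410.0]
lower, upper, preds = ensemble_interval(train_y, x=None)
pi_width = upper - lower
```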

Multi-source fusion strategy and scale harmonization. Because field observations and literature-derived records differ in spatial/temporal granularity and measurement protocols (i.e., “source-specific scale effects”), we integrated the two sources following a harmonization-first strategy: (1) variables were standardized to consistent units and definitions; (2) observations were aligned to the maize growing season; and (3) hierarchical random effects were introduced in the modeling stage to absorb source-specific biases and unobserved heterogeneity. Specifically, random intercepts for study/source (literature) and site/plot (field trials) allow the model to account for systematic offsets caused by differences in measurement methods, management reporting detail, and background conditions, thereby improving cross-source comparability and generalization.

Soil samples were collected after maize harvest from the 0–20 cm plow layer using an S-shaped sampling pattern. Soil organic carbon (SOC) was determined using the potassium dichromate oxidation method with external heating, and soil organic matter (SOM) was calculated from SOC using a conversion factor of 1.724. Meteorological data were obtained from automatic weather stations at each trial site, which recorded daily precipitation and air temperature throughout the maize growing season. Yield data were collected using the harvest method (plot-by-plot harvesting); grain weight was measured accurately and converted to yield per unit area.

Remote sensing data were obtained from multispectral Landsat Surface Reflectance products provided by the U.S. Geological Survey (USGS, Reston, VA, USA), including Landsat 5 TM, Landsat 7 ETM+, and Landsat 8 OLI, and were matched to each site–year during the maize growing season (2006–2025, subject to sensor availability and cloud-free observations). All images were processed as surface reflectance (radiometric calibration and atmospheric correction as provided in the product) and quality-filtered to remove clouds and cloud shadows using the accompanying QA mask. To ensure cross-sensor consistency, vegetation indices were calculated using the NIR, red, and blue reflectance bands corresponding to each sensor. Four physiologically meaningful indices (NDVI, SAVI, MSAVI, and EVI) were computed and extracted at key maize growth stages (jointing, tasseling, and grain filling), thereby forming a stage-wise time-series feature set that captures canopy dynamics. The index definitions are listed in Table 1.
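For orientation, the four indices can be computed from band reflectances as sketched below. The formulas are the commonly used textbook definitions (SAVI with soil factor L = 0.5, the closed-form MSAVI, and EVI with coefficients 2.5, 6, 7.5, and 1); they are an assumption here and should be checked against Table 1. Inputs are assumed to be reflectances already scaled to the 0–1 range.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with the usual L = 0.5."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def msavi(nir, red):
    """Modified SAVI (closed form without an explicit soil factor)."""
    return (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2


def evi(nir, red, blue):
    """Enhanced vegetation index with the standard coefficients."""
    return 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)


# Illustrative reflectances for a dense canopy (made-up values).
nir, red, blue = 0.5, 0.1, 0.05
indices = {
    "NDVI": ndvi(nir, red),
    "SAVI": savi(nir, red),
    "MSAVI": msavi(nir, red),
    "EVI": evi(nir, red, blue),
}
```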

Descriptive statistics of major soil and climatic variables under different tillage–residue management systems are summarized in Table 2. Overall, SOC (26.32 ± 2.54 g kg⁻¹) and SOM (45.38 ± 4.38 g kg⁻¹) under NTS were significantly higher than those under PTS and RT, and their coefficients of variation were lower, indicating more stable soil fertility under NTS. Mean temperature was generally comparable across systems, whereas precipitation exhibited substantial interannual variability (CV = 72.6%), highlighting pronounced climatic uncertainty and underscoring the need for adaptive, climate-resilient dynamic management research.

2.2. Sliding-Window Temporal Reconstruction and Dataset Construction

The effects of tillage–residue management systems exhibit pronounced cumulative and lagged characteristics. Yield in the current year is influenced not only by management practices in that year, but also by the history of tillage–residue management and environmental conditions in preceding years. To effectively capture such temporal dependence, we employed a sliding-window-based temporal reconstruction approach to integrate multi-year observations into supervised learning samples, thereby extracting dynamic patterns underlying yield formation. This reconstruction was applied to the pooled multi-source dataset spanning 31 sites across China, nine years of observations, and three management systems (RT, NTS, and PTS).

Specifically, a window length of three years was used. Consecutive three-year observations were reconstructed into one supervised sample to predict maize yield for the final year of the window. Within each window, three types of derived temporal features were extracted for meteorological and soil variables: (1) the three-year mean, representing the average level over the window period; (2) the three-year standard deviation, characterizing interannual variability; and (3) the temporal trend (end value minus start value), describing the direction of change within the window. Through this reconstruction, the original time series was transformed into a supervised learning format, yielding 837 samples. Because sites differ in record length and completeness, the number of reconstructed three-year windows contributed by each site is not necessarily equal. Each sample contained 42 features that jointly represent soil dynamics, climatic conditions, spectral information, and management descriptors. The original variable set within each three-year window includes two soil properties (soil organic matter and soil organic carbon), seven growing-season meteorological variables (rainfall frequency, mean precipitation, mean air temperature, maximum precipitation, minimum precipitation, maximum air temperature, and minimum air temperature), four satellite-derived vegetation indices (NDVI, SAVI, MSAVI, and EVI), and three management indicators (RT, NTS, and PTS). For continuous variables (soil, meteorological, and vegetation-index predictors), we further derived window-level summary features (_mean, _std, and _diff) to encode average state, interannual variability, and directional change, which together constitute the aggregated feature representation used in modeling. 
To avoid confusion about feature dimensionality, we define two complementary inputs: (i) a year-level time-step feature vector used as sequential input to the temporal network (LSTM branch), and (ii) a window-level aggregated feature vector (mean, standard deviation, and trend across the three-year window) used as complementary input to the mixed-effects/statistical component. This design preserves within-window temporal dependence while capturing cumulative and variability signals. By combining the sequence input and the aggregated input, the model learns both short-term dynamics and multi-year legacy effects without information leakage. Together with the strictly time-ordered split described in Section 2.1, this reconstruction ensures that no information from future years is used during model training or model selection, thereby preventing temporal leakage.
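The window-level summaries can be sketched for a single variable as follows. The function name, the input layout, and the choice of the sample standard deviation are assumptions; the source does not state which standard-deviation estimator was used. Applying the three summaries to the 13 continuous variables (2 soil, 7 meteorological, 4 vegetation indices) and adding the 3 management indicators gives the 42 features reported above.

```python
from statistics import mean, stdev


def window_features(series, window=3):
    """Build mean/std/diff summaries over consecutive-year windows.

    series: list of (year, value) pairs for one site, system, and variable,
    sorted by year. Windows with a gap in the years are skipped.
    """
    out = []
    for k in range(len(series) - window + 1):
        chunk = series[k:k + window]
        years = [y for y, _ in chunk]
        if years != list(range(years[0], years[0] + window)):
            continue  # not strictly consecutive, so no valid window
        vals = [v for _, v in chunk]
        out.append({
            "end_year": years[-1],          # year of the target yield
            "mean": mean(vals),             # average level
            "std": stdev(vals),             # interannual variability (sample std)
            "diff": vals[-1] - vals[0],     # trend: end value minus start value
        })
    return out


# Made-up growing-season precipitation (mm) with a gap after 2019.
precip = [(2017, 410.0), (2018, 380.0), (2019, 450.0), (2021, 300.0)]
features = window_features(precip)
```

Only the 2017–2019 window is produced here, because the 2018–2021 chunk is not consecutive.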

2.3. Architecture of the Mixed-Effects LSTM Model (ME-LSTM)

Crop yield formation is simultaneously influenced by continuously varying temporal variables (e.g., interannual climate and soil dynamics) and hierarchical/group-structured factors (e.g., management system, year, plot, and study/source identifiers in pooled datasets). To jointly capture nonlinear temporal dependence and hierarchical heterogeneity, we designed a hierarchical integrated modeling framework, termed the mixed-effects LSTM model (ME-LSTM). In ME-LSTM, high-level temporal features extracted by the temporal network are fused with a mixed-effects component consisting of management fixed effects, random effects for hierarchical identifiers, and explicit management–environment interaction terms. The overall model architecture is illustrated in Figure 2.

The overall model is an end-to-end neural network, whose forward propagation and final output can be interpreted as follows:

Authorship and sources of Equations (1)–(10): Equation (2) describes the Long Short-Term Memory (LSTM) unit and follows the standard formulation originally proposed by Hochreiter and Schmidhuber (1997) [29]. Equations (3) and (4) implement an additive attention mechanism applied to the LSTM hidden states, following the alignment model of Bahdanau et al. (2015) [30]. Equations (5)–(7) correspond to the linear mixed-effects predictor and adopt the conventional mixed-effects modeling structure widely used in hierarchical agricultural datasets. Equations (1) and (8) specify the end-to-end coupling strategy that fuses the deep temporal predictor with the mixed-effects component and defines the final yield estimator, and Equations (9) and (10) define the training objective; the coupling equations are proposed in this study to enable management fixed effects to be separated from hierarchical random effects within a unified learning framework.

\hat{y}_{i, t + 1} = z_{i}^{T} α + x_{t}^{T} β + (z_{i} ⊙ e_{t})^{T} γ + f_{LSTM} (S_{i, [t - 2 : t]}) + u_{study} + u_{year} + u_{plot}

(1)

where $\hat{y}_{i, t + 1}$ denotes the predicted yield for the $i$-th tillage–residue management system in year $t + 1$; $z_{i}$ is the fixed-effect design vector for tillage–residue management system $i$; $x_{t}$ is the environmental-factor vector in year $t$; and $e_{t}$ is the environmental-factor vector in year $t$ used to construct interaction terms. $α$, $β$, and $γ$ are learnable parameters corresponding to the main effects of the management system, the main effects of environmental factors, and the interaction effects, respectively. $S_{i, [t - 2 : t]}$ is the sliding-window feature sequence fed into the model, containing all observed features for management system $i$ over three consecutive years $t - 2$, $t - 1$, $t$. $f_{LSTM} (S_{i, [t - 2 : t]})$ denotes the LSTM network, which maps a length-3 sequence $S$ to a scalar representing the yield component determined by historical temporal patterns.

$u_{study} \sim N (0, σ_{s}^{2})$, $u_{year} \sim N (0, σ_{y}^{2})$, and $u_{plot} \sim N (0, σ_{p}^{2})$ are random-intercept effects for study/source, year, and plot replicate, respectively, all assumed to follow normal distributions. Here, $u_{study}$ captures source-specific/study-specific systematic offsets (e.g., differences in measurement protocols and unobserved study-level bias) when pooling multi-source data, $u_{year}$ captures interannual variability shared within a given year, and $u_{plot}$ represents within-site replicate variability. For literature records where plot identifiers are unavailable, the $u_{plot}$ term is omitted, while $u_{study}$ and $u_{year}$ remain included. Thus, the random-effects hierarchy includes study/source and year effects for all observations, with an additional plot-level effect when replicate information is available.

We build ME-LSTM as an end-to-end network that couples a temporal encoder with a mixed-effects predictor. The temporal network captures long-term dependencies and legacy effects of historical climatic and soil conditions on yield, while the mixed-effects component decomposes yield variability into fixed effects (management system and environmental covariates) and hierarchical random effects (e.g., study/source, site, year, and plot), and accommodates management–environment interaction terms for scenario-based evaluation and comparison of tillage–residue management systems.

For temporal feature extraction, a long short-term memory (LSTM) network is used to process the three-year sequence. Let the input feature matrix be $X_{t - 2 : t}$, which contains observations from three consecutive years. At each time step, the model ingests a year-level feature vector that includes the encoded management system together with the corresponding year-level covariates (soil, climate, and remote-sensing summaries). In addition, window-level aggregated features (e.g., mean, standard deviation, and trend over the three-year window) are constructed and used as complementary inputs in the mixed-effects component. Through gated mechanisms, the LSTM learns long-range dependencies, and the core update equations are:

\begin{array}{l} i_{t} = σ (W_{x i} x_{t} + W_{h i} h_{t - 1} + b_{i}) \\ f_{t} = σ (W_{x f} x_{t} + W_{h f} h_{t - 1} + b_{f}) \\ o_{t} = σ (W_{x o} x_{t} + W_{h o} h_{t - 1} + b_{o}) \\ {\tilde{c}}_{t} = \tanh (W_{x c} x_{t} + W_{h c} h_{t - 1} + b_{c}) \\ c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ {\tilde{c}}_{t} \\ h_{t} = o_{t} ⊙ \tanh (c_{t}) \end{array}

(2)

where $i_{t}$, $f_{t}$, and $o_{t}$ denote the input gate, forget gate, and output gate, respectively; $c_{t}$ is the cell state; $h_{t}$ is the hidden state; $σ$ is the sigmoid function; and $⊙$ denotes element-wise multiplication. The matrices $W_{x*}$ and $W_{h*}$ are learnable weight parameters that map the input and previous hidden state to each gate, and $b_{*}$ are the corresponding bias vectors.
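A single LSTM update of the form given in Equation (2) can be written out in plain Python as a sketch. The parameter dictionary layout (`W_xi`, `W_hi`, `b_i`, and so on), the dimensions, and the all-zero toy weights are assumptions chosen so the result is easy to verify by hand; a practical implementation would use a deep-learning library.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b row by row for list-of-lists weights."""
    return [
        sum(wx * xi for wx, xi in zip(row_x, x))
        + sum(wh * hi for wh, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step: gates, candidate state, cell state, hidden state."""
    i = [sigmoid(v) for v in affine(p["W_xi"], x, p["W_hi"], h_prev, p["b_i"])]
    f = [sigmoid(v) for v in affine(p["W_xf"], x, p["W_hf"], h_prev, p["b_f"])]
    o = [sigmoid(v) for v in affine(p["W_xo"], x, p["W_ho"], h_prev, p["b_o"])]
    g = [math.tanh(v) for v in affine(p["W_xc"], x, p["W_hc"], h_prev, p["b_c"])]
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy setup: 3 input features, 2 hidden units, all weights zero.
zeros = lambda rows, cols: [[0.0] * cols for _ in range(rows)]
params = {}
for gate in "ifoc":
    params["W_x" + gate] = zeros(2, 3)
    params["W_h" + gate] = zeros(2, 2)
    params["b_" + gate] = [0.0, 0.0]

h_new, c_new = lstm_step([0.3, -0.1, 0.7], [0.0, 0.0], [1.0, -2.0], params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved, which makes the gating arithmetic easy to check.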

To enhance the model’s ability to focus on informative historical time steps, we introduce a temporal (additive) attention module after the LSTM layer. This mechanism assigns importance weights to the sequence of LSTM hidden states within each three-year window. The resulting attention-pooled representation $h_{attn}$ is computed as a weighted sum of hidden states (Equation (4)), allowing the model to emphasize the most informative temporal signals while suppressing less relevant ones.

\begin{array}{l} e_{t'} = u^{T} \tanh (W_{h} h_{t'} + b_{h}) \\ α_{t'} = \frac{\exp (e_{t'})}{\sum_{t'' = t - 2}^{t} \exp (e_{t''})} \end{array}

(3)

where $t'$ indexes a time step within the sliding window, $h_{t'}$ is the corresponding LSTM hidden state, and $α_{t'}$ is the normalized attention weight. The final temporal representation is computed as the weighted sum of hidden states:

h_{attn} = \sum_{t' = t - 2}^{t} α_{t'} h_{t'}

(4)
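Equations (3) and (4) can be traced with a small pure-Python sketch. The hidden states, the identity projection `W_h`, and the scoring vector `u` are made-up values; subtracting the maximum score before exponentiating is a standard numerical-stability step that does not change the softmax result.

```python
import math


def attention_pool(hidden_states, W_h, b_h, u):
    """Additive attention over a short sequence of hidden states."""
    scores = []
    for h in hidden_states:
        proj = [
            math.tanh(sum(w * hj for w, hj in zip(row, h)) + b)
            for row, b in zip(W_h, b_h)
        ]
        scores.append(sum(ui * pi for ui, pi in zip(u, proj)))  # e_{t'}
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]          # attention weights
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]                                            # h_attn
    return weights, pooled


# Three yearly hidden states of size 2 (illustrative values).
H = [[0.1, 0.2], [0.4, -0.3], [0.9, 0.5]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
b_h = [0.0, 0.0]
weights, h_attn = attention_pool(H, W_h, b_h, u=[1.0, 1.0])

# With u = 0 all scores are equal, so pooling reduces to a plain average.
uniform_w, uniform_pool = attention_pool(H, W_h, b_h, u=[0.0, 0.0])
```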

For mixed-effects modeling, we construct a hierarchical statistical model comprising both fixed and random effects. The fixed-effects component captures the main effects of tillage–residue management systems and environmental predictors, as well as their interaction effects.

μ_{fixed} = β_{0} + β_{1} \cdot Tillage + \sum_{k = 1}^{p} β_{2k} \cdot Env_{k}

(5)

where $Env_{k}$ denotes the $k$-th environmental factor (e.g., SOC, precipitation, temperature), and $β$ is the corresponding parameter to be estimated. The random-effects part includes year effects and experimental replicate effects, assumed to follow normal distributions:

u_{Year} \sim N (0, {σ^{2}}_{Year}), u_{Plot} \sim N (0, {σ^{2}}_{Plot})

(6)

We also include interaction terms between the tillage–residue management system and environmental factors to quantify system-specific differences in responses to key drivers:

μ_{interaction} = \sum_{j = 1}^{3} \sum_{k = 1}^{p} γ_{j k} \cdot I (Tillage = j) \cdot {Env}_{k}

(7)

where $I (Tillage = j)$ is an indicator function and $γ_{jk}$ is the interaction coefficient.
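The fixed-effect and interaction terms of Equations (5) and (7) can be combined in a short sketch. Treating the tillage term as a per-system offset (`beta_sys`), the two-variable environment vector, and every coefficient value are assumptions for illustration only.

```python
SYSTEMS = ("RT", "NTS", "PTS")


def mixed_fixed_part(system, env, beta0, beta_sys, beta_env, gamma):
    """Return mu_fixed + mu_interaction for one observation.

    system: one of SYSTEMS; env: list of environmental covariates.
    gamma[system] holds the interaction coefficients for that system, so the
    indicator I(Tillage = j) simply selects one row of coefficients.
    """
    mu_fixed = beta0 + beta_sys[system] + sum(b * e for b, e in zip(beta_env, env))
    mu_interaction = sum(g * e for g, e in zip(gamma[system], env))
    return mu_fixed + mu_interaction


# Made-up coefficients on an arbitrary scale.
beta0 = 10.0
beta_sys = {"RT": 0.0, "NTS": 1.5, "PTS": 0.5}
beta_env = [2.0, -1.0]
gamma = {"RT": [0.0, 0.0], "NTS": [0.5, 0.25], "PTS": [-0.5, 0.0]}

env = [1.0, 2.0]
predictions = {
    s: mixed_fixed_part(s, env, beta0, beta_sys, beta_env, gamma) for s in SYSTEMS
}
```

Here RT acts as the reference system with zero offsets. Fixing one system as a baseline is a common way to keep the main and interaction coefficients identifiable, although the source does not say which constraint was used.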

Finally, we fuse the deep temporal prediction and the mixed-effects prediction using a learnable mixing weight:

\hat{Y} = α \hat{y}_{DL} + (1 - α) \hat{y}_{ME}

(8)

where $α$ is constrained to (0,1) via a sigmoid mapping to ensure it acts as a valid blending weight (i.e., $α = σ (w_{α}^{T} z + b_{α})$).
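A minimal sketch of the blending step in Equation (8) follows. The source does not define the vector $z$ that feeds the gate, so the gate input, the weights, and the two component predictions below are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fuse(y_dl, y_me, z, w_alpha, b_alpha):
    """Blend the deep-learning and mixed-effects predictions.

    alpha = sigmoid(w_alpha . z + b_alpha) always lies in (0, 1), so the
    fused value stays between the two component predictions.
    """
    alpha = sigmoid(sum(w * zi for w, zi in zip(w_alpha, z)) + b_alpha)
    return alpha * y_dl + (1.0 - alpha) * y_me, alpha


# Zero gate weights give alpha = 0.5, i.e. a plain average.
y_avg, alpha_avg = fuse(11000.0, 10000.0, z=[1.0, 0.0], w_alpha=[0.0, 0.0], b_alpha=0.0)

# A large positive bias pushes the gate toward the deep-learning branch.
y_dl_heavy, alpha_heavy = fuse(11000.0, 10000.0, z=[1.0, 0.0], w_alpha=[0.0, 0.0], b_alpha=4.0)
```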

The training objective is to minimize prediction error while constraining the distribution of random effects. Therefore, the loss function consists of two components:

\mathcal{L} = \frac{1}{N} \sum_{n = 1}^{N} (y_{n} - \hat{y}_{n})^{2} + λ_{1} \sum_{k} \frac{u_{year, k}^{2}}{σ_{y}^{2}} + λ_{2} \sum_{j} \frac{u_{plot, j}^{2}}{σ_{p}^{2}}

(9)

During model training, the random effects $u$ are not estimated as explicit parameters. Instead, we adopt an embedding-with-regularization approximation: identifiers for year and plot are encoded as learnable embedding vectors, and an L2 penalty is applied to the embedding weights. This formulation is conceptually equivalent to assuming zero-mean Gaussian random effects, whose variance is implicitly controlled by the regularization strength $λ$. To select $λ$, we performed a grid search over $\{0.0001, 0.005, 0.001, 0.01, 0.05, 0.1, 0.5, 1.0\}$ using 5-fold cross-validation, with the validation negative log-likelihood (NLL) as the selection criterion. The optimal value was $λ = 0.01$. By tuning the shrinkage imposed on the random-effect embeddings, this strategy mitigates overfitting while preserving between-group heterogeneity, thereby improving generalization and maintaining coherent interpretability. Accordingly, the final loss function is simplified as:

\mathcal{L} = MSE (y, \hat{y}) + λ_{year} ‖ E_{year} ‖_{2}^{2} + λ_{plot} ‖ E_{plot} ‖_{2}^{2}

(10)

where $E_{year}$ and $E_{plot}$ are the embedding matrices for year and plot, respectively, and $λ$ is a hyperparameter controlling the variance of random effects, determined via cross-validation.
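The simplified objective in Equation (10) amounts to a mean squared error plus squared L2 norms of the two embedding tables. The sketch below uses the reported λ = 0.01 for both penalties, which is an assumption (the source reports a single optimal value), and the embedding shapes and numbers are made up.

```python
def me_lstm_loss(y_true, y_pred, E_year, E_plot, lam_year=0.01, lam_plot=0.01):
    """MSE plus L2 shrinkage on the year and plot embedding matrices."""
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

    def squared_l2(matrix):
        # Sum of squared entries over the whole embedding table.
        return sum(v * v for row in matrix for v in row)

    return mse + lam_year * squared_l2(E_year) + lam_plot * squared_l2(E_plot)


# Tiny worked example: MSE = 2.0, ||E_year||^2 = 5.0, ||E_plot||^2 = 9.0.
loss = me_lstm_loss(
    y_true=[1.0, 2.0],
    y_pred=[1.0, 4.0],
    E_year=[[1.0, 2.0]],
    E_plot=[[3.0]],
)
```

Larger λ values shrink the embeddings toward zero, which corresponds to assuming a smaller random-effect variance.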

Model training uses the Adam optimizer with an initial learning rate of 0.001, decayed to 90% of its previous value every 50 epochs. To mitigate overfitting, dropout is applied before the fully connected layer, and an additional L2 regularization term is included in the loss function. With this fused architecture, the model can capture complex spatiotemporal nonlinear relationships while retaining statistical interpretability for fixed and random effects, providing a modeling framework that balances predictive accuracy and mechanistic insight for dynamic, adaptive recommendations.
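The step decay described above (multiply the learning rate by 0.9 every 50 epochs) can be written as a one-line schedule. Counting epochs from zero is an assumption; if the model were implemented in PyTorch, which the source does not state, the same behaviour would correspond to a `StepLR` scheduler with `step_size=50` and `gamma=0.9`.

```python
def learning_rate(epoch, base_lr=0.001, decay=0.9, step=50):
    """Step-decayed learning rate: base_lr * decay ** (epoch // step)."""
    return base_lr * decay ** (epoch // step)


# Learning rates at a few checkpoints (epochs counted from 0).
schedule = {epoch: learning_rate(epoch) for epoch in (0, 49, 50, 100, 150)}
```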

2.4. Model Evaluation and Validation

Model performance was evaluated using a strictly time-ordered split to prevent temporal leakage, as described in Section 2.1. After all three-year windows were constructed, samples were sorted by the window-ending year (i.e., the year associated with the target yield) and then split into training/validation/test subsets with an 8:1:1 ratio. The training subset was used for model fitting, the validation subset for hyperparameter tuning and early stopping, and the test subset was held out and used only once for final performance reporting.

Performance was assessed using R², RMSE, MAE, MAPE, and NSE. To evaluate robustness across management scenarios, metrics were also reported separately for RT, NTS, and PTS. We benchmarked ME-LSTM against four temporal baselines (LSTM, GRU, BiGRU, and Transformer) using the same split, identical input features, and a consistent training protocol. Hyperparameters for all models were tuned using 5-fold cross-validation within the training subset, and the selected configurations were evaluated on the independent test subset.
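For reference, the five metrics can be computed as in the sketch below. The text does not give the formulas, so two conventions are assumed here: R² is taken as the squared Pearson correlation between observed and predicted values, and NSE as one minus the ratio of residual to total sum of squares. If R² is instead defined as the coefficient of determination, it coincides with NSE. The function name `regression_metrics` is hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R2, RMSE, MAE, MAPE (%) and NSE for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_obs = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((t - mean_obs) ** 2 for t in y_true)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((t - mean_obs) * (p - mean_pred) for t, p in zip(y_true, y_pred))

    return {
        "R2": cov ** 2 / (ss_tot * ss_pred),  # assumed: squared Pearson correlation
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "NSE": 1.0 - ss_res / ss_tot,
    }
```

Reporting the same function's output separately for the RT, NTS, and PTS subsets gives the per-system metrics mentioned above.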

For climate-stratified comparisons, growing-season precipitation was grouped into dry (<300 mm), normal (300–600 mm), and wet (>600 mm). Linear mixed-effects models were fitted with management system, precipitation group, and their interaction as fixed effects, with random intercepts for study/source and site (and for year and plot when available). Estimated marginal means and Tukey-adjusted post hoc comparisons were used to infer between-system differences (p < 0.05).
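The precipitation grouping can be expressed as a small classification function. The sketch assumes that the boundary values 300 mm and 600 mm fall in the normal group, as implied by the stated ranges; the function name `precipitation_group` is hypothetical.

```python
def precipitation_group(growing_season_precip_mm: float) -> str:
    """Classify growing-season precipitation into dry, normal, or wet."""
    if growing_season_precip_mm < 0:
        raise ValueError("precipitation must be non-negative")
    if growing_season_precip_mm < 300:
        return "dry"
    if growing_season_precip_mm <= 600:
        return "normal"
    return "wet"


if __name__ == "__main__":
    for precip in (250.0, 300.0, 455.5, 600.0, 720.0):
        print(precip, precipitation_group(precip))
```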

2.5. Visual Decision Support Prototype

Based on the PyQt5 framework (version 5.15.9; https://pypi.org/project/PyQt5/, accessed on 22 February 2026), we developed an interactive visual decision support prototype (Figure 3) that integrates the trained model for real-time inference. Users input historical soil/climate information, and the system outputs predicted yields under RT, NTS, and PTS, along with a scenario-based recommendation that selects the management option with the highest predicted yield under the provided conditions.
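The recommendation rule stated above (select the option with the highest predicted yield) reduces to an argmax over the three predictions. The sketch below is illustrative only: the function name `recommend_system` is hypothetical, and the numbers in the usage example are the long-term mean yields reported in Section 3.1, used as placeholders rather than model outputs.

```python
from typing import Mapping


def recommend_system(predicted_yields: Mapping[str, float]) -> str:
    """Return the management system with the highest predicted yield."""
    if not predicted_yields:
        raise ValueError("at least one prediction is required")
    # On ties, the first system in iteration order is returned.
    return max(predicted_yields, key=lambda name: predicted_yields[name])


if __name__ == "__main__":
    placeholder = {"RT": 10595.1, "NTS": 11788.1, "PTS": 10558.7}  # kg/ha
    print(recommend_system(placeholder))
```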

3. Results and Analysis

3.1. Yield Performance Under Different Tillage–Residue Management Systems

Yield differences among tillage–residue management systems are shown in Figure 4. Results from the long-term experiment indicate significant yield differences among the three systems (linear mixed-effects model with management system as a fixed effect and year and plot as random effects; likelihood ratio test comparing nested models with and without the system term, p < 0.001). Among the systems, NTS achieved the highest mean yield (11,788.1 kg ha⁻¹), significantly exceeding PTS (10,558.7 kg ha⁻¹) and RT (10,595.1 kg ha⁻¹), with increases of 11.6% and 11.3%, respectively. Yield stability also differed across systems: NTS exhibited the lowest interannual coefficient of variation (CV = 9.9%), indicating smaller year-to-year fluctuations and greater stability, whereas PTS showed the highest CV (14.6%), suggesting higher sensitivity to interannual climatic variability. These differences may be attributable to system-specific effects on soil water retention and the stability of nutrient supply.
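The relative yield gains and the interannual coefficient of variation can be reproduced with the short sketch below. The percentage check uses the mean yields reported above; the CV function assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and both function names are hypothetical.

```python
import statistics
from typing import Sequence


def percent_increase(new_value: float, baseline: float) -> float:
    """Relative increase of new_value over baseline, in percent."""
    return 100.0 * (new_value - baseline) / baseline


def coefficient_of_variation(annual_yields: Sequence[float]) -> float:
    """Interannual CV in percent (sample standard deviation over the mean)."""
    return 100.0 * statistics.stdev(annual_yields) / statistics.mean(annual_yields)


if __name__ == "__main__":
    nts, pts, rt = 11788.1, 10558.7, 10595.1  # mean yields, kg/ha
    print(round(percent_increase(nts, pts), 1))
    print(round(percent_increase(nts, rt), 1))
```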

Temporal dynamics are presented in Figure 5. Although maize yields increased over time under all three tillage–residue management systems, their growth patterns differed. Yield under NTS increased relatively steadily, rising from 11,547.6 kg ha⁻¹ in 2017 to 14,200.0 kg ha⁻¹ in 2025, with an average annual growth rate of approximately 2.8%. In contrast, PTS exhibited larger year-to-year fluctuations, with pronounced declines in 2018 and 2023, indicating weaker resilience under unfavorable climatic conditions.

Overall, these results demonstrate that tillage–residue management systems influence not only the mean yield level of maize but also the stability of yield responses to climatic variability, providing an important basis for subsequent dynamic, adaptive management recommendations.

3.2. Evaluation of Model Predictive Performance

The trained ME-LSTM exhibited strong predictive performance under the 8:1:1 time-ordered split. Figure 6 summarizes the training and validation diagnostics. As shown in Figure 6a, both the training and validation losses decrease steadily with increasing epochs and gradually converge to stable values, indicating effective optimization without evident overfitting. The prediction error distribution on the validation set (Figure 6b) is centered near zero and approximately symmetric, suggesting limited systematic bias; the mean error is −22.6 kg ha⁻¹ and the standard deviation is 308.9 kg ha⁻¹ (the red dashed line denotes zero error). Figure 6c shows the predicted-versus-observed (actual) scatter plot for 50 randomly selected validation samples, where most points lie close to the 1:1 reference line (R² = 0.899). The corresponding residual plot (Figure 6d) shows residuals distributed around zero without a clear systematic trend across the prediction range; most residuals fall within approximately ±600 kg ha⁻¹, indicating an uncertainty level that may be acceptable for field-level decision support depending on the operational risk tolerance.

Final predictive performance was assessed on the independent test set (Table 3). ME-LSTM achieved an R² of 0.8989, an RMSE of 309.83 kg ha⁻¹, an MAE of 245.18 kg ha⁻¹, and a MAPE of 2.14%, indicating strong predictive accuracy on unseen data. Compared with four deep temporal baselines (LSTM, GRU, BiGRU, and Transformer) trained under the same split and with the same input features, ME-LSTM achieved the best overall performance. Among the baselines, LSTM performed best on this dataset (R² = 0.7006; RMSE = 741.36 kg ha⁻¹). Relative to this best-performing baseline, ME-LSTM reduced RMSE by 58.2% and increased R² from 0.7006 to 0.8989; relative to the Transformer baseline (R² = 0.6004; RMSE = 1100.14 kg ha⁻¹), ME-LSTM reduced RMSE by 71.8% and increased R² from 0.6004 to 0.8989, with consistent reductions in MAE and MAPE. Ablation results further show that removing remote-sensing features or ground-based temporal information substantially degrades performance, highlighting the complementary value of multi-source observations for robust yield prediction.
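The reported RMSE reductions follow directly from the RMSE values given above, with the reduction defined as the baseline RMSE minus the ME-LSTM RMSE, divided by the baseline RMSE:

```latex
\frac{741.36 - 309.83}{741.36} = \frac{431.53}{741.36} \approx 0.582 \quad (\text{LSTM baseline, } 58.2\%)
\qquad
\frac{1100.14 - 309.83}{1100.14} = \frac{790.31}{1100.14} \approx 0.718 \quad (\text{Transformer baseline, } 71.8\%)
```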

For further analysis, the point predictions and associated prediction intervals are shown in Figure 7a and b, respectively, providing additional insight into predictive uncertainty. The curves depict the temporal trajectory of predicted yield, together with the corresponding 95% prediction intervals (PIs). These intervals were derived from the empirical quantiles of an ensemble of ME-LSTM models trained with different seeds and bootstrap resamples (see the uncertainty quantification procedure described above). In years with relatively stable climatic conditions, the PI bands are narrower, indicating lower predictive uncertainty; in years with larger climatic fluctuations, the PI bands widen accordingly, suggesting increased uncertainty captured by the model. This uncertainty information is valuable for agricultural risk management, as decision-makers can use both point forecasts and plausible ranges of yield outcomes to develop more resilient management strategies. The PI trends are consistent with the point predictions, and most observed yields fall within the predicted intervals, supporting the reliability of the model for temporal extrapolation and uncertainty characterization.
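An interval built from the empirical quantiles of an ensemble can be sketched as follows. The sketch assumes linear interpolation between order statistics and a symmetric 2.5%/97.5% cut for the 95% level; the details of the authors' ensemble procedure are not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def empirical_quantile(values: Sequence[float], q: float) -> float:
    """Quantile of a sample using linear interpolation between order statistics."""
    if not values or not 0.0 <= q <= 1.0:
        raise ValueError("need a non-empty sample and 0 <= q <= 1")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(math.floor(position))
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1.0 - fraction) + ordered[upper] * fraction


def prediction_interval(
    ensemble_predictions: Sequence[float], level: float = 0.95
) -> Tuple[float, float]:
    """Lower and upper bounds from the ensemble members' predictions for one sample."""
    tail = (1.0 - level) / 2.0
    return (
        empirical_quantile(ensemble_predictions, tail),
        empirical_quantile(ensemble_predictions, 1.0 - tail),
    )
```

Applying `prediction_interval` to each year's set of ensemble predictions yields bands of the kind shown in Figure 7b; a wider spread among members produces a wider band.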

3.3. Feature Importance Analysis and Mechanistic Interpretation

To investigate the relative contributions of different environmental factors to maize yield, we quantified feature importance in the proposed model using permutation importance on the held-out test set, measured as the increase in RMSE when each feature was randomly permuted (Figure 8). In Figure 8, features are ordered from highest to lowest importance (top to bottom), and darker green indicates higher importance. The results indicate that the mean soil organic matter, the change in maximum precipitation, the change in soil organic carbon, and the tillage–residue management system were the four most influential features. Among them, mean soil organic matter ranked first, consistent with the fundamental role of soil fertility in yield formation. The change in maximum precipitation ranked second, suggesting that interannual variability in extreme rainfall events exerts a strong influence on maize yield. The change in soil organic carbon ranked third, highlighting the sustained regulatory role of soil carbon dynamics in yield formation. The tillage–residue management system, as a management-related variable, also ranked among the most important features, indicating that even after accounting for other environmental factors, different management systems continue to affect final yield through their distinct agronomic mechanisms.
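Permutation importance measured as an RMSE increase can be sketched in a model-agnostic way. In the sketch below, `predict` stands in for any trained model's prediction function, rows are plain feature lists, and the number of repeats and the seed are illustrative assumptions rather than settings reported in this study.

```python
import math
import random
from typing import Callable, List, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(
    predict: Callable[[List[float]], float],  # hypothetical model interface
    rows: Sequence[Sequence[float]],
    y_true: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean RMSE increase per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(y_true, [predict(list(row)) for row in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            permuted = [
                list(row[:j]) + [column[i]] + list(row[j + 1:])
                for i, row in enumerate(rows)
            ]
            increases.append(rmse(y_true, [predict(r) for r in permuted]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

A feature the model does not use leaves the predictions unchanged when shuffled and therefore receives an importance of zero, while features the model relies on produce a positive RMSE increase.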

Further analysis revealed that both the ranking and the direction of feature effects differed across tillage–residue management systems. Under NTS, features related to soil moisture conservation (e.g., rainfall evenness) exhibited relatively higher importance; whereas under PTS and RT, indicators associated with soil aeration and the availability of readily accessible nutrients played more prominent roles. These differences suggest that the management system does not influence yield in isolation; instead, it modifies the “soil–climate” system relationships and thereby alters the contribution pathways of environmental drivers to yield. The analysis also identified the importance of temporal distribution features, such as the standard deviation of rainfall frequency, implying that the regularity of rainfall timing can exert deeper influences on water-use efficiency and crop physiological processes.

Overall, this feature-importance assessment reveals heterogeneity in yield-formation mechanisms across tillage–residue management systems from the perspective of driving factors. It provides quantitative evidence for understanding management–environment interactions and offers guidance for targeted agronomic regulation strategies.

3.4. Multi-Factor Interactions and Three-Dimensional Relationship Analysis

To further elucidate the synergistic mechanisms through which soil, climate, and management factors jointly affect maize yield, we conducted analyses of multi-factor interactions and three-dimensional relationships. Correlations were quantified using Pearson’s correlation coefficient (r), and significance was assessed using a two-sided t-test. The correlation heatmap (Figure 9a) shows that soil organic carbon was positively correlated with yield (r = 0.470) and also exhibited a moderate positive correlation with precipitation (r = 0.420). This suggests that precipitation relates to crop growth directly and also co-varies with soil carbon status, which may jointly shape yield outcomes; however, correlation alone does not establish causality in this coupled climate–soil–yield system.
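Pearson's r and the t-statistic used for its significance test can be computed as in the sketch below, where t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The sample size behind the reported correlations is not stated here, so no p-value is reproduced; obtaining one requires the t-distribution, which the Python standard library does not provide (for example, `scipy.stats.pearsonr` returns both the coefficient and a p-value). The function names are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def correlation_t_statistic(r: float, n: int) -> float:
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```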

Based on this, the three-dimensional relationship plot (Figure 9b) provides a visual representation of the complex nonlinear interaction patterns among soil organic carbon, precipitation, and yield. In regions with higher soil organic carbon and suitable precipitation, yield reached its peak. By contrast, under low soil organic carbon, yield remained strongly constrained even when precipitation was sufficient. When precipitation was relatively high but soil organic carbon was at a moderate level, yield tended to be comparatively stable. These three-dimensional patterns indicate that soil fertility and water availability do not act independently; rather, they exhibit clear synergistic and compensatory effects. This finding further supports the integrated agronomic principle of “regulating fertility through water and promoting yield through fertility” in cropland management.

In summary, soil, climate, and management factors jointly regulate maize yield through multidimensional interactions, with the synergy between soil organic carbon and water availability playing a particularly critical role. In the long term, no-till with straw mulch demonstrates clear advantages in enhancing system stability and cumulative yield gains. These results provide theoretical support and practical guidance for adaptive selection of tillage–residue management systems and the development of long-term management strategies across different agro-ecological zones.

To provide quantitative evidence for climate-dependent management effects, we further summarized the yield advantage of NTS across growing-season precipitation regimes (Figure 10a; a boxplot of ΔYield, kg ha⁻¹, defined as yield_NTS − yield_PTS and yield_NTS − yield_RT across dry (<300 mm), normal (300–600 mm), and wet (>600 mm) groups; the box shows the interquartile range with the median line, and whiskers denote the data range excluding outliers). Overall, the yield gain of NTS over PTS was more pronounced under dry conditions (<300 mm) and attenuated under wet conditions (>600 mm), whereas NTS maintained a consistent advantage relative to RT across precipitation groups. In addition, SOC accumulation differed among systems (Figure 10b; a boxplot of annual SOC growth rate, %, under RT, PTS, and NTS; asterisks indicate significant differences among systems at p < 0.05), with NTS showing a higher SOC annual growth rate than PTS and RT, supporting its potential contribution to long-term soil fertility building and carbon sequestration.

4. Discussion

4.1. Mechanisms of Yield Formation Under Different Tillage–Residue Management Systems

Our results indicate that NTS provides a clear advantage in both yield level and stability. This pattern is consistent with prior evidence and can be interpreted through complementary physical, chemical, and biological mechanisms within the soil–crop system. Long-term field studies have shown that tillage and straw management can jointly regulate SOC dynamics, yield level, and yield stability, supporting a coupled soil–water–nutrient interpretation of management effects over time [31,32].

From a soil physical perspective, persistent surface residue cover under NTS reduces evaporative losses, moderates near-surface temperature, and improves water infiltration, thereby enhancing water availability to crops. Consistent with the climate-stratified analysis (Figure 10a), the yield advantage of NTS over PTS is more pronounced under drier growing-season precipitation (<300 mm) and diminishes under wetter conditions (>600 mm), highlighting the stronger water-conservation benefit of residue cover under water-stress scenarios. Field evidence also supports the role of conservation tillage in hedging against drought stress and stabilizing maize yields in dryland regions [33].

From a soil chemical and nutrient-cycling perspective, residue retention and slower decomposition can smooth the release of carbon and nitrogen, reduce nutrient leaching, and improve nutrient-use efficiency relative to intensive soil disturbance. In line with the SOC patterns (Figure 10b), NTS promotes SOC accumulation compared with PTS and RT, supporting its potential for long-term carbon sequestration and soil fertility building. This interpretation aligns with synthesis evidence that straw return and reduced disturbance increase SOC stocks across maize systems and that the magnitude of SOC gains depends on management and environmental conditions [34,35].

From a soil biological perspective, surface residue provides a favorable habitat and substrate for soil microorganisms, enhancing microbial diversity and activity. Elevated microbial activity can accelerate residue transformation and nutrient cycling and can also promote aggregate formation via microbially derived binding agents, forming a reinforcing “residue cover–microbes–soil structure–crop growth” feedback loop.

Together, these physical, chemical, and biological pathways provide a coherent mechanistic basis for the observed yield and stability advantages under NTS and align with the larger estimated system effect in our model. More broadly, the observed differences among tillage–residue management systems underscore that management practices influence yield not only through direct effects on the crop environment, but also by reshaping coupled soil–water–nutrient processes over time.

4.2. Climate-Driven Modulation of System Adaptability

A key contribution of this study is to reveal the dynamic modulation of yield effects across tillage–residue management systems by climatic conditions. Traditional studies often focus on average treatment effects, whereas our interaction-term analysis within the mixed-effects model indicates that the optimal system depends strongly on the prevailing meteorological context.

The modulating role of precipitation is particularly evident. In years with relatively uniform precipitation and moderate intensity, yield differences among the three systems are smaller; whereas under more uneven precipitation regimes or drier conditions, NTS exhibits a clearer yield advantage due to the water-retention and buffering effects of surface residue cover. This climate dependence is consistent with evidence that conservation practices interact with precipitation variability to influence soil moisture and maize productivity under contrasting precipitation scenarios, and that yield benefits of reduced disturbance and surface cover tend to be more pronounced under water-stress conditions. In addition, long-term analyses under variable weather show that no-till can enhance yield stability and reduce interannual yield variability when soil moisture is limiting, reinforcing the interpretation that water availability mediates management benefits. Moreover, quantitative syntheses demonstrate that mulching effects on yield are environment-dependent (e.g., varying with water input and temperature), supporting the need for scenario-based rather than “one-size-fits-all” recommendations [36,37,38].

Temperature also plays an important regulatory role. Under hot years, residue cover can mitigate heat stress by moderating near-surface temperature and reducing evaporative losses; whereas in cool years or when early-season temperatures are low, soil disturbance may accelerate soil warming and potentially favor crop establishment. Large-scale evidence indicates that crop and environmental factors—including precipitation regime, soil texture, and thermal conditions—systematically influence yield responses under no-till relative to tillage, highlighting that climate–soil context governs whether a management option is advantageous [39].

Overall, climatic conditions not only directly affect crop growth but also dynamically regulate the realized effects of management practices by altering response pathways within the “management system–soil–crop” continuum. This finding challenges the conventional recommendation logic based on “static superiority” and highlights the need to develop climate-smart, adaptive management strategies under climate change. The dynamic prediction and recommendation framework proposed in this study provides methodological and practical support for implementing such strategies.

4.3. Experimental Validation (AS-IS vs. To-BE) and Predictive Uncertainty Analysis

The proposed ME-LSTM integrated model achieves a principled integration of deep learning and classical statistical modeling, offering advantages over conventional approaches in multiple respects. In particular, compared with Transformer-based sequence models that typically require large-scale data to reliably learn attention patterns and avoid overfitting, ME-LSTM is better suited to agricultural datasets that are often small-sample, noisy, and hierarchically structured (site/plot × year). In such settings, Transformers may over-parameterize the temporal modeling problem and become sensitive to site-specific artifacts or sampling imbalance, which can degrade generalization across locations and years.

Beyond demonstrating system functionality, we further provide performance-based evidence for management-system improvement in an AS-IS versus To-BE manner. Here, conventional systems (RT and PTS) are treated as the AS-IS baseline, whereas NTS represents the To-BE option evaluated under the proposed scenario-aware framework. Using the pooled multi-ecological dataset, NTS outperformed RT and PTS in both yield level and stability: mean yield under NTS was more than 11% higher than under RT and PTS, and NTS exhibited the lowest interannual coefficient of variation, indicating stronger climate adaptability. Importantly, the advantage of NTS was climate-dependent. As shown in the precipitation-stratified comparison (Figure 10a), the yield gain of NTS relative to PTS was most pronounced under dry conditions and attenuated under wet conditions, whereas NTS maintained a consistent advantage relative to RT across precipitation groups. In addition, SOC accumulation differed among systems (Figure 10b), with NTS showing a higher annual SOC growth rate than PTS and RT, supporting its potential contribution to long-term soil fertility building and carbon sequestration. Together, these results demonstrate that the proposed framework moves beyond static prediction to quantitative, scenario-based management evaluation and recommendation.

The model innovations are mainly reflected in three aspects. First, the sliding-window design combined with the LSTM enables effective learning of temporal dependence and long-memory effects in yield formation, overcoming the limitation of traditional cross-sectional models that ignore cumulative impacts of historical management. This design also reduces the effective sequence length and stabilizes learning under limited samples, mitigating the data-hungry behavior often observed in attention-dominant architectures. Second, introducing the mixed-effects component allows the model to distinguish management fixed effects from hierarchical random effects associated with study/source, year, and plot, thereby improving interpretability and generalization. By explicitly accounting for hierarchical heterogeneity, ME-LSTM reduces the risk that a purely data-driven sequence model (including Transformer) confounds management effects with unobserved site/year variability. Third, the model explicitly captures interactions between management systems and climatic/soil factors, enabling not only yield prediction but also evaluation of the relative advantages of different management options under specific environmental conditions—thus achieving a functional shift from “prediction” to “recommendation.” In addition, the attention mechanism and L2-regularized embedding layers further enhance interpretability and training stability.

In terms of predictive performance, the model achieved an R² of 0.8989 and an RMSE of 309.83 kg ha⁻¹ on the test set, indicating strong predictive accuracy. Nevertheless, a degree of uncertainty remains. Error analysis shows that the mean width of the 95% prediction interval is approximately 600 kg ha⁻¹, implying that predictions can still vary in regions with strong climatic fluctuations or high soil heterogeneity. This uncertainty may arise from (1) uncertainty in climate inputs, particularly the difficulty of fully capturing extreme weather events; (2) incomplete quantification of spatial soil variability; and (3) stochasticity in crop physiological processes and biotic stresses such as pests and diseases. In addition, remote sensing observations may introduce noise due to cloud cover and sensor limitations.

To reduce predictive uncertainty and enhance practical utility, future work could: (1) integrate additional environmental data (e.g., microbial activity, solar radiation, wind speed) to better characterize system states; (2) adopt ensemble learning or Bayesian deep learning frameworks to quantify uncertainty more directly; and (3) incorporate crop-growth process modules to develop hybrid “process–data” models that improve extrapolation under extreme scenarios. Overall, the proposed model achieves a favorable balance between predictive accuracy and mechanistic interpretability, providing a reliable and actionable tool for dynamic agricultural management decisions.

4.4. Limitations and Future Directions

Although the proposed ME-LSTM demonstrates promising predictive accuracy and interpretability, several limitations remain, which also indicate directions for future improvement.

At the data level, although this study integrates long-term multi-site experiments across major agro-ecological zones, the temporal span is still relatively limited (9 years), which may be insufficient to fully capture slow processes such as soil carbon saturation and long-term climate trends with nonlinear effects. In addition, the dataset is mainly concentrated in major maize-producing regions; representation of marginal production areas or special soil types (e.g., saline–alkali soils and sandy soils) remains limited. Future work could expand observation networks, extend experimental duration, and integrate public agricultural databases to build more comprehensive, longer-term multisource datasets.

At the modeling and methodological level, while remote sensing and ground-based time-series data were incorporated, key physiological processes such as canopy fluorescence, canopy temperature, and root dynamics are still not adequately represented. Moreover, the current model focuses on yield as the primary target and does not systematically incorporate environmental outcomes (e.g., carbon footprint, nitrate leaching) or economic cost assessments. Future studies could integrate emerging monitoring data from UAVs and proximal sensors and develop multi-objective optimization frameworks to jointly consider yield, environmental sustainability, and economic benefits.

Future research may focus on: (i) developing region-scale adaptive prediction platforms that integrate dynamic recommendations for multiple crops and multiple tillage–residue management systems; (ii) advancing interpretable “intelligent agriculture models” by coupling process-based crop models with machine learning to improve mechanistic grounding and extrapolation; and (iii) promoting a closed-loop paradigm of “data–model–decision–feedback” to enable truly digital and intelligent transformation of agricultural management. By addressing these limitations, the proposed methodology is expected to provide stronger scientific evidence and tool support for climate-change adaptation and agricultural sustainability.

5. Conclusions

By developing a dynamic forecasting framework that couples multi-year temporal reconstruction with a mixed-effects spatiotemporal sequence model (ME-LSTM), this study enables yield prediction and management system evaluation under hierarchical heterogeneity across sites and years. The framework explicitly separates management fixed effects from hierarchical random effects and parameterizes management × climate/soil interactions, thereby supporting scenario-based comparison and recommendation of tillage–residue management options. Using multi-ecological maize datasets, NTS outperformed PTS and RT in both yield level and stability: mean yield under NTS was more than 11% higher than under PTS and RT, and NTS exhibited the lowest interannual coefficient of variation, indicating stronger climate adaptability. On the test set, the proposed model achieved an R² of 0.8989 and an RMSE of 309.83 kg ha⁻¹, demonstrating high predictive accuracy and robustness. Feature-importance analysis identified SOC dynamics, yield legacy effects, and rainfall heterogeneity as key drivers of yield variability. However, the value of the proposed approach does not lie merely in restating that yield is jointly influenced by management and environment, which is well known. Rather, this study is motivated by a practical challenge under climate change: shifting precipitation regimes and more frequent extremes can change the relative advantage of tillage–residue options, so a system that performs well historically may not remain optimal across future conditions.
Beyond these general agronomic insights, the core originality of this work is the end-to-end integration of a hierarchical mixed-effects structure into a temporal deep learning model, which is necessary for multi-source data fusion because it reduces confounding between management effects and site/year/source heterogeneity (i.e., it distinguishes true management impacts from unobserved location- and year-specific variability when pooling experimental and literature datasets) and enables interpretable, scenario-specific management recommendation from multi-source data. Overall, the proposed framework moves beyond static yield prediction toward scenario-aware management evaluation, providing an actionable tool for climate-adaptive, risk-aware decision support.

Author Contributions

Z.Z.: Writing—original draft, Methodology, Software, Formal analysis, Data curation, Conceptualization, Funding acquisition. M.G.: Formal analysis, Software, Validation. N.L.: Data curation, Validation. J.D.: Data curation, Resources. Y.L.: Visualization, Software. Z.H.: Supervision, Funding acquisition. X.Y.: Investigation, Data curation. Z.D.: Writing—review and editing, Supervision, Project administration, Funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program of China (2022YFD1500700); the Sub-Project of the National Key Research and Development Program of China (2022YFD1500703-02); and Sci-Tech Innovation Special Program of the Liaoning Academy of Agricultural Science (2026QN2106 and 2026JC4010).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Oikonomidis, A.; Catal, C.; Kassahun, A. Deep Learning for Crop Yield Prediction: A Systematic Literature Review. N. Z. J. Crop Hortic. Sci. 2023, 51, 1–26. [Google Scholar] [CrossRef]
Peng, D.; Cheng, E.; Feng, X.; Hu, J.; Lou, Z.; Zhang, H.; Zhao, B.; Lv, Y.; Peng, H.; Zhang, B. A Deep–Learning Network for Wheat Yield Prediction Combining Weather Forecasts and Remote Sensing Data. Remote Sens. 2024, 16, 3613. [Google Scholar] [CrossRef]
Schmitt, J.; Offermann, F.; Söder, M.; Frühauf, C.; Finger, R. Extreme Weather Events Cause Significant Crop Yield Losses at the Farm Level in German Agriculture. Food Policy 2022, 112, 102359. [Google Scholar] [CrossRef]
Islam, M.U.; Jiang, F.; Guo, Z.; Liu, S.; Peng, X. Impacts of Straw Return Coupled with Tillage Practices on Soil Organic Carbon Stock in Upland Wheat and Maize Croplands in China: A Meta-Analysis. Soil Tillage Res. 2023, 232, 105786.
Khaki, S.; Wang, L. Crop Yield Prediction Using Deep Neural Networks. Front. Plant Sci. 2019, 10, 621.
Torgbor, B.A.; Rahman, M.M.; Brinkhoff, J.; Sinha, P.; Robson, A. Integrating Remote Sensing and Weather Variables for Mango Yield Prediction Using a Machine Learning Approach. Remote Sens. 2023, 15, 3075.
Conradt, T. Choosing Multiple Linear Regressions for Weather-Based Crop Yield Prediction with ABSOLUT v1.2 Applied to the Districts of Germany. Int. J. Biometeorol. 2022, 66, 2287–2300.
Lobell, D.B.; Burke, M.B. On the Use of Statistical Models to Predict Crop Yield Responses to Climate Change. Agric. For. Meteorol. 2010, 150, 1443–1452.
Chen, Y.; Zhang, Z.; Tao, F. Improving Regional Winter Wheat Yield Estimation through Assimilation of Phenology and Leaf Area Index from Remote Sensing Data. Eur. J. Agron. 2018, 101, 163–173.
Chandrasiri, C.K.; Tsusaka, T.W.; Ho, T.D.N.; Zulfiqar, F.; Datta, A. Impacts of Climate Change on Paddy Yields in Different Climatic Zones of Sri Lanka: A Panel Data Approach. Asia-Pac. J. Reg. Sci. 2023, 7, 455–489.
Shook, J.; Gangopadhyay, T.; Wu, L.; Ganapathysubramanian, B.; Sarkar, S.; Singh, A.K. Crop Yield Prediction Integrating Genotype and Weather Variables Using Deep Learning. PLoS ONE 2021, 16, e0252402.
Zhuo, W.; Fang, S.; Gao, X.; Wang, L.; Wu, D.; Fu, S.; Wu, Q.; Huang, J. Crop Yield Prediction Using MODIS LAI, TIGGE Weather Forecasts and WOFOST Model: A Case Study for Winter Wheat in Hebei, China during 2009–2013. Int. J. Appl. Earth Obs. Geoinf. 2022, 106, 102668.
Dhakar, R.; Sehgal, V.K.; Chakraborty, D.; Sahoo, R.N.; Mukherjee, J.; Ines, A.V.M.; Kumar, S.N.; Shirsath, P.B.; Roy, S.B. Field Scale Spatial Wheat Yield Forecasting System under Limited Field Data Availability by Integrating Crop Simulation Model with Weather Forecast and Satellite Remote Sensing. Agric. Syst. 2022, 195, 103299.
Khan, S.N.; Li, D.; Maimaitijiang, M. Using Gross Primary Production Data and Deep Transfer Learning for Crop Yield Prediction in the US Corn Belt. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103965.
Joshi, A.; Pradhan, B.; Chakraborty, S.; Varatharajoo, R.; Gite, S.; Alamri, A. Deep-Transfer-Learning Strategies for Crop Yield Prediction Using Climate Records and Satellite Image Time-Series Data. Remote Sens. 2024, 16, 4804.
Huber, F.; Inderka, A.; Steinhage, V. Leveraging Remote Sensing Data for Yield Prediction with Deep Transfer Learning. Sensors 2024, 24, 770.
Jácome Galarza, L.; Realpe, M.; Viñán-Ludeña, M.S.; Calderón, M.F.; Jaramillo, S. AgriTransformer: A Transformer-Based Model with Attention Mechanisms for Enhanced Multimodal Crop Yield Prediction. Electronics 2025, 14, 2466.
Yan, Y.; Li, H.; Zhang, M.; Liu, X.; Zhang, L.; Wang, Y.; Yang, M.; Cai, R. Straw Return or No Tillage? Comprehensive Meta-Analysis Based on Soil Organic Carbon Contents, Carbon Emissions, and Crop Yields in China. Agronomy 2024, 14, 2263.
Filippi, P.; Jones, E.J.; Wimalathunge, N.S.; Somarathna, P.D.S.N.; Pozza, L.E.; Ugbaje, S.U.; Jephcott, T.G.; Paterson, S.E.; Whelan, B.M.; Bishop, T.F.A. An Approach to Forecast Grain Crop Yield Using Multi-Layered, Multi-Farm Data Sets and Machine Learning. Precis. Agric. 2019, 20, 1015–1029.
Zuur, A.F.; Ieno, E.N.; Walker, N.; Saveliev, A.A.; Smith, G.M. Mixed Effects Models and Extensions in Ecology with R; Statistics for Biology and Health; Springer: New York, NY, USA, 2009; ISBN 978-0-387-87457-9.
Harrison, X.A.; Donaldson, L.; Correa-Cano, M.E.; Evans, J.; Fisher, D.N.; Goodwin, C.E.D.; Robinson, B.S.; Hodgson, D.J.; Inger, R. A Brief Introduction to Mixed Effects Modelling and Multi-Model Inference in Ecology. PeerJ 2018, 6, e4794.
Lu, X. A Meta-Analysis of the Effects of Crop Residue Return on Crop Yields and Water Use Efficiency. PLoS ONE 2020, 15, e0231740.
Qin, W.; Niu, L.; You, Y.; Cui, S.; Chen, C.; Li, Z. Effects of Conservation Tillage and Straw Mulching on Crop Yield, Water Use Efficiency, Carbon Sequestration and Economic Benefits in the Loess Plateau Region of China: A Meta-Analysis. Soil Tillage Res. 2024, 238, 106025.
Oberpriller, J.; De Souza Leite, M.; Pichler, M. Fixed or Random? On the Reliability of Mixed-effects Models for a Small Number of Levels in Grouping Variables. Ecol. Evol. 2022, 12, e9062.
Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 8, 127–150.
Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A Modified Soil Adjusted Vegetation Index. Remote Sens. Environ. 1994, 48, 119–126.
Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Bahdanau, D.; Cho, K.; Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv 2014, arXiv:1409.0473.
Williams, A.; Jordan, N.R.; Smith, R.G.; Hunter, M.C.; Kammerer, M.; Kane, D.A.; Koide, R.T.; Davis, A.S. A Regionally-Adapted Implementation of Conservation Agriculture Delivers Rapid Improvements to Soil Properties Associated with Crop Yield Stability. Sci. Rep. 2018, 8, 8467.
Xu, J.; Han, H.; Ning, T.; Li, Z.; Lal, R. Long-Term Effects of Tillage and Straw Management on Soil Organic Carbon, Crop Yield, and Yield Stability in a Wheat-Maize System. Field Crops Res. 2019, 233, 33–40.
Deng, Z.; Huang, M.; Zhang, W.; Wang, G.; Huang, X.; Liang, G.; Li, N. Effects of Five Years Conservation Tillage for Hedging against Drought, Stabilizing Maize Yield, and Improving Soil Environment in the Drylands of Northern China. PLoS ONE 2023, 18, e0282359.
Xin, J.; Yan, L.; Cai, H. Response of Soil Organic Carbon to Straw Return in Farmland Soil in China: A Meta-Analysis. J. Environ. Manag. 2024, 359, 121051.
Li, B.; Liang, F.; Wang, Y.; Cao, W.; Song, H.; Chen, J.; Guo, J. Magnitude and Efficiency of Straw Return in Building up Soil Organic Carbon: A Global Synthesis Integrating the Impacts of Agricultural Managements and Environmental Conditions. Sci. Total Environ. 2023, 875, 162670.
Niu, L.; Qin, W.; You, Y.; Mo, Q.; Pan, J.; Tian, L.; Xu, G.; Chen, C.; Li, Z. Effects of Precipitation Variability and Conservation Tillage on Soil Moisture, Yield and Quality of Silage Maize. Front. Sustain. Food Syst. 2023, 7, 1198649.
Mathers, C.; Heitman, J.; Huseth, A.; Locke, A.; Osmond, D.; Woodley, A. No-till Imparts Yield Stability and Greater Cumulative Yield under Variable Weather Conditions in the Southeastern USA Piedmont. Field Crops Res. 2023, 292, 108811.
Qin, W.; Hu, C.; Oenema, O. Soil Mulching Significantly Enhances Yields and Water and Nitrogen Use Efficiencies of Maize and Wheat: A Meta-Analysis. Sci. Rep. 2015, 5, 16210.
Toliver, D.K.; Larson, J.A.; Roberts, R.K.; English, B.C.; De La Torre Ugarte, D.G.; West, T.O. Effects of No-Till on Yields as Influenced by Crop and Environmental Factors. Agron. J. 2012, 104, 530–541.

Figure 1. Schematic map of the experimental sites.

Figure 2. Model architecture.

Figure 3. Screenshot of the interactive dashboard. (a) Main interface; (b) Prediction interface.

Figure 4. Histogram and boxplot of yield distribution. (a) Yield histograms across years; (b) Yield boxplots across management systems.

Figure 5. Temporal trends in maize yield.

Figure 6. Training and validation diagnostics of ME-LSTM. (a) Training and validation loss curves across epochs. (b) Distribution of prediction errors on the validation set (normal fit shown; the red dashed line denotes zero error). (c) Predicted versus observed (actual) yield for 50 randomly selected validation samples (red dashed line: 1:1 reference). (d) Residuals versus predicted yield for the same samples (red dashed line: zero residual).

Figure 7. Prediction results: (a) yield prediction; (b) prediction interval curves.

Figure 8. Ranked permutation feature importance for maize yield prediction. Features are ordered from highest to lowest importance; darker green indicates higher importance. _mean/_std/_diff denote three-year mean/SD/change; SOM = soil organic matter, SOC = soil organic carbon.

Figure 9. (a) Correlation heatmap of key features. (b) 3D scatter of SOC, average rainfall, and maize yield, colored by management system (RT, PTS, NTS).

Figure 10. (a) ΔYield of NTS relative to PTS and RT across precipitation groups. (b) Annual SOC growth rate under RT, PTS, and NTS. Asterisks indicate statistically significant differences among management systems (Tukey-adjusted post hoc tests, p < 0.05).

Table 1. Spectral vegetation indices and formulas (computed from Landsat surface reflectance).

Spectral Index	Formula (Using NIR, RED, and BLUE Reflectance)	References
Normalized Difference Vegetation Index (NDVI)	(NIR − RED)/(NIR + RED)	[25]
Soil-Adjusted Vegetation Index (SAVI)	(1 + L) × (NIR − RED)/(NIR + RED + L), with L = 0.5	[26]
Modified Soil-Adjusted Vegetation Index (MSAVI)	[2 × NIR + 1 − √((2 × NIR + 1)² − 8 × (NIR − RED))]/2	[27]
Enhanced Vegetation Index (EVI)	2.5 × (NIR − RED)/(NIR + 6 × RED − 7.5 × BLUE + 1)	[28]

Note: For Landsat 5 TM/Landsat 7 ETM+, NIR = Band 4, RED = Band 3, BLUE = Band 1; for Landsat 8 OLI, NIR = Band 5, RED = Band 4, BLUE = Band 2.
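
As an illustration of how the Table 1 formulas and the band mapping in the note fit together, the following minimal Python sketch computes the four indices for a single pixel. It is not the authors' processing code: the names `BAND_NUMBERS`, `select_bands`, and `vegetation_indices` are hypothetical, inputs are assumed to be surface reflectance already scaled to 0–1, and no masking of clouds or invalid pixels is attempted.

```python
import math

# Band numbers per sensor, taken from the note to Table 1.
BAND_NUMBERS = {
    "Landsat 5 TM": {"NIR": 4, "RED": 3, "BLUE": 1},
    "Landsat 7 ETM+": {"NIR": 4, "RED": 3, "BLUE": 1},
    "Landsat 8 OLI": {"NIR": 5, "RED": 4, "BLUE": 2},
}


def select_bands(sensor, reflectance_by_band):
    """Pick (NIR, RED, BLUE) from a {band_number: reflectance} dict (hypothetical helper)."""
    bands = BAND_NUMBERS[sensor]
    return tuple(reflectance_by_band[bands[name]] for name in ("NIR", "RED", "BLUE"))


def vegetation_indices(nir, red, blue, L=0.5):
    """Evaluate the Table 1 formulas for one pixel of 0-1 surface reflectance.

    Assumes nir + red > 0; a zero denominator is not handled here.
    """
    ndvi = (nir - red) / (nir + red)
    savi = (1 + L) * (nir - red) / (nir + red + L)
    # The radicand equals (2*nir - 1)**2 + 8*red, so it is non-negative for red >= 0.
    msavi = (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2
    evi = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
    return {"NDVI": ndvi, "SAVI": savi, "MSAVI": msavi, "EVI": evi}


# Example: an illustrative (made-up) Landsat 8 OLI pixel keyed by band number.
pixel = {2: 0.05, 4: 0.10, 5: 0.50}
nir, red, blue = select_bands("Landsat 8 OLI", pixel)
indices = vegetation_indices(nir, red, blue)
```

Keeping the band lookup separate from the index arithmetic means the same formulas apply unchanged to all three sensors; only the mapping differs.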

Table 2. Descriptive statistics of key variables under different tillage–residue management systems.

Variable	RT	NTS	PTS	Overall
Soil organic carbon (g kg⁻¹)	18.50 ± 0.87 ^c (CV = 4.5%)	26.32 ± 2.54 ^a (CV = 9.1%)	21.04 ± 1.69 ^b (CV = 7.6%)	21.96 ± 3.76 (CV = 16.8%)
Soil organic matter (g kg⁻¹)	31.89 ± 1.51 ^c (CV = 4.5%)	45.38 ± 4.38 ^a (CV = 9.1%)	36.28 ± 2.92 ^b (CV = 7.6%)	37.85 ± 6.48 (CV = 16.8%)
Mean temperature (°C)	-	-	-	19.29 ± 0.19 (CV = 0.9%)
Mean precipitation (mm)	-	-	-	375.02 ± 288.93 (CV = 72.6%)

Note: Values are mean ± SD (CV in parentheses). Different superscript letters within a row indicate significant differences among systems based on the pooled dataset (LMM with Tukey-adjusted post hoc comparisons, p < 0.05). Mean temperature and precipitation statistics were calculated across site–year records and are identical across management systems within the same site–year. They are therefore reported only once in the “Overall” column to avoid duplication and misinterpretation.

Table 3. Model performance comparison on the test set.

Model	R²	RMSE (kg ha⁻¹)	MAE (kg ha⁻¹)	MAPE (%)
ME-LSTM	0.8989	309.83	245.18	2.14
ME-LSTM w/o remote sensing	0.8265	451.76	362.40	3.31
ME-LSTM w/o ground time series	0.7812	521.89	418.55	3.82
LSTM	0.7006	741.36	674.12	5.14
GRU	0.6841	843.54	794.17	6.19
BiGRU	0.6415	941.27	894.36	7.45
Transformer	0.6004	1100.14	1074.82	9.63
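
For readers who want to reproduce the column definitions in Table 3, the sketch below computes R², RMSE, MAE, and MAPE from paired observed and predicted yields. It uses the standard textbook definitions, which are assumed (not confirmed by this section) to match those used for the table; the function name `regression_metrics` and the example yields are hypothetical and are not taken from the study data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, and MAPE (%) using the standard definitions.

    Assumes equal-length, non-empty inputs, non-zero observed values (for MAPE),
    and observed values that are not all identical (for R2).
    """
    n = len(y_true)
    mean_obs = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e ** 2 for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_obs) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Example with made-up yields in kg ha^-1.
observed = [10000.0, 12000.0, 11000.0]
predicted = [10100.0, 11800.0, 11000.0]
metrics = regression_metrics(observed, predicted)
```

RMSE and MAE share the units of yield (kg ha⁻¹), whereas R² and MAPE are unitless, which is why Table 3 reports both kinds of measure side by side.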

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

