A Hybrid Deep Learning Model for Crop Yield Prediction Taking Weather Data Associated with Production Management Phases as Input

Liu, Shu-Chu; Lin, Yan-Jing; Chung, Chih-Hung; Wen, Hsien-Yin

doi:10.3390/su18083806

¹ Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung 912301, Taiwan

² Department of Educational Technology, Tamkang University, New Taipei City 251301, Taiwan

* Authors to whom correspondence should be addressed.

Sustainability 2026, 18(8), 3806; https://doi.org/10.3390/su18083806

Submission received: 5 February 2026 / Revised: 5 April 2026 / Accepted: 8 April 2026 / Published: 11 April 2026

(This article belongs to the Special Issue AI for Sustainable Supply Chain-Driven Business Transformation)

Abstract

Accurate crop yield prediction is fundamental to sustainable agricultural management, enabling optimized resource allocation and informed decision-making. However, a critical gap exists in current prediction models: existing approaches overlook the temporal alignment between meteorological conditions and production management phases—defined as the intervals between consecutive agronomic operations (e.g., sowing, fertilization, thinning). This oversight results in suboptimal predictive performance, as conventional whole-season weather aggregation fails to capture phase-sensitive crop–weather interactions. While machine learning (e.g., XGBoost) and deep learning approaches (e.g., CNN, LSTM) have been applied to yield prediction, these models typically treat weather variables as temporally homogeneous inputs, inadequately modeling the correlation between historical yields and phase-specific meteorological patterns. To address this gap, this study proposes CNN-LSTM-AM, an innovative hybrid deep learning model that integrates convolutional neural networks (CNNs), long short-term memory (LSTM), and attention mechanisms (AMs), utilizing weather data explicitly aligned with production management phases as input. The CNN component extracts cross-phase weather patterns, the LSTM captures sequential dependencies across growth stages, and the attention mechanism dynamically weights phase importance based on meteorological conditions. The proposed model is validated using a real-world case study of Bok choy production from an agricultural cooperative in Yunlin County, Taiwan, encompassing 1714 production cycles over eight years (2011–2019). 
Experimental results show that CNN-LSTM-AM achieves an RMSE of 1448.24 kg/ha, a MAPE of 3.60%, and an R² of 0.98, outperforming five baseline models: CNN (RMSE = 2919.18), LSTM (RMSE = 2529.74), CNN-LSTM (RMSE = 1516.44), LSTM-AM (RMSE = 2284.64), and XGBoost (RMSE = 3452.47). This corresponds to a 58% lower RMSE than XGBoost. Furthermore, prediction accuracy improves progressively as harvest time approaches, and phase-specific weather encoding reduces RMSE by 16.5% compared to whole-season averaging (1448.24 vs. 1734.49). These findings underscore the importance of integrating agronomic domain knowledge into data-driven prediction frameworks.

Keywords:

crop yield prediction model; sustainable agricultural management; machine learning (ML); hybrid deep learning; convolutional neural network (CNN); long short-term memory (LSTM); attention mechanism (AM); weather data associated with production management phases

1. Introduction

Accurate crop yield prediction is a critical challenge for precision agriculture [1,2]. Historically, yield prediction has depended on empirical information and experience, with farmers and agronomists making judgments based on historical trends and intuition. Recently, the digitization of agriculture has facilitated the systematic documentation of agricultural activities (e.g., sowing, fertilization, thinning, harvesting) and meteorological data (e.g., temperature, precipitation, humidity). Due to the proliferation of data, machine learning models are increasingly being implemented. Nevertheless, meteorological data pertinent to production management phases are often excluded from current crop yield forecasting models [1,2]. The impact of weather variables can vary across these phases (for example, excessive rainfall within 2–3 days after fertilization may lead to nutrient loss, while a high temperature before harvest time can reduce yield). Ignoring these phase-specific interactions can lead to reduced prediction accuracy [1,2]. Thus, predicting crop output by clarifying the complex and dynamic relationships between phase-specific meteorological factors and yield outcomes has emerged as a critical problem for achieving agricultural sustainability [1]. Machine learning models are limited in their ability to capture the complex, nonlinear, and temporally dynamic relationships inherent in crop production systems [1,2]. For example, tree-based machine learning models like XGBoost can model nonlinearities but often lack the capacity to represent temporal dependencies and interactions between production phases and weather variables [1,2]. Consequently, these models may exclude essential phase-specific weather effects, such as elevated temperatures prior to harvest, which can substantially affect yield results. 
Deep learning (DL) models, particularly convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, have emerged as powerful tools for crop yield prediction, capable of capturing both spatial and temporal dependencies in weather data [1,2]. CNNs excel at extracting spatial features from weather data, while LSTM networks are adept at modeling temporal sequences, such as weather patterns during crop management phases. However, single-model approaches have inherent limitations: a CNN may fail to capture temporal relationships, and an LSTM may inadequately represent phase-related correlations, particularly those arising from abrupt weather events during specific production management phases [1,2]. To overcome these drawbacks, hybrid deep learning models that combine CNN and LSTM have been developed [3,4]. These models leverage the strengths of each component: the CNN extracts phase-related weather features, and the LSTM models temporal dependencies. An LSTM alone, however, treats all phases and features equally and tends to forget earlier information in long sequences. The attention mechanism (AM) addresses these limitations by dynamically weighting the significance of meteorological variables during production management phases, which enhances predictive accuracy [5]. Accordingly, AM has increasingly been integrated with CNN-LSTM models to improve yield prediction performance [5,6]. Empirical studies have demonstrated that such hybrid models significantly outperform traditional and single-model approaches [4].

Beyond CNN-LSTM hybrids, recent advances in temporal modeling have introduced alternative architectures for crop yield prediction. Transformer-based models, which leverage self-attention mechanisms to capture long-range dependencies without recurrent structures, have demonstrated promising results. For instance, MMST-ViT integrates satellite imagery and meteorological data to capture both short-term weather variations and long-term climate change effects on crop growth [7]. A Transformer-based approach for early soybean yield prediction using time-series images achieved up to 40% reduction in RMSE compared to baseline models [8]. Temporal Fusion Transformers (TFTs), which combine high-performance forecasting with interpretable feature importance, have also been adopted for wheat yield prediction [9]. Additionally, Temporal Convolutional Networks (TCNs), which employ dilated causal convolutions to efficiently capture long-range temporal patterns, have shown competitive performance. A TCN-based model for rice yield prediction using multispectral satellite data and climatic parameters has been proposed [10], while the effectiveness of combining RNN and TCN for greenhouse crop yield prediction has also been demonstrated [11]. A TCNT framework integrating TCN and Transformer architectures has been proposed for enhanced crop yield forecasting [12]. Furthermore, multimodal approaches that fuse weather data with soil properties, remote sensing indices, and IoT sensor measurements have been explored to capture more comprehensive crop–environment interactions [6,13]. While these advanced architectures offer enhanced modeling capabilities, they typically require large-scale multi-source datasets (e.g., satellite imagery, soil maps) that may not be available for all agricultural contexts. 
The present study focuses on a practical scenario using weather station data aligned with agronomic management phases, which is a more accessible data configuration for smallholder farming cooperatives.

This study proposes a hybrid deep learning model for predicting crop yield that integrates CNN, LSTM, and AM and takes weather data aligned with production management phases as input. The model captures the phase-specific relationships between meteorological variables and yield, thereby addressing the limitations described above.

2. Contribution

The main contributions of this study are summarized as follows:

(1) Phase-specific weather encoding framework: Unlike conventional approaches that aggregate meteorological variables over the entire growing season, we propose a novel framework that aligns weather data with production management phases, defined as the intervals between consecutive agronomic operations (e.g., sowing, fertilization, thinning). This phase-specific encoding enables the model to capture crop–weather interactions that are sensitive to the timing of agricultural management, which existing whole-season aggregation methods fail to represent.

(2) CNN-LSTM-AM hybrid architecture: We develop a hybrid deep learning model that integrates CNN for extracting cross-phase weather patterns, LSTM for modeling temporal dependencies across sequential growth stages, and an additive attention mechanism for dynamically weighting phase importance. This architecture outperforms single-model approaches (CNN, LSTM), simpler hybrids (CNN-LSTM), and traditional machine learning (XGBoost) by effectively capturing both local phase-level features and global temporal relationships.

(3) Empirical validation with real-world production data: The proposed model is validated using 1714 production cycles of Bok choy collected over eight years (2011–2019) from an agricultural cooperative in Yunlin County, Taiwan. The large-scale, multi-year dataset provides robust empirical evidence for model effectiveness under real agricultural conditions.

(4) Demonstrated superiority of phase-aware prediction: Experimental results show that phase-specific weather encoding improves prediction accuracy by 16.5% compared to whole-season averaging, while achieving a 58% reduction in prediction error relative to XGBoost (RMSE = 1448.24 kg/ha, MAPE = 3.60%, R² = 0.98). Moreover, progressive improvement in accuracy as harvest time approaches provides actionable insights for supply chain planning at multiple decision points.

3. Data and Models

3.1. The Data for the Crop Yield Prediction Model

Bok choy, an important vegetable crop in Taiwan, was chosen to develop and validate the proposed crop yield prediction model. The dataset comprises 1714 production cycles of Bok choy collected from November 2011 to July 2019 across 37 distinct farm plots. The production records were obtained from a prominent fruit and vegetable production cooperative located in Yunlin County, Taiwan, and the weather data from the nearby Central Weather Administration (CWA) meteorological station. Each production cycle spans approximately 25 to 50 days from sowing to harvest. Selected integrated data samples from the production records are listed in Table 1, which covers the five common production operations: sowing, fertilization I, thinning, fertilization II, and harvest. All data samples are available upon request.

Based on the existing literature, the primary meteorological factors influencing crop yield include temperature, rainfall, humidity, solar radiation, wind speed, and atmospheric pressure [1,2]. Temperature significantly affects crop development and productivity, while rainfall directly impacts soil moisture availability and nutrient uptake. Humidity influences transpiration rates and disease susceptibility [14], and solar radiation drives photosynthesis and dry matter accumulation. As total sky radiation is functionally equivalent to solar radiation under Taiwan’s climatic conditions, it was adopted in this study as a proxy measure. Additionally, average wind speed affects canopy microclimate and evapotranspiration.

A critical limitation of prior crop yield prediction models is the insufficient consideration of agronomic production processes [1,2]. In practice, the cultivation of Bok choy typically comprises five sequential operations: sowing, first fertilization (fertilization I), thinning, second fertilization (fertilization II), and harvest [14,15,16]. To capture the temporal dynamics of weather–crop interactions throughout the growing cycle, these operations were used to delineate four production management phases: Phase 1 (sowing → fertilization I), Phase 2 (fertilization I → thinning), Phase 3 (thinning → fertilization II), and Phase 4 (fertilization II → harvest). It should be noted that these phases are defined by the timing of agronomic operations rather than biological phenological stages. While Phase 1 broadly corresponds to seedling establishment and Phase 4 aligns with the rapid biomass accumulation period, the scheduling of fertilization and thinning is primarily governed by agronomic practice rather than developmental cues. This distinction has implications for model generalizability: for crops exhibiting highly variable phenology, the correspondence between management phases and phenological stages may diverge considerably.

Daily meteorological observations were obtained from the nearest Central Weather Administration (CWA) station and temporally aligned with each production management phase. For each of the six weather variables, phase-level aggregations were computed: mean values were calculated for temperature, humidity, wind speed, and atmospheric pressure, whereas rainfall and solar radiation were treated as cumulative sums to reflect total environmental inputs per phase. This procedure yielded a 24-dimensional feature vector (6 variables × 4 phases) for each production cycle, serving as the model input at production cycle t, with the corresponding output representing Bok choy yield (kg/ha). Table 2 retains the complete variable list, whereas Figure 1 presents how these variables are transformed across the integrated architecture with explicit dimensional transitions.
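The phase-level aggregation described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the variable names, the ordering of features within each phase, and the convention that each phase covers the half-open interval from one operation date up to (but excluding) the next are all assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import mean

# Variables averaged per phase vs. summed per phase (Section 3.1).
# The key names are hypothetical labels, not taken from the dataset.
MEAN_VARS = ("temperature", "humidity", "wind_speed", "pressure")
SUM_VARS = ("rainfall", "radiation")

def phase_features(daily, operation_dates):
    """Build the 24-dimensional vector (6 variables x 4 phases).

    daily: dict mapping date -> dict with the six daily weather values.
    operation_dates: five dates in chronological order (sowing,
        fertilization I, thinning, fertilization II, harvest).
    Assumption: a phase covers [start, end), so the day of the closing
    operation is counted in the following phase.
    """
    features = []
    for start, end in zip(operation_dates[:-1], operation_dates[1:]):
        days = [start + timedelta(d) for d in range((end - start).days)]
        rows = [daily[d] for d in days]
        for var in MEAN_VARS:
            features.append(mean(r[var] for r in rows))
        for var in SUM_VARS:
            features.append(sum(r[var] for r in rows))
    return features

# Synthetic example: constant weather over a 40-day cycle.
sowing = date(2018, 3, 1)
ops = [sowing + timedelta(d) for d in (0, 10, 20, 30, 40)]
daily = {
    sowing + timedelta(d): {
        "temperature": 20.0, "humidity": 80.0, "wind_speed": 2.0,
        "pressure": 1010.0, "rainfall": 1.0, "radiation": 15.0,
    }
    for d in range(40)
}
vector = phase_features(daily, ops)
```

With this synthetic input, each 10-day phase contributes four means and two sums, so `vector` has 24 entries.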

To ensure data integrity, a rigorous quality-filtering protocol was applied. Production cycles with more than one day of missing meteorological data were excluded from the dataset, while those with exactly one day of missing data were retained with linear interpolation applied to preserve temporal continuity. The percentage of missing values by variable and phase is reported in Table 3. Outliers were subsequently identified using the Interquartile Range (IQR) method and winsorized by clipping observations to the upper (Q3 + 1.5 × IQR) and lower (Q1 − 1.5 × IQR) bounds, thereby mitigating the influence of extreme values without reducing sample size. Finally, all features were standardized using Z-score normalization, which was preferred over Min–Max scaling due to its greater robustness to the heterogeneous scales of meteorological variables and the presence of potential outliers. To prevent data leakage, all normalization parameters were derived exclusively from the training set. Ablation experiments confirmed that the combined application of winsorization and Z-score normalization yielded optimal predictive performance (Appendix B), achieving RMSE = 1448.24, MAPE = 3.6%, and R² = 0.98. The pseudocode for the phase-segmentation procedure is provided in Appendix A.
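A minimal sketch of the winsorization and Z-score steps is given below, using only the Python standard library. Several details are assumptions rather than facts from the text: the quartile interpolation method, the use of the population standard deviation, and the choice to derive the IQR bounds (not only the normalization parameters) from the training set. The data values are synthetic.

```python
from statistics import mean, pstdev, quantiles

def iqr_bounds(values):
    """Return (lower, upper) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    # Assumption: 'inclusive' quartile interpolation.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def winsorize(values, lower, upper):
    """Clip each value to the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

def zscore_params(train_values):
    """Mean and (population) standard deviation of the training data."""
    return mean(train_values), pstdev(train_values)

def standardize(values, mu, sigma):
    return [(v - mu) / sigma for v in values]

# Synthetic feature column with one extreme value in each split.
train = [10, 12, 11, 13, 12, 11, 40]
held_out = [12, 50]

lower, upper = iqr_bounds(train)            # bounds from training data only
train_w = winsorize(train, lower, upper)
mu, sigma = zscore_params(train_w)          # parameters from training data only
train_z = standardize(train_w, mu, sigma)
held_out_z = standardize(winsorize(held_out, lower, upper), mu, sigma)
```

Reusing `lower`, `upper`, `mu`, and `sigma` on the held-out values is what keeps validation and test statistics out of the preprocessing.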

3.2. The Proposed Crop Yield Prediction Model CNN-LSTM-AM

This paper develops a crop yield prediction model, CNN-LSTM-AM, that integrates local feature extraction, temporal dependency modeling, and adaptive phase weighting within a single framework. The 24-dimensional phase-aligned meteorological vector is first processed by Conv1D and MaxPooling1D to extract salient cross-phase patterns, then encoded by LSTM, and finally summarized by an additive attention mechanism before regression. To avoid repetitive conceptual description, Figure 1 presents the complete integrated architecture together with the dimensional transitions across stages, while Section 3.2.1, Section 3.2.2 and Section 3.2.3 provide the mathematical details of each module.

3.2.1. CNN

The CNN first scans the weather features over phases (six weather variables associated with the four phases described in Section 3.1) and learns local patterns across consecutive phases (convolution layer). For each learned pattern, it keeps the strongest signal among consecutive phases (pooling layer). Finally, it outputs important cross-phase weather patterns for yield prediction (for example, high temperature and low humidity occurring across both the first and second phases).

The Convolution Layer

Each convolution layer contains multiple convolution kernels, and its computation is given by Equation (1).

l_{t} = \mathrm{ReLU}(x_{t} \times k_{t} + b_{t})

(1)

where l_t is the output value after convolution, ReLU is the activation function (the Rectified Linear Unit, adopted in the CNN layer to capture complex spatial patterns while maintaining training efficiency), x_t is the input vector, k_t is the weight of the convolution kernel, and b_t is the bias of the convolution kernel.

The Pooling Layer

The pooling layer processes the outputs from the convolutional layer. In this paper, a max-pooling layer is applied to the convolutional outputs for down sampling, which is represented by Equation (2).

P_{t} = \max (l_{t}, s)

(2)

where P_t is the output of the pooling layer, max(⋅) is the down-sampling function of the maximum value, l_t is the feature vector of the convolutional layer, and s is the pooling size.
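Equations (1) and (2) can be illustrated for a single filter with a short plain-Python sketch. The kernel values and input are synthetic, and the pool size of 2 is an assumption (it is consistent with a sequence length of 22 being halved to 11, but is not stated explicitly). As in most deep learning libraries, the "convolution" here slides the kernel without flipping it.

```python
def relu(v):
    return max(0.0, v)

def conv1d_valid(x, kernel, bias):
    """Single-filter 1D convolution with 'valid' padding, then ReLU (Eq. 1)."""
    k = len(kernel)
    return [
        relu(sum(x[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]

def max_pool1d(values, s):
    """Non-overlapping max pooling with pool size s (Eq. 2)."""
    return [max(values[i:i + s]) for i in range(0, len(values) - s + 1, s)]

x = [float(i % 5) for i in range(24)]       # synthetic 24-dimensional input
conv_out = conv1d_valid(x, kernel=[0.5, -1.0, 0.5], bias=0.1)
pooled = max_pool1d(conv_out, s=2)          # pool size 2 is an assumption
```

A kernel of size 3 with valid padding shortens the 24-element input to 22 positions, and pooling in pairs leaves 11.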

3.2.2. LSTM

After the CNN extracts cross-phase weather features, the LSTM processes them temporally to model the effect of phase-to-phase weather interactions on yield (for example, high temperature in early phases may reduce yield unless compensated by rainfall in later phases). The structure is represented by Equations (3)–(8) and is composed of four key components: a forget gate (f_t), an input gate (i_t), an output gate (o_t), and a memory cell (C_t, with candidate state \tilde{C}_{t}). The memory cell retains values across time intervals, while the three gates regulate the flow of information into and out of the cell. At each time step t, the cell receives input x_t and the hidden state h_{t−1} from the previous time step (t − 1). The forget gate f_t, the input gate i_t, the output gate o_t, and the candidate memory cell state \tilde{C}_{t} are calculated as follows:

f_{t} = σ (W_{f} \times [h_{t - 1}, x_{t}] + b_{f})

(3)

i_{t} = σ (W_{i} \times [h_{t - 1}, x_{t}] + b_{i})

(4)

o_{t} = σ (W_{o} \times [h_{t - 1}, x_{t}] + b_{o})

(5)

{\tilde{C}}_{t} = \tanh (W_{c} \times [h_{t - 1}, x_{t}] + b_{c})

(6)

where σ and tanh are the sigmoid and hyperbolic tangent activation functions, respectively. The weights and biases of the input gate, output gate, forget gate, and memory cell are denoted by W_i, W_o, W_f, and W_c and b_i, b_o, b_f, and b_c, respectively.

Then, the output cell state C_t and the hidden state h_t at time t can be calculated as follows:

C_{t} = f_{t} \times C_{t - 1} + i_{t} \times {\tilde{C}}_{t}

(7)

h_{t} = o_{t} \times \tanh (C_{t})

(8)
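As an illustration of Equations (3)–(8), the following plain-Python sketch performs one LSTM time step. It is a didactic version with list-based vectors, not the library implementation used in the experiments; the parameter dictionary layout and the all-zero weights in the example are assumptions chosen so the result is easy to verify by hand.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, z, b):
    """Return W z + b for a weight matrix W (list of rows) and vector z."""
    return [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (3)-(8)."""
    z = h_prev + x_t                                   # concatenation [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(params["W_f"], z, params["b_f"])]   # Eq. (3)
    i = [sigmoid(v) for v in affine(params["W_i"], z, params["b_i"])]   # Eq. (4)
    o = [sigmoid(v) for v in affine(params["W_o"], z, params["b_o"])]   # Eq. (5)
    c_tilde = [math.tanh(v)
               for v in affine(params["W_c"], z, params["b_c"])]        # Eq. (6)
    c = [ft * cp + it * ct
         for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]              # Eq. (7)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                    # Eq. (8)
    return h, c

# Example with hidden size 2, input size 3, and all-zero parameters.
hidden, inputs = 2, 3
params = {}
for gate in "fioc":
    params["W_" + gate] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
    params["b_" + gate] = [0.0] * hidden

h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the new cell state is simply half of the previous one, which makes the roles of the forget and input gates in Equation (7) easy to see.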

3.2.3. AM

The attention mechanism (AM) employed in this paper follows the additive attention formulation. Given the sequence of LSTM hidden states H = [h₁, h₂, …, h_t] where t represents the number of time steps (phases), the attention scores are computed as follows (Equations (9)–(11)):

s_{i} = \tanh (W_{h} h_{i} + b_{h})

(9)

α_{i} = \mathrm{softmax}(s_{i}) = \frac{\exp (s_{i})}{\sum_{j = 1}^{t} \exp (s_{j})}

(10)

c = \sum_{i = 1}^{t} α_{i} h_{i}

(11)

where h_i is the hidden state at time step i, W_h is the weight matrix, and b_h is the bias vector; s_i is the intermediate attention representation for time step i; α_i is the attention weight assigned to time step i, obtained by applying the softmax function to s_i; and c is the context vector representing the entire sequence.
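The attention pooling of Equations (9)–(11) can be sketched in plain Python as follows. The text does not give the shape of W_h; this sketch assumes it is a single weight vector with a scalar bias, so that each score s_i is a scalar and the softmax yields one weight per time step. The hidden states and weights are synthetic.

```python
import math

def additive_attention(H, w_h, b_h):
    """Attention pooling following Eqs. (9)-(11).

    H: list of hidden-state vectors h_1..h_t.
    w_h, b_h: assumed to be a weight vector and a scalar bias, so that
        each score s_i is a scalar.
    """
    scores = [math.tanh(sum(w * v for w, v in zip(w_h, h)) + b_h)
              for h in H]                                          # Eq. (9)
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]                             # Eq. (10)
    dim = len(H[0])
    context = [sum(a * h[k] for a, h in zip(alphas, H))
               for k in range(dim)]                                # Eq. (11)
    return alphas, context

# Three synthetic time steps with 2-dimensional hidden states.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = additive_attention(H, w_h=[1.0, 1.0], b_h=0.0)
```

The weights sum to one, and the third time step, which has the largest score, receives the largest weight in the context vector.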

The crop yield prediction model in the integrated architecture framework consists of several distinct layers, as illustrated in Figure 1.

Data Input (dimension: R^{1×24}): The input layer receives 24 meteorological features, namely six weather variables for each of the four production management phases. This initial feature vector serves as the numerical representation of the crop’s cultivation environment. Before entering the CNN layer, the data are preprocessed: missing values are handled through linear interpolation or case removal, outliers are mitigated using the IQR method, and Z-score normalization is applied.

CNN (dimension: R^{22×64}): Using 64 convolutional filters (kernel size = 3), the model maps the raw features into a high-dimensional latent space. Although the sequence length is reduced from 24 to 22 because of valid padding, the feature depth increases to 64, thereby strengthening the extraction of local spatial correlations among environmental factors.

MaxPooling1D (dimension after pooling: R^{11×64}): The strongest activations are retained while irrelevant noise is suppressed, reducing the sequence length to 11 and producing refined phase-level features for temporal modeling.

LSTM (dimension: R^{11×64}): The LSTM layer employs the tanh activation function to capture long-term dependencies along the temporal axis, encoding the cumulative effects of environmental stress over time.

Attention (dimension: R^{64}): The attention mechanism calculates a weight coefficient α_t for each time step and compresses the sequence into a context vector that represents the most critical characteristics of the entire growth cycle. At this stage, the temporal dimension is collapsed, leaving only the most informative 64-dimensional semantic features.

Regression Prediction (dimension: R^{1}): Finally, the data pass through two Dense layers integrated with dropout (rate = 0.2) to mitigate overfitting, and the model maps these refined features to a single scalar value, namely the predicted crop yield y_t.

Hyperparameter optimization was conducted using a hierarchical search strategy. Initially, the training settings, specifically the learning rate (0.001), batch size (32), and optimizer (Adam), were determined through coarse-grained testing to ensure stable convergence. Subsequently, a grid search was performed on the architectural dimensions, including the number of CNN filters, LSTM hidden units, and attention dimensions (all optimized at 64). To prevent overfitting, dropout rates were evaluated between 0.1 and 0.5, with 0.2 identified as the optimal value. The CNN and Dense layers employ the Rectified Linear Unit (ReLU) activation function to capture complex spatial patterns while maintaining training efficiency. For the temporal modeling component, the LSTM layer uses the hyperbolic tangent (tanh) activation, whose outputs lie within the [−1, 1] range, supporting numerical stability. This stage-wise refinement ensures that each model component is appropriately scaled to the complexity of the meteorological and agronomic input data. The ranges explored for hyperparameters for each model are listed in Table 4.
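The dimensional transitions listed above can be traced with a few lines of plain Python. This is only a shape calculation, not a model definition. Two details are assumptions: the pool size of 2 (inferred from the reduction from 22 to 11) and the (steps, channels) layout used to write the input shape.

```python
def conv1d_out_len(n, kernel_size):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return n - kernel_size + 1

def pool_out_len(n, pool_size):
    """Output length of non-overlapping pooling."""
    return n // pool_size

steps = 24                       # 6 weather variables x 4 phases
filters = units = 64
shapes = [("input", (steps, 1))]

steps = conv1d_out_len(steps, kernel_size=3)
shapes.append(("conv1d", (steps, filters)))

steps = pool_out_len(steps, pool_size=2)    # pool size 2 is an assumption
shapes.append(("max_pool", (steps, filters)))

shapes.append(("lstm", (steps, units)))     # one hidden state per time step
shapes.append(("attention", (units,)))      # temporal axis collapsed
shapes.append(("output", (1,)))             # predicted yield

for name, shape in shapes:
    print(f"{name:10s} {shape}")
```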

4. Results

To evaluate the effectiveness of the proposed model, it is compared with long short-term memory (LSTM), convolutional neural network (CNN), CNN integrated with LSTM (CNN-LSTM), LSTM with attention mechanism (LSTM-AM), and XGBoost. All deep learning models share the same training configuration: 100 epochs, batch size of 32, learning rate of 0.001, mean absolute error (MAE) as the loss function, and the Adam optimizer. These hyperparameters were selected based on preliminary experiments and are consistent with commonly adopted settings in the crop yield prediction literature [3,17]. For the proposed CNN-LSTM-AM model specifically, the Conv1D layer uses 64 filters with kernel size 3, the LSTM layer contains 64 units with return_sequences enabled, and dropout rates of 0.2 are applied after both the attention and Dense layers to mitigate overfitting. Table 5 summarizes the hyperparameter settings of the deep learning baseline models. The hyperparameters for XGBoost were determined according to the minimum root mean square error (RMSE) criterion through grid-search-based experimental tuning. The explored ranges included the number of boosting rounds (n_estimators = 100, 500, 1000), the maximum depth of each tree (max_depth = 3, 4, 5, 6), the learning rate (learning_rate = 0.01, 0.05, 0.1, 0.2), the proportion of training samples randomly selected for each tree (subsample = 0.6, 0.8, 1.0), the proportion of features randomly selected for each tree (colsample_bytree = 0.6, 0.8, 1.0), and the minimum loss reduction required for a further partition on a leaf node (gamma = 0, 1, 5) [18,19]. The final XGBoost setting used 1000 boosting iterations, a maximum tree depth of 5, a learning rate of 0.1, subsample = 0.8, colsample_bytree = 0.8, and gamma = 0. All models were implemented in Python 3.11.
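For reference, the XGBoost search space listed above can be enumerated with the standard library. The sketch assumes an exhaustive grid over all listed values, which the text does not state explicitly, and it only builds the candidate configurations without training any model.

```python
from itertools import product

# Explored ranges as listed in the text.
grid = {
    "n_estimators": [100, 500, 1000],
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "gamma": [0, 1, 5],
}

# Assumption: every combination is a candidate (exhaustive grid search).
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]

# Final setting reported in the text.
final = {
    "n_estimators": 1000, "max_depth": 5, "learning_rate": 0.1,
    "subsample": 0.8, "colsample_bytree": 0.8, "gamma": 0,
}
print(len(candidates), final in candidates)
```

Under this assumption the grid contains 3 × 4 × 4 × 3 × 3 × 3 = 1296 configurations, and the reported final setting is one of them.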
All experiments were run on an Intel i5-11400H 2.7 GHz CPU, an NVIDIA GeForce RTX 3060 GPU, and 16 GB of RAM under Windows 11. To build all models, the 1714 data items, from November 2011 to July 2019, were partitioned in chronological order, with the first 70% (1200 data items) used for training, the middle 15% (257 data items) for validation, and the final 15% (257 data items) for testing. Root mean square error (RMSE, Equation (12)), mean absolute percentage error (MAPE, Equation (13)), and the coefficient of determination (R², Equation (14)) were used as the evaluation criteria.
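The chronological partition can be sketched as below. The rounding rule is an assumption; it happens to reproduce the reported subset sizes for 1714 items. The integer list stands in for production cycles already sorted by date.

```python
def chronological_split(items, train_frac=0.70, val_frac=0.15):
    """Split time-ordered items into train/validation/test without shuffling."""
    n = len(items)
    n_train = round(n * train_frac)     # rounding rule is an assumption
    n_val = round(n * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

cycles = list(range(1714))              # stand-in for date-sorted cycles
train_set, val_set, test_set = chronological_split(cycles)
print(len(train_set), len(val_set), len(test_set))
```

Because the data are never shuffled, every validation cycle is later than every training cycle, and every test cycle is later than both.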

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}

(12)

\mathrm{MAPE} = \frac{100\%}{n} \sum_{i = 1}^{n} \left| \frac{{\hat{y}}_{i} - y_{i}}{y_{i}} \right|

(13)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(14)

where n is the number of testing data items, y_i is the actual value, \hat{y}_i is the predicted value, and \bar{y} is the average of the actual values.
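The three evaluation criteria can be written directly from their definitions. The sketch below uses plain Python and a small synthetic example; the numbers are illustrative and unrelated to the experimental data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (12)."""
    n = len(y_true)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, Eq. (13)."""
    n = len(y_true)
    return 100.0 / n * sum(abs((p - y) / y) for y, p in zip(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (14)."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic values for illustration only.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 300.0]
scores = (rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

Note that MAPE is undefined when an actual value is zero, which is not a concern for yields measured in kg/ha.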

Table 6 shows that the accuracy of the proposed model (RMSE = 1448.24; MAPE = 3.60%; R² = 0.98) is better than those of CNN (RMSE = 2919.18; MAPE = 8.41%; R² = 0.92), LSTM (RMSE = 2529.74; MAPE = 6.17%; R² = 0.94), CNN-LSTM (RMSE = 1516.44; MAPE = 4.21%; R² = 0.97), LSTM-AM (RMSE = 2284.64; MAPE = 6.05%; R² = 0.95), and XGBoost (RMSE = 3452.47; MAPE = 10.60%; R² = 0.87) in terms of RMSE, MAPE, and R².

Table 7 indicates that incorporating production management phases improves prediction performance. The model with production management phases achieved lower RMSE (1448.24 vs. 1734.49) and MAPE (3.60% vs. 5.38%), as well as a slightly higher R² (0.98 vs. 0.97), than the model without production management phases. These results suggest that aligning weather variables with specific agronomic phases enables the model to better capture phase-sensitive crop–weather interactions and thus improves yield prediction accuracy.
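The percentage improvements quoted in the abstract appear to correspond to relative RMSE reductions, which can be checked from the reported values. The helper name below is hypothetical; the RMSE values are those reported in Tables 6 and 7.

```python
def relative_reduction(baseline, proposed):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline - proposed) / baseline

proposed_rmse = 1448.24
vs_whole_season = relative_reduction(1734.49, proposed_rmse)   # Table 7
vs_xgboost = relative_reduction(3452.47, proposed_rmse)        # Table 6
vs_cnn_lstm = relative_reduction(1516.44, proposed_rmse)       # Table 6

print(round(vs_whole_season, 1), round(vs_xgboost, 1), round(vs_cnn_lstm, 1))
```

The first two values round to 16.5% and 58%, matching the figures stated for whole-season averaging and XGBoost; the margin over CNN-LSTM is considerably smaller.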

Table 8 compares the proposed model, which uses weather data from all four phases (RMSE = 1448.24; MAPE = 3.60%; R² = 0.98), with variants restricted to earlier phases. Accuracy declines as fewer phases are available: using only the first three phases (18 input variables; the six Phase 4 variables are ignored) gives RMSE = 2219.99, MAPE = 5.97%, and R² = 0.96; using only the first two phases (12 input variables; the 12 variables of Phases 3 and 4 are ignored) gives RMSE = 6110.22, MAPE = 18.76%, and R² = 0.69; and using only the first phase (six input variables; the 18 variables of Phases 2 to 4 are ignored) gives RMSE = 14,975.91, MAPE = 45.98%, and R² = 0.25.

5. Discussion

The proposed CNN-LSTM-AM model demonstrated strong predictive performance for Bok choy yield, achieving an RMSE of 1448.24 kg/ha, an MAPE of 3.60%, and an R² of 0.98. Compared with the baseline models, including CNN, LSTM, CNN-LSTM, LSTM-AM, and XGBoost, the proposed model consistently outperformed all alternatives across the three evaluation metrics (Table 6). In addition, the model incorporating production management phases outperformed the model without such phase information (Table 7), indicating that explicit alignment between weather variables and agronomic operations improves predictive accuracy. The results in Table 8 further show that prediction accuracy improved as later-phase weather information became available, suggesting that weather conditions closer to harvest provide particularly informative signals for yield estimation. Collectively, these findings demonstrate that integrating phase-specific meteorological information with CNN-based feature extraction, LSTM-based temporal modeling, and attention-based weighting constitutes an effective framework for crop-yield prediction. Such results are consistent with prior studies emphasizing the value of temporally structured environmental information for in-season agricultural forecasting [20,21].

As shown in Figure 2, the observed and predicted yields were closely aligned, and the residuals were randomly distributed around zero without an obvious systematic pattern. These results further support the predictive accuracy and reliability of the proposed CNN-LSTM-AM model.

To provide agronomic interpretability, attention weights were analyzed across phases and variables (Table 9). Phase 4 (fertilization II to harvest) received the highest overall attention weight (0.46), followed by Phase 3 (0.25), Phase 2 (0.16), and Phase 1 (0.13), highlighting the dominant contribution of late-stage weather conditions to yield prediction. At the variable level, daily average temperature received the highest cumulative attention weight (0.35), followed by daily accumulated total sky radiation (0.29), indicating that these variables contributed most strongly to the model’s decision process. These results suggest that the proposed model not only improves prediction accuracy but also captures agronomically meaningful patterns associated with late-stage environmental stress and yield formation.

Three representative production cycles were selected from the test set to illustrate the model’s behavior under different yield conditions. In the high-yield case, the prediction error was 1.76% (observed = 50,000 kg/ha; predicted = 49,120 kg/ha), whereas in the medium-yield case the error was 1.93% (observed = 34,515 kg/ha; predicted = 33,850 kg/ha). In the low-yield case, the error increased to 4.82% (observed = 22,371 kg/ha; predicted = 23,450 kg/ha), reflecting the greater difficulty of predicting abnormal stress conditions. This low-yield cycle was characterized by heat stress and heavy rainfall during Phase 4, and the attention mechanism likewise assigned the greatest importance to this stage, effectively signaling the yield reduction.
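The three case-level errors follow from the absolute percentage error of each cycle, as the short check below shows. The observed and predicted values are those quoted above; the dictionary keys are labels chosen for this sketch.

```python
# (observed, predicted) yields in kg/ha for the three representative cycles.
cases = {
    "high": (50000, 49120),
    "medium": (34515, 33850),
    "low": (22371, 23450),
}

# Absolute percentage error of each cycle.
ape = {name: 100.0 * abs(pred - obs) / obs
       for name, (obs, pred) in cases.items()}

for name, value in ape.items():
    print(f"{name}: {value:.2f}%")
```

The high- and medium-yield cycles are under-predicted, whereas the low-yield cycle is over-predicted, which is consistent with the model not fully capturing the yield loss under stress.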

The reliability of the proposed model is further supported by the stability analysis. Multi-run experiments across five independent random seeds showed that CNN-LSTM-AM consistently achieved the lowest RMSE (1445.83 ± 42.15), the lowest MAPE (3.62% ± 0.12%), and the highest R² (0.982 ± 0.003), while also exhibiting low variability across runs. These results suggest that the proposed architecture is not only accurate but also robust to random initialization, thereby enhancing confidence in its practical applicability.

Table 10 compares the proposed model with crop-yield prediction results reported in other studies. The MAPE of the proposed model (3.60%) is lower than those reported by Nevavuori et al. [22], Zhao et al. [23], Hara et al. [24], and Sun et al. [26], and falls near the lower end of the 3.5–9.4% range reported by Son et al. [25], although it remains higher than that reported by Joshua et al. [27]. In terms of R², the proposed model performs favorably relative to most published results and is close to the best reported values. Although its RMSE appears less favorable in cross-study comparison, this metric should be interpreted cautiously because RMSE is scale-dependent and closely related to the absolute magnitude of crop yield. Since Bok choy yield is relatively high compared with crops such as rice, soybean, wheat, cotton, and oats, a larger RMSE is not unexpected [4]. Therefore, the model’s strong performance in terms of MAPE and R² provides more meaningful evidence of its predictive competitiveness than RMSE alone.

From an application perspective, accurate yield prediction has practical value across multiple stages of the agricultural supply chain. Early predictions based on partial phase weather data can support harvest scheduling, labor allocation, and transportation planning before harvest. As more late-phase weather information becomes available, prediction accuracy improves, enabling more precise operational adjustments. In addition, reliable yield forecasts may support contract farming and market planning by reducing the risks of oversupply and shortfall. Yield predictions may also inform subsequent input decisions, such as fertilization schedules, irrigation strategies, and planting density adjustments under recurring weather patterns. More broadly, improved yield forecasting may contribute to lower post-harvest losses by aligning packaging, cold-chain logistics, and distribution capacity with expected harvest volumes, thereby supporting more sustainable supply-chain management [28].

The sustainability implications of improved prediction accuracy can be further illustrated through scenario-based analysis. Under the baseline XGBoost model (MAPE = 10.60%), a cooperative managing 100 hectares of Bok choy with an average yield of 15,000 kg/ha would face an average forecast deviation of approximately 1590 kg/ha, corresponding to 159,000 kg across the farm. In contrast, the proposed CNN-LSTM-AM model (MAPE = 3.60%) reduces the average deviation to approximately 540 kg/ha, or 54,000 kg across the same area, representing a 66% reduction in forecast uncertainty. Such improvement could potentially contribute to more sustainable practices by reducing the mismatch between harvested volume and contracted demand, improving the efficiency of fertilizer and water inputs, and enhancing transportation and cold-chain planning. These estimates remain illustrative, and their actual magnitude would depend on specific production, market, and supply-chain conditions. Nevertheless, they suggest that accurate yield forecasting has the potential to support sustainability-oriented agricultural management, though empirical validation of these sustainability outcomes is needed in future studies.
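
The scenario arithmetic above can be reproduced in a few lines. This is a minimal sketch that treats MAPE as the average relative deviation applied to the assumed average yield; the function name and its parameters are illustrative, and the inputs (100 ha, 15,000 kg/ha, and the two MAPE values) are those stated in the scenario.

```python
def forecast_deviation(mape_percent, avg_yield_kg_ha, area_ha):
    """Return (deviation per hectare, deviation over the whole area), both in kg."""
    per_hectare = mape_percent / 100.0 * avg_yield_kg_ha
    return per_hectare, per_hectare * area_ha


# Scenario inputs as stated in the text.
AREA_HA = 100
AVG_YIELD_KG_HA = 15000

xgb_per_ha, xgb_total = forecast_deviation(10.60, AVG_YIELD_KG_HA, AREA_HA)
prop_per_ha, prop_total = forecast_deviation(3.60, AVG_YIELD_KG_HA, AREA_HA)

# Relative reduction in average forecast deviation, in percent.
reduction_percent = 100.0 * (1.0 - prop_total / xgb_total)

print(f"XGBoost:    {xgb_per_ha:.0f} kg/ha, {xgb_total:.0f} kg in total")
print(f"CNN-LSTM-AM: {prop_per_ha:.0f} kg/ha, {prop_total:.0f} kg in total")
print(f"Reduction:  {reduction_percent:.0f}%")
```

Because area and average yield cancel in the ratio, the 66% reduction depends only on the two MAPE values.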

A case-specific scenario modeling analysis using a low-yield anomaly (15.3% shortfall) further suggested potential benefits, including 12.0% logistical waste reduction, 9.2% economic stability improvement, and 8.3% fertilizer-efficiency gain under adverse environmental conditions [28]. It should be emphasized that these values are derived from scenario-based projections rather than direct empirical observations, and therefore should be interpreted with caution. Future field validation studies are recommended to substantiate these estimated sustainability benefits.

Several limitations of this study should be acknowledged. First, the model comparison did not include more recent architectures such as temporal convolutional networks (TCNs) or Transformer-based models. Given the compact input structure (24 features, i.e., six weather variables over four phases), these architectures may not exhibit their full advantages here; however, future comparisons remain warranted. Second, the study used phase-level aggregated weather variables. Incorporating more detailed indicators, such as extreme temperature, rainfall intensity, or intra-phase variability, may reveal additional predictive information. Third, although the multi-seed experiments demonstrated stable performance, the analysis was based on a single chronological train–validation–test split; time-series cross-validation could provide stronger evidence of generalizability. Fourth, the model was validated on one crop (Bok choy) in one region (Yunlin County). Cross-crop and cross-region generalization remains to be demonstrated, for example through transfer learning.
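
As an illustration of the time-series cross-validation mentioned in the third limitation, the sketch below generates expanding-window splits in which every training set precedes its test fold chronologically. It is one possible scheme, not the procedure used in this study: the function name, the number of folds (5), and the minimum training size (857 cycles) are assumptions made for the example, and the cycles are assumed to be sorted by date. Only the total of 1714 cycles comes from the paper.

```python
def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training window always starts at index 0 and grows by one fold each
    time; the test fold is the block of cycles that immediately follows it.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        # The last fold absorbs the remainder so the most recent cycles are tested.
        test_end = n_samples if k == n_folds - 1 else train_end + fold_size
        yield range(train_end), range(train_end, test_end)


# 1714 production cycles (from the paper); fold settings are hypothetical.
splits = list(expanding_window_splits(n_samples=1714, n_folds=5, min_train=857))

for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold}: train 0..{train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")
```

Averaging RMSE, MAPE, and R² over such folds would show whether the reported accuracy holds across different chronological cut-offs rather than a single one.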

6. Conclusions

This paper proposes CNN-LSTM-AM, a novel hybrid deep learning model for crop yield prediction that uniquely integrates weather data with production management phases. Unlike conventional approaches that aggregate weather variables over the entire growing season, our method aligns meteorological features with specific agronomic operations (sowing, fertilization, and thinning), enabling the model to capture phase-sensitive crop–weather interactions. Developed and validated using eight years of real-world production records (1714 cycles) from an agricultural cooperative in Taiwan, the model achieves an RMSE of 1448.24 kg/ha, MAPE of 3.60%, and R² of 0.98—with RMSE reduced by 58% compared to XGBoost.

The experimental results reveal three key findings: (1) the proposed CNN-LSTM-AM consistently outperforms five baseline models (LSTM, CNN, CNN-LSTM, LSTM-AM, and XGBoost); (2) phase-specific weather aggregation reduces RMSE by 16.5% compared to whole-season averaging (from 1734.49 to 1448.24 kg/ha; Table 7); and (3) prediction accuracy improves progressively as harvest time approaches, providing actionable insights for supply chain planning at multiple decision points. These results demonstrate the critical importance of aligning weather data with crop growth stages for accurate yield forecasting.
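
The 58% and 16.5% figures quoted in this section are consistent with relative RMSE reductions computed from Tables 6 and 7. A minimal sketch of that calculation follows; the helper name and dictionary layout are illustrative, and the RMSE values are copied from the two tables.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


RMSE_PROPOSED = 1448.24  # CNN-LSTM-AM with production management phases

# Baseline RMSE values (kg/ha) from Table 6 and Table 7.
baselines = {
    "LSTM-AM": 2284.64,
    "CNN-LSTM": 1516.44,
    "LSTM": 2529.74,
    "CNN": 2919.18,
    "XGBoost": 3452.47,
    "without phases": 1734.49,
}

reductions = {
    name: relative_reduction(value, RMSE_PROPOSED) for name, value in baselines.items()
}

for name, pct in reductions.items():
    print(f"vs {name}: RMSE lower by {pct:.1f}%")
```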

Future research will explore advanced architectures such as Temporal Fusion Transformers (TFTs) to further enhance prediction accuracy and interpretability. Additionally, incorporating multi-modal sensing data, including soil moisture sensors and IoT-based microclimate monitors, may capture additional discriminative features for model refinement and cross-regional generalization.

Author Contributions

Conceptualization, S.-C.L.; Methodology, S.-C.L. and Y.-J.L.; Software, Y.-J.L.; Validation, Y.-J.L.; Formal analysis, Y.-J.L. and H.-Y.W.; Investigation, S.-C.L. and H.-Y.W.; Resources, S.-C.L.; Data curation, S.-C.L.; Writing—original draft, S.-C.L. and C.-H.C.; Writing—review and editing, C.-H.C.; Visualization, Y.-J.L. and C.-H.C.; Supervision, S.-C.L.; Project administration, S.-C.L.; Funding acquisition, S.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, grant number 111-2637-E-020-004.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The production records used in this study were obtained from a fruit and vegetable cooperative in Yunlin County, Taiwan, and the meteorological data were sourced from the Central Weather Administration (CWA, formerly Central Weather Bureau) of Taiwan. Due to cooperative confidentiality agreements and privacy considerations, the raw production data are not publicly deposited in an open repository. However, the processed and anonymized dataset is available from the corresponding author upon reasonable request for academic research purposes. The meteorological data can be accessed through the CWA open data platform (https://opendata.cwa.gov.tw/, accessed on 11 March 2026).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Phase-Segmentation Pseudocode

Algorithm A1: Agricultural_Phase_Alignment_and_Preprocessing

Input:
  - Weather_Data: Daily records of (Temp, Humidity, Pressure, Wind Speed, Rainfall, Solar Radiation)
  - Production_Log: List of cycles with dates (Sowing, Fertilization I, Thinning, Fertilization II, Harvest)

Output:
  - Dataset (Final_Feature_Matrix): Normalized 24-feature vectors, one per retained production cycle

Dataset = empty list
FOR each Production_Cycle in Production_Log:
    # Define mutually exclusive intervals; each phase ends the day before the next operation
    Phase_Intervals = [
        (Sowing, Fertilization I − 1),
        (Fertilization I, Thinning − 1),
        (Thinning, Fertilization II − 1),
        (Fertilization II, Harvest − 1)]

    # Quality filtering: missing-data check
    IF any phase has Missing_Days > 1:
        Exclude(Production_Cycle)
        CONTINUE
    ELSE IF any phase has Missing_Days == 1:
        Apply Linear_Interpolation(Weather_Data)
    END IF

    # Feature aggregation per phase
    FOR each Phase(i) in Phase_Intervals:
        # Mean-type: Temp, Humidity, Pressure, Wind Speed
        M[i][Mean_Type] = Average(Weather_Data within Phase(i))
        # Cumulative-type: Rainfall, Solar Radiation
        M[i][Sum_Type] = Total_Sum(Weather_Data within Phase(i))
    END FOR
    APPEND M to Dataset
END FOR

# Outlier management (IQR winsorization)
FOR each Feature in Dataset:
    IQR = Q3 − Q1
    Lower_Bound = Q1 − 1.5 × IQR
    Upper_Bound = Q3 + 1.5 × IQR
    CLIP(Feature) to [Lower_Bound, Upper_Bound]
END FOR

# Standardization
APPLY Z-score normalization (X_std = (X − μ)/σ) to all features

RETURN Dataset
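
Algorithm A1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: all function and key names are hypothetical, missing days are counted per phase, a single missing day is filled with the mean of its two neighbouring days, quartiles use the "inclusive" method of `statistics.quantiles`, and the Z-score uses the population standard deviation. Operation dates are assumed to be strictly increasing.

```python
from datetime import date, timedelta
from statistics import mean, pstdev, quantiles

MEAN_VARS = ("temp", "humidity", "pressure", "wind")  # averaged per phase
SUM_VARS = ("rainfall", "radiation")                  # accumulated per phase


def phase_intervals(ops):
    """Split five ordered operation dates into four non-overlapping phases."""
    return [(ops[i], ops[i + 1] - timedelta(days=1)) for i in range(4)]


def fill_single_gap(weather, day):
    """Interpolate one missing day from its two neighbours (assumed scheme)."""
    before = weather.get(day - timedelta(days=1))
    after = weather.get(day + timedelta(days=1))
    if before is None or after is None:
        return None
    return {v: (before[v] + after[v]) / 2 for v in MEAN_VARS + SUM_VARS}


def cycle_features(weather, ops):
    """Return the 24 phase-level features of one cycle, or None if excluded."""
    features = []
    for start, end in phase_intervals(ops):
        days = [start + timedelta(days=k) for k in range((end - start).days + 1)]
        if sum(1 for d in days if d not in weather) > 1:
            return None  # more than one missing day in a phase: exclude cycle
        records = []
        for d in days:
            rec = weather.get(d) or fill_single_gap(weather, d)
            if rec is None:
                return None  # the gap cannot be interpolated
            records.append(rec)
        features += [mean(r[v] for r in records) for v in MEAN_VARS]
        features += [sum(r[v] for r in records) for v in SUM_VARS]
    return features


def winsorize(column):
    """Clip a feature column to [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [min(max(x, low), high) for x in column]


def zscore(column):
    """Standardize a column; constant columns map to zeros."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma > 0 else 0.0 for x in column]


def build_matrix(weather, production_log):
    """Aggregate every retained cycle, then winsorize and standardize by column."""
    rows = []
    for ops in production_log:
        feats = cycle_features(weather, ops)
        if feats is not None:
            rows.append(feats)
    columns = [zscore(winsorize(list(col))) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

In this sketch, `weather` is a dictionary that maps each `date` to a dictionary of the six daily variables, and each entry of `production_log` is a list of the five operation dates of one cycle. Within each phase the features are ordered as the four mean-type variables followed by the two cumulative-type variables, which is a layout chosen for the example.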

Appendix B. Preprocessing Ablation Study

Configuration	RMSE (kg/ha)	MAPE (%)	R²
Full model	1448.24	3.60	0.980
Without winsorization	2346.15	5.84	0.941
Without Z-score	2819.23	7.12	0.915
No preprocessing	3965.86	10.25	0.864

References

1. Jabed, M.; Murad, M. Crop yield prediction in agriculture: A comprehensive review of machine learning and deep learning approaches, with insights for future research and sustainability. Heliyon 2024, 10, e40836.
2. Klompenburg, T.V.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Comput. Electron. Agric. 2020, 177, 105709.
3. Khaki, S.; Wang, L.; Archontoulis, S. A CNN-RNN framework for crop yield prediction. Front. Plant Sci. 2019, 10, 1750.
4. Oikonomidis, A.; Catal, C.; Kassahun, A. Hybrid deep learning-based models for crop yield prediction. Appl. Artif. Intell. 2022, 36, e2031823.
5. Liu, F.; Jiang, X.; Wu, Z. Attention mechanism-combined LSTM for grain yield prediction in China using multi-source satellite imagery. Sustainability 2023, 15, 9210.
6. Li, C.; Zhang, L.; Wu, X.; Chai, H.; Xiang, H.; Jiao, Y. Winter wheat yield estimation by fusing CNN-MALSTM deep learning with remote sensing indices. Agriculture 2024, 14, 1961.
7. Lin, F.; Crawford, S.; Guillot, K.; Zhang, Y.; Chen, Y.; Yuan, X.; Chen, L.; Williams, S.; Minvielle, R.; Xiao, X.; et al. MMST-ViT: Climate change-aware crop yield prediction via multi-modal spatial-temporal vision transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 5774–5784.
8. Bi, L.; Wally, O.; Hu, G.; Tenuta, A.U.; Kandel, Y.R.; Mueller, D.S. A transformer-based approach for early prediction of soybean yield using time-series images. Front. Plant Sci. 2023, 14, 1173036.
9. Junankar, T.; Sondhi, J.K.; Nair, A.M. Wheat yield prediction using temporal fusion transformers. In Proceedings of the 2nd International Conference for Innovation in Technology (INOCON), Bangalore, India, 3–5 March 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
10. Mohan, A.; Venkatesan, M.; Prabhavathy, P.; Jayakrishnan, A. Temporal convolutional network based rice crop yield prediction using multispectral satellite data. Infrared Phys. Technol. 2023, 135, 104960.
11. Gong, L.; Yu, M.; Jiang, S.; Cutsuridis, V.; Pearson, S. Deep learning based prediction on greenhouse crop yield combined TCN and RNN. Sensors 2021, 21, 4537.
12. Osibo, B.K.; Ma, T.; Bediako-Kyeremeh, B.; Mamelona, L.; Darbinian, K. TCNT: A temporal convolutional network-transformer framework for advanced crop yield prediction. J. Appl. Remote Sens. 2024, 18, 044513.
13. Lu, J.; Li, J.; Fu, H.; Tang, X.; Liu, Z.; Chen, H.; Sun, Y.; Ning, X. Deep learning for multi-source data-driven crop yield prediction in northeast China. Agriculture 2024, 14, 794.
14. Dong, C.; Peng, X.; Yang, X.; Wang, C.; Yuan, L.; Chen, G.; Tang, X.; Wang, W.; Wu, J.; Zhu, S.; et al. Physiological and transcriptomic responses of Bok Choy to heat stress. Plants 2024, 13, 1093.
15. Kano, K.; Kitazawa, H.; Suzuki, K.; Widiastuti, A.; Odani, H.; Zhou, S.; Chinta, Y.; Eguchi, Y.; Shinohara, M.; Sato, T. Effects of organic fertilizer on Bok Choy growth and quality in hydroponic cultures. Agronomy 2021, 11, 491.
16. Pan, J.; Peng, K.; Ruan, R.; Liu, Y.; Cui, X. Impact of anaerobic fermentation liquid on Bok Choy and mechanism of combined vitamin C from Bok Choy and Allicin in treatment of DSS colitis. Foods 2025, 14, 785.
17. Sun, J.; Di, L.; Sun, Z.; Shen, Y.; Lai, Z. County-level soybean yield prediction using deep CNN-LSTM model. Sensors 2019, 19, 4363.
18. Chen, T.Q.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
19. Huber, F.; Yushchenko, A.; Stratmann, B.; Steinhage, V. Extreme gradient boosting for yield estimation compared with deep learning approaches. Comput. Electron. Agric. 2022, 202, 107346.
20. Togliatti, K.; Archontoulis, S.V.; Dietzel, R.; Puntel, L.; VanLoocke, A. How does inclusion of weather forecasting impact in-season crop model predictions? Field Crops Res. 2017, 214, 261–272.
21. Peng, D.; Cheng, E.; Feng, X.; Hu, J.; Lou, Z.; Zhang, H.; Zhao, B.; Lv, Y.; Peng, H.; Zhang, B. A deep-learning network for wheat yield prediction combining weather forecasts and remote sensing data. Remote Sens. 2024, 16, 3613.
22. Nevavuori, P.; Narra, N.; Lipping, T. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 2019, 163, 104859.
23. Zhao, Y.; Potgieter, A.; Zhang, M.; Wu, B.; Hammer, G. Predicting wheat yield at the field scale by combining high-resolution Sentinel-2 satellite imagery and crop modelling. Remote Sens. 2020, 12, 1024.
24. Hara, P.; Piekutowska, M.; Niedbała, G. Selection of independent variables for crop yield prediction using artificial neural network models with remote sensing data. Land 2021, 10, 609.
25. Son, N.; Chen, C.; Cheng, Y.; Toscano, P.; Chen, C.; Chen, S.; Tseng, K.; Syu, C.; Guo, H.; Zhang, Y. Field-scale rice yield prediction from Sentinel-2 monthly image composites using machine learning algorithms. Ecol. Inform. 2022, 69, 101618.
26. Sun, J.; Tian, P.; Li, Z.; Wang, X.; Zhang, H.; Chen, J.; Qian, Y. Construction and optimization of integrated yield prediction model based on phenotypic characteristics of rice grown in small-scale plantations. Agriculture 2025, 15, 181.
27. Joshua, V.; Priyadharson, S.; Kannadasan, R. Exploration of machine learning approaches for paddy yield prediction in eastern part of Tamilnadu. Agronomy 2021, 11, 2068.
28. Behzadi, G.; O’Sullivan, M.J.; Olsen, T.L.; Zhang, A. Agribusiness supply chain risk management: A review of quantitative decision models. Omega 2018, 79, 21–42.

Figure 1. The integrated architecture framework of the proposed CNN-LSTM-AM model, illustrating the data flow and dimension complexity at each stage.

Figure 2. Model evaluation plots (N = 257). (a) Observed vs. predicted yield (parity plot) with the 1:1 line showing high alignment. (b) Residual plot showing a stochastic distribution without systematic bias.

Table 1. Some selected integrated data samples for model training, validation and testing.

Date	Temperature	Rainfall	Humidity	Accumulated Total Sky Radiation	Wind Speed	Atmospheric Pressure	Production Record	Yield in This Production Cycle (kg/ha)
11 June 2019	27.4	2	76.0	4.4	210.8	1001.2	Sowing	16,494.9
12 June 2019	25.8	0	84.8	2.5	126.7	1002.2
13 June 2019	28.0	8.5	79.0	12.1	175.4	997.7
14 June 2019	26.7	0	81.9	7.2	146.7	997.7
15 June 2019	27.9	0	70.0	22.9	122.1	1000.7
16 June 2019	27.5	0	74.6	17.1	133.3	1003.2
17 June 2019	28.0	0	78.8	15.8	162.1	1005.6	Fertilization I

Table 2. The input and output variables.

Input variables x_t

(1) Average temperature in the first phase
(2) Average temperature in the second phase
(3) Average temperature in the third phase
(4) Average temperature in the fourth phase
(5) Accumulated rainfall in the first phase
(6) Accumulated rainfall in the second phase
(7) Accumulated rainfall in the third phase
(8) Accumulated rainfall in the fourth phase
(9) Average humidity in the first phase
(10) Average humidity in the second phase
(11) Average humidity in the third phase
(12) Average humidity in the fourth phase
(13) Accumulated total sky radiation in the first phase
(14) Accumulated total sky radiation in the second phase
(15) Accumulated total sky radiation in the third phase
(16) Accumulated total sky radiation in the fourth phase
(17) Average wind speed in the first phase
(18) Average wind speed in the second phase
(19) Average wind speed in the third phase
(20) Average wind speed in the fourth phase
(21) Average atmospheric pressure in the first phase
(22) Average atmospheric pressure in the second phase
(23) Average atmospheric pressure in the third phase
(24) Average atmospheric pressure in the fourth phase

Output variable y_t

(1) Crop yield at production cycle t (kg/ha)

Table 3. The percentage of missing values by variable and phase.

Weather Variables	Phase 1	Phase 2	Phase 3	Phase 4
Temperature	0.03%	0.02%	0.01%	0.02%
Humidity	0.12%	0.08%	0.15%	0.10%
Wind speed	0.25%	0.30%	0.22%	0.28%
Atmospheric pressure	0.04%	0.03%	0.05%	0.02%
Rainfall	0.05%	0.02%	0.04%	0.03%
Accumulated total sky radiation	0.85%	0.72%	0.90%	0.88%

Table 4. The ranges for hyperparameters for the proposed model and other deep learning models.

Category	Hyperparameter	Search Space
Optimization	Learning Rate	10⁻², 10⁻³, 10⁻⁴
	Batch Size	16, 32, 64
	Optimizer	Adam, RMSprop, SGD
CNN	Filters/Kernel/Activation function	32, 64, 128/3, 5/ReLU, Tanh
LSTM/Attention	Units/Activation function	32, 64, 128/ReLU, Sigmoid, Tanh
Regularization	Dropout Rate	0.1–0.5

Table 5. The hyperparameter setting for deep learning baseline models.

Model	Hyperparameter
CNN	filters = 64/kernel sizes = 3/pool_size = 2/activation function = ReLU
LSTM	units = 64/activation function = tanh
CNN-LSTM	CNN: filters = 64/kernel sizes = 3/pool_size = 2/activation function = ReLU
CNN-LSTM	LSTM: units = 64/activation function = tanh
LSTM-AM	LSTM: units = 64/activation function = tanh
LSTM-AM	AM: 64, use attention = True

Table 6. Comparison of prediction models.

Model	RMSE (kg/ha)	MAPE (%)	R²
The proposed model	1448.24	3.60	0.98
LSTM-AM	2284.64	6.05	0.95
CNN-LSTM	1516.44	4.21	0.97
LSTM	2529.74	6.17	0.94
CNN	2919.18	8.41	0.92
XGBoost	3452.47	10.60	0.87

Table 7. Comparison of prediction models with and without production management phases.

Model	RMSE (kg/ha)	MAPE (%)	R²
The model with production management phases	1448.24	3.60	0.98
The model without production management phases	1734.49	5.38	0.97

Table 8. Comparison of prediction models with different-phase weather data.

Model	RMSE (kg/ha)	MAPE (%)	R²
The model with weather data from all four phases (phases 1–4)	1448.24	3.60	0.98
The model with weather data from the first three phases (phases 1–3)	2219.99	5.97	0.96
The model with weather data from the first two phases (phases 1–2)	6110.22	18.76	0.69
The model with weather data from the first phase only (phase 1)	14,975.91	45.98	0.25

Table 9. Attention weights (averages) for the proposed model by phase and by variable.

Weather Variables	Phase 1	Phase 2	Phase 3	Phase 4	Variable Total
Daily average temperature	0.04	0.05	0.08	0.18	0.35
Daily average humidity	0.03	0.03	0.04	0.05	0.15
Daily average wind speed	0.01	0.01	0.01	0.01	0.04
Daily average atmospheric pressure	0.01	0.01	0.01	0.02	0.05
Daily accumulated rainfall	0.02	0.02	0.03	0.05	0.12
Daily accumulated total sky radiation	0.02	0.04	0.08	0.15	0.29
Phase total	0.13	0.16	0.25	0.46	1

Table 10. Comparison of crop yield prediction in different studies.

Source	Crop	Prediction Method	Input	Metric for Methods
This paper	Bok choy	CNN-LSTM-AM	Weather data associated with production management phases	RMSE = 1448.24 kg/ha, MAPE = 3.60%, R² = 0.98
Nevavuori et al. [22]	Wheat	CNN	UAV imagery (NDVI, RGB)	MAPE = 8.8% (early), 12.6% (late)
Zhao et al. [23]	Wheat	Sentinel-2 indices + crop model	Satellite indices (OSAVI, CI), crop water stress	R² = 0.91, RMSE = 0.54 t/ha, MAPE = 10–59%
Hara et al. [24]	Rapeseed	ANN (MLP)	Weather, soil, management	MAPE = 9.43%
Joshua et al. [27]	Paddy	GRNN, SVR, RBFNN, BPNN	Weather, soil, fertilizer, nutrients	GRNN: RMSE = 0.2295, MAPE = 1.34%, R² = 0.9863
Oikonomidis et al. [4]	Soybean	CNN-DNN (Hybrid Deep Learning)	Weather, soil (395 features)	RMSE = 0.266, MAE = 0.199, R² = 0.87
Son et al. [25]	Rice	SVM, RF, ANN	Sentinel-2 satellite imagery	SVM: MAPE = 3.5–9.4%, RMSPE = 4.7–11.2%
Sun et al. [26]	Rice	Stacking Ensemble (RF, SVM, MLP, etc.)	Phenotypic traits (panicle angle, length, etc.)	RMSE = 0.2483, MAPE = 6.90%, R² = 0.9250

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, S.-C.; Lin, Y.-J.; Chung, C.-H.; Wen, H.-Y. A Hybrid Deep Learning Model for Crop Yield Prediction Taking Weather Data Associated with Production Management Phases as Input. Sustainability 2026, 18, 3806. https://doi.org/10.3390/su18083806
