Prediction of Winter Wheat Yield and Interpretable Accuracy Under Different Water and Nitrogen Treatments Based on CNNResNet-50

Wang, Donglin; Cheng, Yuhan; Shi, Longfei; Yin, Huiqing; Yang, Guangguang; Liu, Shaobo; Dong, Qinge; Ge, Jiankun

doi:10.3390/agronomy15071755


by Donglin Wang ¹,², Yuhan Cheng ¹, Longfei Shi ¹, Huiqing Yin ¹, Guangguang Yang ³, Shaobo Liu ²,*, Qinge Dong ⁴ and Jiankun Ge ⁵

¹ College of Water Conservancy, North China University of Water Resources and Electric Power, Zhengzhou 450045, China

² School of Water Resources and Environment Engineering, Nanyang Normal University, Nanyang 473061, China

³ School of Computing, University of Portsmouth, Portsmouth PO1 3HE, UK

⁴ Institute of Water-Saving Agriculture in Arid Areas of China (IWSA), Northwest A&F University, Yangling 712100, China

⁵ Henan Key Laboratory of Water-Saving Agriculture, Zhengzhou 450045, China

* Author to whom correspondence should be addressed.

Agronomy 2025, 15(7), 1755; https://doi.org/10.3390/agronomy15071755

Submission received: 29 May 2025 / Revised: 30 June 2025 / Accepted: 18 July 2025 / Published: 21 July 2025

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Winter wheat yield prediction is critical for optimizing field management plans and guiding agricultural production. To address the limitations of conventional manual yield estimation methods, including low efficiency and poor interpretability, this study innovatively proposes an intelligent yield estimation method based on a convolutional neural network (CNN). A comprehensive two-factor (fertilization × irrigation) controlled field experiment was designed to thoroughly validate the applicability and effectiveness of this method. The experimental design comprised two irrigation treatments, sufficient irrigation (C) at 750 m³ ha⁻¹ and deficit irrigation (M) at 450 m³ ha⁻¹, along with five fertilization treatments (at a rate of 180 kg N ha⁻¹): (1) organic fertilizer alone, (2) organic–inorganic fertilizer blend at a 7:3 ratio, (3) organic–inorganic fertilizer blend at a 3:7 ratio, (4) inorganic fertilizer alone, and (5) no fertilizer control. The experimental protocol employed a DJI M300 RTK unmanned aerial vehicle (UAV) equipped with a multispectral sensor to systematically acquire high-resolution growth imagery of winter wheat across critical phenological stages, from heading to maturity. The acquired multispectral imagery was meticulously annotated using the Labelme professional annotation tool to construct a comprehensive experimental dataset comprising over 2000 labeled images. These annotated data were subsequently employed to train an enhanced CNN model based on ResNet50 architecture, which achieved automated generation of panicle density maps and precise panicle counting, thereby realizing yield prediction. 
Field experimental results demonstrated significant yield variations among fertilization treatments under sufficient irrigation, with the 3:7 organic–inorganic blend achieving the highest actual yield (9363.38 ± 468.17 kg ha⁻¹) and significantly outperforming the other treatments (p < 0.05), confirming the synergistic effects of optimized nitrogen and water management. The enhanced CNN model achieved an average accuracy of 89.0–92.1%, a 3.0% improvement over YOLOv8. Notably, model accuracy was significantly correlated with yield level (p < 0.05), suggesting that panicle morphological features are more distinct in high-yield plots and therefore easier for the model to identify. The CNN's yield predictions agreed closely with the measured values, with mean relative errors below 10%. The lowest errors were observed for organic fertilizer with sufficient irrigation (5.5% error) and the 7:3 organic–inorganic blend with sufficient irrigation (8.0% error), indicating that the CNN is more suitable for these management regimes. These findings provide a robust technical foundation for precision farming applications in winter wheat production. Future research will focus on integrating this technology into smart agricultural management systems to enable real-time, data-driven decision making at the farm scale.

Keywords:

water–nitrogen coupling; CNN; wheat spike detection; yield estimation; interpretable accuracy

1. Introduction

1.1. Challenges of Water–Fertilizer Management in the Context of Food Security

Wheat (Triticum aestivum L.), as China’s second most crucial food crop, plays a pivotal role in safeguarding national food security due to its direct impact on grain supply stability [1]. Henan Province contributes over 25% of China’s total winter wheat yield [2], despite its relatively low annual precipitation (500–700 mm) and significant seasonal temperature fluctuations (±15 °C monthly variation); these conditions underpin consistent food production [3]. However, the current situation in the North China Plain is characterized by severe groundwater over-exploitation and low nitrogen use efficiency (NUE) [4,5]. Research has demonstrated that moderate water deficit (60% of field capacity, FC) combined with 20% nitrogen reduction can increase water use efficiency (WUE) [6]. Furthermore, delayed nitrogen application technology (base-dressing ratio of 4:6) has been shown to enhance photosynthetic rate and dry matter accumulation during the grain-filling stage [7,8]. The conventional “high-input, high-water” extensive management not only results in significant resource wastage but also leads to severe soil degradation and non-point source pollution [9]. Consequently, achieving synergistic water–fertilizer efficiency has emerged as a critical research priority in modern agricultural engineering.

1.2. Comparative Analysis of Major Crop Yield Prediction Methods

Establishing a high-precision wheat yield prediction system holds significant strategic importance for ensuring national food security and promoting sustainable global agricultural development [10,11]. Accurate and timely prediction of winter wheat yield facilitates the regulation of grain market prices and provides scientific support for agricultural production planning, national economic strategies, and macro-level policy decisions [12,13,14]. Current mainstream crop yield prediction methodologies are primarily categorized into three approaches: manual field surveys, crop growth models, and machine learning algorithms [15]. Among these, traditional field sampling methods exhibit significant limitations, including insufficient survey accuracy, low operational efficiency, and inadequate responsiveness to dynamic crop growth variations. To enhance prediction accuracy, researchers have developed multi-source data fusion approaches that integrate remote sensing data with crop growth models (e.g., DSSAT, APSIM) through model-data assimilation techniques, significantly improving regional-scale estimation precision [16,17,18]. The application of sensitivity analysis methods (Morris, FAST, Sobol, EFAST) for key parameter screening has proven effective in boosting model efficiency. As demonstrated by Wang et al. (2022), multi-temporal UAV remote sensing imagery enabled precise identification of winter wheat’s critical growth stages [19]. This approach facilitated the construction of a yield estimation model that optimized both growth stage selection and algorithmic performance.

The advancement of object detection technologies has created new opportunities for crop phenotyping analysis. In recent years, the integration of machine learning methods with remote sensing data has been widely adopted in crop yield estimation research, while deep learning-based object detection techniques have also achieved significant breakthroughs [20]. Several of these detection methods are now widely applied in agriculture. Faster R-CNN, leveraging its region proposal network (RPN) mechanism, demonstrates superior accuracy in fine-grained recognition tasks such as wheat head counting [21]. YOLOv8 achieves real-time processing at 45 FPS, making it well suited to the dynamic field monitoring of crop diseases and pests [22]. The YOLO series has continuously improved efficiency through architectural iterations [23], while Mask R-CNN excels in instance segmentation tasks such as grain counting per spike due to its added segmentation branch [24]. Recent studies have also made significant progress in crop phenotyping detection algorithms. Wang et al. (2019) and Shelhamer et al. (2016) developed wheat head detection methods combining corner detection with fully convolutional layers [24,25], while Zhang et al. (2019) improved recognition accuracy by integrating convolutional neural networks with non-maximum suppression [26], which is consistent with the findings of Neubeck et al. (2006) [27]. Patel et al. (2023) applied CNN models for eggplant weed detection [28], and Chen et al. (2024) enhanced YOLOv7's feature extraction network by incorporating switchable atrous convolution in the MP module, enabling multi-scale wheat head detection and effectively addressing occlusion challenges [29]. However, current models exhibit limited generalization across regions [27] and require extensive annotated samples [28], often leading to reduced accuracy. 
To address these limitations, recent work has introduced evolving convolution (EVC) incremental learning modules at feature fusion points (FFP), reducing parameters while improving model generalization [29,30,31]. Comparative studies of ResNet-18, YOLOv3, CenterNet, and Faster R-CNN show that CenterNet achieves the highest mean average precision (0.88), though wheat head recognition accuracy still requires improvement. While continuous optimization of these algorithms has significantly improved detection performance in agricultural applications, challenges remain in model generalization and computational efficiency. Future research directions include combining hyperspectral and LiDAR data with deep learning and coupling these approaches with crop physiological models to enhance performance and applicability.

1.3. Development and Innovative Applications of CNN in Crop Yield Prediction

Convolutional neural networks (CNNs), as a type of deep feedforward neural network [32], demonstrate significant potential in yield prediction by effectively capturing local features of input data through localized receptive fields. Their technological evolution can be divided into three distinct phases. Early research primarily focused on processing single data sources. For instance, Milioto et al. (2018) developed an encoder–decoder CNN architecture based on vegetation indices, achieving real-time crop type identification [33]. During the temporal feature extraction phase, Zhou et al. (2019) employed CNNs to directly extract features from time-series imagery, circumventing the limitations of manual feature engineering, and subsequently used fully connected networks to predict winter wheat yields in northern China, with promising results [34]. However, such models often struggled to integrate key agronomic characteristics like spatial heterogeneity. During the multi-modal fusion phase, Liu et al. (2022) innovatively combined CNNs with Gaussian process regression (GPR), incorporating SENet channel attention mechanisms to weight features and fusing spatial location information, which significantly improved the accuracy of wolfberry yield predictions [35]. These models have been widely used in various fields such as remote sensing and agriculture. These CNN advancements complement the object detection methods discussed earlier (e.g., YOLOv7 for wheat heads), forming a comprehensive toolkit for crop monitoring from organ-level to field-scale. Although CNN models exhibit notable advantages in detection accuracy (average improvement of 35%) and computational efficiency (inference speeds reaching 28 FPS), challenges remain in terms of small-sample generalization and cross-regional adaptability. 
These groundbreaking advancements signify that agricultural monitoring has transitioned from traditional empirical judgment to a new era of intelligent decision making, providing precision agriculture with a multi-scale analytical toolkit spanning from organ-level to regional-scale observations. Recent studies indicate that the integration of CNNs with process-based models (e.g., CNN-DSSAT hybrid systems) may represent a critical pathway for addressing the persistent challenge of model interpretability in agricultural AI applications.

This study proposes an improved ResNet50-based CNN approach for evaluating water–nitrogen coupling effects in winter wheat, aiming to achieve end-to-end precision monitoring from spike identification to yield prediction through deep learning. The methodology comprises three key components: (1) construction of 10 water–nitrogen treatment combinations (2 irrigation levels × 5 nitrogen gradients) and training on 2000 manually annotated spike images to generate canopy spike density maps; (2) integration of spike count features with multispectral data and environmental sensor inputs to develop yield prediction models, with particular focus on canopy response mechanisms to water-nitrogen regulation during stem elongation and grain filling stages; and (3) comparative analysis between CNN-predicted and field-measured yields, employing dual evaluation metrics of water use efficiency (WUE) and nitrogen use efficiency (NUE) to identify optimal water–nitrogen management strategies. The proposed system enhances resource use efficiency while maintaining yield stability, providing technical support for sustainable winter wheat production and green agricultural development in the North China Plain.

2. Materials and Methods

2.1. Field Experimental Condition and Design

This study conducted field experiments from October 2022 to May 2023 at the Longzihu Campus of the North China University of Water Resources and Electric Power (34.78° N, 113.76° E, altitude 110 m) in Zhengzhou City, Henan Province. The experimental site is located between the Yellow River and Huai River basins (112°42′–114°14′ E, 34°16′–34°58′ N), characterized by a typical northern temperate continental monsoon climate, with an average annual temperature of 14.5 °C (regional range: 13–18 °C) and average annual precipitation of 637.1 mm (regional range: 542–1100 mm), of which 70% occurs from June to September. The average daily sunshine duration is 6.57 h, with a frost-free period of 220 days.

The experimental field features flat terrain with clay loam soil. The composition of the 0–60 cm soil layer is as follows: organic matter: 870 mg/kg; available potassium: 104.4 mg/kg; available phosphorus: 11.8 mg/kg; total nitrogen: 539 mg/kg; available nitrogen: 45–60 mg/kg. Detailed physicochemical properties are shown in Table 1. The experiment employed a two-factor (irrigation × fertilization) design with 10 treatment groups (Table 2). Among them, the compound fertilizer used in the experiment was produced by Jiangxi Kaimenzi Fertilizer Industry Co., LTD, Jingdezhen, China. The organic fertilizer selected was a highly active organic fertilizer supported by the Institute of Biology of Beijing Academy of Agricultural Sciences, China, with an organic matter content of ≥60% and uniform contents of nitrogen, phosphorus and potassium ≥6%. Each experimental plot measured 4 m × 3 m. Winter wheat, as the primary crop in the study area, was sown from late September to early October 2022, entered the overwintering stage from late November to early December, reached the heading stage from late April to early May 2023, and was harvested from late May to early June (see Figure 1 for the study area overview).

2.2. Construction of Experimental Remote Sensing Image Dataset

This study cultivated the semi-winter medium-maturing wheat cultivar ‘Jimai 22’ (Hebei Letu Seed Industry Co., LTD, Shijiazhuang, China) in October 2023 and manually recorded key agronomic parameters, including effective panicle number, grains per spike, 1000-grain weight, theoretical yield, and actual yield, from 10 experimental plots during harvest in June 2024. A DJI M300 RTK (Shenzhen DJI Innovation Technology Co., LTD, Shenzhen, China) drone equipped with a multispectral sensor system was employed to acquire high-resolution growth imagery during critical phenological stages, from heading to maturity. The UAV was operated at an altitude of 30 m (achieving a ground resolution of 0.5 cm/pixel), with an 80% forward overlap and a flight speed of 5 m/s. To ensure image data consistency, a four-red-line marking method was implemented to delineate uniform sampling areas (Figure 2a). Between 10 April and 29 May 2024, a total of 2000 images (4096 × 3072 pixel resolution) were systematically captured during clear-sky midday periods (12:00–14:00) across four growth stages (stem elongation, heading, grain filling, and maturity), serving as the CNN model training dataset. Supervised learning was adopted for spike identification, with all 2000 training images manually annotated using the LabelMe annotation tool (http://labelme.csail.mit.edu/Release3.0/, accessed on 10 May 2024) to mark wheat spikes (Figure 2b). LabelMe is a Python-based image annotation tool; version 3.16.7 was used in this study.

Feature information was extracted from 10 remote sensing images covering the whole wheat growth cycle in 2023–2024 in the experimental fields under different water and nitrogen treatments. Target bounding boxes and class labels were generated, and a regressor was used to refine the positions of the detected objects. In total, 3400 data samples were obtained. The 3400 samples from 2023 to 2024 served as the training set and the 2000 samples collected from 10 April 2024 to 29 May 2024 served as the test set for the wheat ear recognition model, giving a training-to-test ratio of 17:10.

2.3. Network Structure Design

2.3.1. Experimental Image Dataset

The dataset constructed in this study comprises 2000 high-quality images containing over 14,000 uniquely annotated wheat spike samples, with each image accompanied by precise bounding box annotations. This comprehensive dataset captures diverse field conditions throughout winter wheat growth and development: (1) multiple maturity stages (from heading to maturity), (2) varied phenotypic characteristics (including color and grain orientation), (3) five fertilization treatments (organic fertilizer alone, organic–inorganic fertilizer at a 7:3 ratio, organic–inorganic fertilizer at a 3:7 ratio, chemical fertilizer alone, and no fertilizer), (4) two irrigation modes (full irrigation, LC1–LC5, and deficit irrigation, LM1–LM5), and (5) different cultivars and planting densities. The dataset was systematically divided into a training set (3400 images) and a test set (2000 images), all acquired from rigorously controlled field experiments. Using this dataset, we applied convolutional neural networks to wheat image analysis and estimated spike numbers from the generated density maps.

2.3.2. Faster-RCNN Network Structure

This study employs an improved Faster-RCNN object detection algorithm for winter wheat spike identification [21]. The algorithm consists of three core components: (1) a ResNet50-based feature extraction network (replacing the original VGG16 architecture [36,37]) that maintains a high recognition accuracy of more than 80%, while significantly reducing computational complexity; (2) a region proposal network (RPN) that generates candidate regions through a sliding window mechanism (using scalable n × n windows with 128 × 128 as the optimal size); and (3) a classification–regression module for precise localization and categorization of candidate regions. The algorithm first applies a selective search for image region segmentation, calculates region similarity based on grayscale and texture features, and iteratively merges the most similar regions to construct candidate region sets. Subsequently, a spatial downsampling layer is added after the convolutional layer to facilitate multi-scale feature extraction for targets of varying sizes. Experimental results demonstrate that this improved model outperforms the YOLO series models (YOLOv3 and YOLOv5) in conventional-density spike detection, but shows approximately 15% recognition error in high-density occlusion scenarios (with inter-spike overlap >40%). To address this, the study established strict data screening criteria to exclude heavily occluded image samples, ensuring training data quality. This improved Faster-RCNN-based detection method provides reliable technical support for winter wheat yield prediction, and its end-to-end architecture design facilitates deployment on lightweight hardware platforms.

The CNN training performance is shown in Figure 3. The model gradually converges, and by about 20 iterations the training set and test set losses reach a minimum and stabilize. To explore more accurate ear detection methods, we compared the performance of YOLOv8 and the CNN model. The experiments were run on an Intel(R) Core(TM) i7-9700 CPU (Intel Corporation, Santa Clara, CA, USA) and an NVIDIA GeForce RTX 2070 Super GPU (NVIDIA Corporation, Santa Clara, CA, USA); the software environment was based on Python 3.8.19 and Torch 2.2.0. Both models were trained in batches: each batch randomly selects 4 samples to train the model, and the SGD optimizer is used for optimization. Each model was trained for 100 epochs.
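The batch schedule described above (4 randomly selected samples per batch, 100 epochs) can be sketched as follows. This is only an illustration of the bookkeeping, not the authors' training script: it assumes that random selection means shuffling the sample indices once per epoch and slicing them into consecutive batches, and that the 3400-sample training set from Section 2.2 is used. The function name `iterate_minibatches` is hypothetical, and no model is trained here.

```python
import random


def iterate_minibatches(n_samples, batch_size, rng):
    """Yield lists of sample indices, one list per mini-batch.

    Assumption: indices are shuffled once per epoch and then cut into
    consecutive slices of `batch_size` (the last slice may be shorter).
    """
    indices = list(range(n_samples))
    rng.shuffle(indices)
    for start in range(0, n_samples, batch_size):
        yield indices[start:start + batch_size]


N_TRAIN = 3400    # training samples reported in Section 2.2
BATCH_SIZE = 4    # samples per batch, as stated in the text
EPOCHS = 100      # training epochs, as stated in the text

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
batches = list(iterate_minibatches(N_TRAIN, BATCH_SIZE, rng))

steps_per_epoch = len(batches)
total_steps = steps_per_epoch * EPOCHS
print(steps_per_epoch, total_steps)
```

Under these assumptions one epoch consists of 850 optimizer steps, so 100 epochs correspond to 85,000 SGD updates per model.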

2.3.3. Improved CNN Based on ResNet50 Feature Extraction Network

This study employs an improved ResNet-50 as the backbone network architecture, which significantly enhances feature extraction performance through its unique residual learning mechanism [38]. As a deep convolutional neural network, ResNet-50’s 50-layer architecture achieves efficient feature extraction via the following key technologies: (1) residual block design, where each module contains 2–3 layers of 3 × 3 convolutions to strengthen local feature extraction capability; (2) cross-layer skip connections that directly transmit input signals to the output end, effectively addressing gradient vanishing/explosion issues in deep network training [39,40]; and (3) multi-scale pooling strategy combining max pooling (for dimensionality reduction) and average pooling (for spatial dimension compression) to optimize feature representation.

Compared to traditional networks, ResNet-50 significantly improves wheat spike feature extraction capability through its residual learning mechanism [41], while maintaining relatively few parameters. This approach not only prevents network degradation but also enables deeper feature learning, successfully overcoming limitations of the original network such as slow inference speed and insufficient recognition accuracy.
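The skip connection at the heart of residual learning can be illustrated with a minimal sketch in plain Python. It is not the ResNet-50 implementation used in this study: feature maps are reduced to flat lists of numbers, and the branch functions `zero_branch` and `scale_branch` are hypothetical stand-ins for the convolutional branch of a residual block.

```python
def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def residual_block(x, transform):
    """Return ReLU(transform(x) + x): the branch output plus the skip path."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the feature length")
    return relu([a + b for a, b in zip(fx, x)])


def zero_branch(x):
    """Hypothetical branch that has learned nothing (outputs zeros)."""
    return [0.0 for _ in x]


def scale_branch(x):
    """Hypothetical branch that adds half of the input signal."""
    return [0.5 * v for v in x]


features = [1.0, 2.0, 3.0]
identity_out = residual_block(features, zero_branch)
scaled_out = residual_block(features, scale_branch)
print(identity_out, scaled_out)
```

When the branch contributes nothing, the block passes its (non-negative) input through unchanged, which is why stacking such blocks does not degrade the signal in deeper networks.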

As shown in Figure 4, the ResNet50 model architecture consists of multiple convolutional layers, batch normalization layers, activation functions, pooling layers, and fully connected layers. The network can be formally represented as an L-layer structure, where the output of each layer is denoted as X_l (l = 1, 2, …, L). The feature transformation process at each layer can be described by Equation (1), with specific layer transformation operations defined by Equation (2). During spatial feature extraction, particular attention should be paid to the feature map F generated by the final output layer L (as shown in Equation (3)), which contains the key spatial feature information extracted through deep convolutional processing of the input image. This hierarchical structure design enables ResNet50 to effectively learn multi-level feature representations from image data.

X_{l} = f_{l} (X_{l - 1}; W_{l})

(1)

\begin{cases} X_{1} = f_{1} (X_{0}; W_{1}) = \mathrm{ReLU} (\mathrm{BatchNorm} (\mathrm{Conv} (X_{0}; W_{1}))) \\ X_{2} = f_{2} (X_{1}; W_{2}) = \mathrm{ReLU} (\mathrm{BatchNorm} (\mathrm{Conv} (X_{1}; W_{2}))) \\ \vdots \\ X_{L} = f_{L} (X_{L - 1}; W_{L}) = \mathrm{ReLU} (\mathrm{BatchNorm} (\mathrm{Conv} (X_{L - 1}; W_{L}))) \end{cases}

(2)

where f_l () is the transformation function of the lth layer, including the convolution layer, nonlinear activation function, and pooling layer; W_l is the weight parameter of the lth layer, and X₀ = I_rgb is the input RGB image; ReLU () is the nonlinear activation function, which is used to introduce nonlinear features; and BatchNorm () is the batch normalization function, which is used to accelerate training and improve stability.

F = X_{L} = f_{L} (f_{L - 1} (\dots f_{1} (X_{0}; W_{1}) \dots; W_{L - 1}); W_{L})

(3)
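The layer-by-layer composition in Equations (1)–(3) can be sketched in a few lines of plain Python. This is a toy, one-dimensional illustration under stated assumptions rather than the actual network: the input is a short list instead of an RGB image, each weight W_l is a single odd-length kernel, batch normalization is applied over the one feature vector, and the pooling step is omitted. All function names and numeric values are hypothetical.

```python
import math


def conv1d(x, kernel):
    """Same-size 1-D convolution (cross-correlation) with zero padding.

    Assumes an odd-length kernel so the output has the same length as x.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def batch_norm(x, eps=1e-5):
    """Normalize to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]


def layer(x_prev, w):
    """One layer: X_l = ReLU(BatchNorm(Conv(X_{l-1}; W_l)))."""
    return relu(batch_norm(conv1d(x_prev, w)))


def forward(x0, weights):
    """Apply the layers in order and return the final feature map F = X_L."""
    x = x0
    for w in weights:
        x = layer(x, w)
    return x


x0 = [0.2, 0.8, 0.5, 0.1, 0.9]                    # toy input signal
weights = [[0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]]   # two toy kernels
feature_map = forward(x0, weights)
print(feature_map)
```

The loop in `forward` is the nested expression of Equation (3) unrolled: each pass feeds the previous layer's output into the next transformation.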

2.3.4. Non-Maximum Suppression Algorithm (NMS)

The non-maximum suppression (NMS) algorithm is employed to filter candidate boxes, eliminating highly overlapping redundant boxes while retaining only the most representative candidates. The region proposal network (RPN) supplies the regions of interest (ROIs) passed to the detection component of Faster R-CNN and shares feature maps with that detection module, thereby facilitating more efficient ROI generation. A pooling operation is then performed on the selected ROI candidate regions: each mapped ROI area is divided into blocks of the same size, and max-pooling is applied within each block, so that fixed-size features are extracted from the feature maps with accurate spatial alignment regardless of the bounding box dimensions. This approach enables accurate feature extraction, even for candidate boxes at different scales. The end-to-end training jointly optimizes both the RPN and object detection tasks, allowing the network to detect targets efficiently and accurately in multi-scale, complex background scenarios. The network architecture is illustrated in Figure 5.
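The filtering step can be made concrete with a minimal sketch of greedy NMS in plain Python. It is a generic illustration, not the authors' code: boxes are assumed to be (x1, y1, x2, y2) tuples, the IoU threshold of 0.5 is an assumed value that the text does not specify, and the example boxes and scores are invented.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping a kept one.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one spike plus one separate spike.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of about 0.82 and is suppressed, so each spike is counted once.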

2.4. Main Evaluation Indicators of Winter Wheat Yield Prediction Model

In this experiment, the coefficient of determination (R²) and root mean square error (RMSE) were adopted to evaluate and analyze the model performance, as these two metrics represent the most effective accuracy assessment methods for the model. R² directly reflects the goodness-of-fit between the predicted and actual values, with its value ranging between 0 and 1. When R² approaches 1, it indicates better model fitting and superior prediction results. Meanwhile, RMSE is directly used to measure the average deviation between predicted and actual values—smaller RMSE values signify lower overall prediction errors and correspondingly higher prediction accuracy. Therefore, when evaluating yield estimation models, a high-precision model typically demonstrates an R² value close to 1 and the smallest possible RMSE value. The joint application of these two metrics for model evaluation not only demonstrates the model’s excellent fitting capability and high-precision prediction performance, but also enables effective identification and selection of the optimal yield prediction model for specific scenarios through comparative analysis of these values across different models. The calculation formulas are as follows:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(\hat{y}_{i} - y_{i})}^{2}}{\sum_{i = 1}^{n} {(\bar{y} - y_{i})}^{2}}

(4)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} {(\hat{y}_{i} - y_{i})}^{2}}{n}}

(5)

In the formulas, y_i is the measured (true) value, ŷ_i is the predicted value, ȳ is the average of the measured values, and n is the sample size.
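As a worked illustration of Equations (4) and (5), the two metrics can be computed with a short plain-Python sketch. It follows the usual convention in which the residuals are squared and the mean in the denominator of R² is taken over the measured values; the function names and the four data points are hypothetical and are not results from this study.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((mean_measured - m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between predicted and measured values."""
    squared = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


# Illustrative numbers only.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

r2_value = r_squared(measured, predicted)
rmse_value = rmse(measured, predicted)
print(r2_value, rmse_value)
```

For these toy values every prediction is off by 0.5, giving an RMSE of 0.5 and an R² of 0.95.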

3. Results

3.1. CNN and YOLO’s Performance Comparison

As shown in Table 3 and Figure 6, the CNN exhibits significantly higher GFLOPs than YOLOv8. For a network model, GFLOPs (giga floating-point operations) denotes the number of billions of floating-point operations required for one forward pass, i.e., the computational cost of the model. It should not be confused with GFLOPS (giga floating-point operations per second), which measures the throughput of hardware such as CPUs and GPUs. A higher GFLOPs value therefore indicates a more computationally demanding model, which affects training and inference times. In this study, the 143% higher GFLOPs of the CNN model relative to YOLOv8 was accompanied by a 2.1% higher detection accuracy in the wheat ear recognition task.

The mean average precision (mAP) of the CNN is higher than that of the YOLO network, indicating better detection performance and greater suitability for wheat ear detection. In terms of recognition accuracy, the CNN reaches up to 92.1%, 3.0% higher than the YOLO network. These results show that the two-stage structure of the CNN model improves detection accuracy by generating candidate regions and then performing fine classification and regression on them, so that the generated wheat density map and the identified number of wheat ears are more accurate and provide better data for subsequent yield estimation.

3.2. Influence of Different Water and Nitrogen Treatments on the Yield of Winter Wheat

The ten experimental groups combined two irrigation methods (full irrigation and deficit irrigation) with five fertilization methods (organic fertilizer alone, organic and inorganic fertilizer at a 7:3 ratio, organic and inorganic fertilizer at a 3:7 ratio, chemical fertilizer alone, and no fertilizer). In the same growth environment, we measured the physical and chemical properties of each soil layer in the experimental field. After a year of experiments, we obtained the number of grains per panicle, the number of effective panicles, the 1000-grain weight, the theoretical yield, and the actual yield of each experimental group through manual counting and weighing, as shown in Table 4.

The ten treatments were divided into full irrigation and deficit irrigation groups, and the measured wheat data were analyzed under the two-factor (irrigation × fertilization) design to assess the effects of the different treatment combinations on winter wheat yield. The results demonstrate that (1) under identical irrigation regimes, scientific fertilization treatments (LM2-LM4 and LC2-LC4) achieved significantly higher actual yields than other groups; (2) the non-fertilization control groups (LC5 and LM5) showed the poorest yield performance, confirming the yield-increasing effect of proper fertilization; and (3) under deficit irrigation conditions, the average effective spike number and actual yield were 12.05% and 10.99% higher, respectively, than under full irrigation; notably, the 7:3 organic–inorganic fertilizer ratio under deficit irrigation (LM3) yielded 4.78% more than its full irrigation counterpart (LC3). These findings not only confirm the decisive role of fertilization in winter wheat production but also show the yield advantages of optimized organic–inorganic fertilizer combinations, providing theoretical support for field water–fertilizer management practices.

To sum up, when climatic conditions such as illumination and precipitation are consistent, a water–fertilizer coupling planting mode that combines deficit irrigation with a suitable proportion of organic and inorganic fertilizer can significantly improve the yield of winter wheat and support water-saving agriculture.

3.3. CNN Analysis of Production Estimation Accuracy

During the field experiment, a DJI M300 RTK unmanned aerial vehicle (UAV) equipped with a multispectral sensor was employed to systematically acquire high-resolution growth imagery of winter wheat across critical phenological stages, from heading to maturity (Figure 7).

The collected image data were processed with the CNN to obtain the number of wheat ears identified in each image, N′, and to output the corresponding wheat ear density map, as shown in Figure 8.
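The link between a density map and the ear count N′ can be illustrated with a minimal Python sketch. It assumes the common density-map convention in which every annotated ear centre contributes a Gaussian of unit mass, so that summing the map recovers the count. The function names, kernel radius, sigma, and example coordinates are illustrative assumptions and are not the configuration used in this study.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a (2*radius+1) x (2*radius+1) Gaussian kernel that sums to 1."""
    size = 2 * radius + 1
    kernel = [
        [
            math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, radius=3, sigma=1.5):
    """Build a density map from annotated ear centres given as (row, col)."""
    kernel = gaussian_kernel(radius, sigma)
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        # Collect the kernel cells that fall inside the image, then
        # renormalize so an ear near the border still contributes exactly 1.
        cells = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = r + dy, c + dx
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy + radius][dx + radius]))
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def count_from_density(dmap):
    """The estimated ear count is the integral (sum) of the density map."""
    return sum(sum(row) for row in dmap)


# Hypothetical annotations: four ear centres in a 20 x 30 pixel image.
ears = [(5, 5), (6, 7), (0, 0), (19, 29)]
dmap = density_map(ears, height=20, width=30)
estimated_count = count_from_density(dmap)
print(round(estimated_count, 6))
```

In a trained network the density map is predicted from the image rather than built from annotations, but the count is obtained in the same way, by summing the map.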

This study validated an improved CNN model for wheat ear counting and yield prediction across 10 differently managed winter wheat plots. The performance evaluation revealed differences among the treatment groups in the panicle number identified by the CNN (Figure 9a): unfertilized plots (LC5, LM5) showed the best recognition accuracy (error <10%); conventionally fertilized plots (LC1, LC2) maintained good performance (error <15%); and the other treatments exhibited relatively higher errors (15–25%).

The yield of each winter wheat group was then estimated from the number of panicles identified by the CNN, using the formula

\mathrm{Yield}_{\mathrm{iden}} = \frac{n \times N \times G \times 10}{1000} \qquad (6)

where Yield_iden is the identification (CNN-estimated) yield, kg ha⁻¹; n is the mean number of grains per spike; N is the number of effective panicles; and G is the thousand-seed weight, g.
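As a worked illustration, Equation (6) and the relative error against the measured yield can be sketched in Python. The sketch assumes that N is expressed per square metre and G in grams, which makes the factor 10 the conversion from g m⁻² to kg ha⁻¹; this unit reading is an inference from the formula rather than something stated in the text. The function names and input values are hypothetical and are not taken from Table 4.

```python
def identification_yield(n, N, G):
    """Equation (6): yield in kg/ha.

    n: mean grains per spike
    N: effective panicles (assumed here to be per square metre)
    G: thousand-seed weight in grams
    """
    # n * N       -> grains per m^2
    # * G / 1000  -> grams of grain per m^2
    # * 10        -> kg per hectare (1 g/m^2 = 10 kg/ha)
    return n * N * G * 10 / 1000


def relative_error(identified, actual):
    """Absolute relative error of the identified yield, in percent."""
    return 100.0 * abs(identified - actual) / actual


# Hypothetical inputs for illustration only.
y_iden = identification_yield(n=35.0, N=600.0, G=40.0)  # 8400.0 kg/ha
err = relative_error(y_iden, actual=8000.0)             # 5.0 %
print(y_iden, err)
```

Because n and G enter the formula as measured values, any relative error in the CNN-identified panicle number N carries over one-to-one into the identified yield.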

Using Formula (6), the identification yield of the ten experimental fields was calculated, and the error between the identification yield and the actual yield was quantified. The data obtained are shown in Figure 9b.

Analysis of Figure 9 demonstrates that the CNN-identified panicle number showed closer alignment with ground-truth measurements for unfertilized plots (LC5 and LM5), and yield predictions achieved >90% accuracy relative to actual yields. These unfertilized treatments were excluded from subsequent analysis due to their agronomic irrelevance in practical production scenarios. LC1 and LC2 exhibited superior yield prediction performance in fertilized plots, and the estimation errors remained below 15%.

As shown in Figure 10, this study compares the actual yields of winter wheat with CNN model-predicted yields under different water–fertilizer treatments during 2022−2023. The results demonstrate that the yield variation trends predicted by the CNN model show high consistency with measured values, particularly in the organic−inorganic fertilizer combination (3:7) and chemical fertilizer alone treatment groups, where both exhibited the highest yield levels and minimal yield fluctuations. Under identical fertilization conditions, the yield from deficit irrigation showed no significant difference compared to full irrigation (p > 0.05), while under optimized water-fertilizer coupling (deficit irrigation+3:7 fertilization), yields demonstrated a slight increasing trend of 4.8%.

4. Discussion

4.1. Improving Wheat Ear Recognition Accuracy Using Modified CNN

Addressing the critical challenge of wheat head occlusion in dense canopies, this study conducts a systematic evaluation of the advantages and limitations of existing solutions. Early studies employed traditional digital image processing methods [42,43,44], such as the kernel segmentation algorithm based on inertial equivalent ellipses proposed by Visen et al. (2001) [45], which used pit detection and nearest-neighbor criteria to segment the occluded kernels and characterized the grains as isolated kernels or a group of adhered kernels. While improving detection efficiency, these methods showed limited adaptability. Although the dual-side recognition scheme by Cao et al. (2017) reduced error rates, its generalization capability under complex field conditions remained insufficient [46]. Our study innovatively implemented the following optimization strategies: (1) an improved non-maximum suppression (NMS) algorithm with a 0.4 IoU threshold for more effective redundant detection frame filtering; (2) upgrading the baseline feature extraction network from VGG16 to ResNet50 to enhance feature representation while maintaining real-time performance; and (3) incorporating transfer learning strategies using ImageNet pre-trained parameters to improve model generalization.
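For readers unfamiliar with the filtering step in strategy (1), the following minimal Python sketch shows standard greedy non-maximum suppression with the 0.4 IoU threshold mentioned above. It illustrates the baseline technique only; the specific modifications that make up the improved NMS in this study are not reproduced here. The box format (x1, y1, x2, y2), function names, and example detections are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box
        # by more than the IoU threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same ear.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.4)
print(kept)  # [0, 2]
```

A lower threshold removes more overlapping boxes, which reduces double counting but can also suppress genuinely adjacent ears in dense canopies.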

These improvements complement the ResNet18 optimization scheme of Li et al. (2021) [47] and the VGG16 application research of Zhu et al. (2020) [48], which together have advanced wheat ear detection accuracy. Li et al. (2021) developed an improved ResNet18-based model for estimating winter wheat seedling growth parameters and a Faster R-CNN model with NMS for wheat ear counting [47]. Their models demonstrated significantly enhanced accuracy in detecting occluded wheat ears compared to conventional Faster R-CNN architectures, a finding that aligns with our experimental results. Complementing this work, Zhu et al. (2020) focused specifically on wheat grain detection using CNN networks [48], further validating the effectiveness of deep learning approaches in cereal crop phenotyping. Their results showed that the recognition accuracy of the VGG16 convolutional neural network was significantly better than that of traditional SVM and BP neural networks. This is consistent with our results, in which a deep convolutional feature extraction network (VGG16 as the baseline, later upgraded to ResNet50) improved recognition accuracy. Notably, the accuracy of winter wheat yield estimation is not only directly determined by the precision of model recognition algorithms but is also significantly influenced by multiple external factors. Key challenges include (1) diminished color contrast between wheat spikes and soil background during the maturity stage, which substantially increases image recognition difficulty, and (2) mutual occlusion of spikes caused by suboptimal camera angles; both factors contribute to deviations between predicted and actual yields. To ensure field applicability, the grain-filling stage (BBCH 75–85) was selected as the ideal observation period, when spike morphological features are most distinct and background contrast remains optimal. In addition, UAV oblique imaging at 45° reduces inter-spike occlusion, enabling model predictions to achieve lower relative error.
These field-validated practices provide reproducible methodological references for future research, particularly in standardizing aerial phenotyping data acquisition for cereal crops.

4.2. Effects of Water–Nitrogen Coupling on Winter Wheat Yield

This study examines the physiological and ecological mechanisms by which water–nitrogen coupling regulates winter wheat yield formation. The main findings align with existing knowledge while also extending it. Regarding water regulation, partial root-zone drying (PRD) can stimulate the production of root-source hormonal signals that promote crop fiber elongation, resulting in minimal yield reduction (p > 0.05) while reducing irrigation water use [49]. PRD is easily implemented with localized irrigation methods, and an application rate of 150 kg N ha⁻¹ has been suggested as effective for increasing winter wheat yield [50]. This finding corroborates the conclusions of Chu et al. (2016) [51] and Wang et al. (2024) [9], further confirming that moderate water deficit can activate crop drought resistance mechanisms. Recent studies have demonstrated the critical role of irrigation timing and methods in winter wheat productivity. Luan et al. (2025) established that heading-stage irrigation significantly enhances final biomass accumulation (p < 0.01), with water supply during key growth periods explaining 62–75% of yield variation [52]. Wang et al. (2023) demonstrated that while the T1 treatment (pre-sowing + jointing stage irrigation, 44.4% water saving) resulted in varying reductions in wheat yield components, the final grain yield remained above 90% of that of fully irrigated plots [53]. These findings align with our experimental results showing non-significant yield reduction (p > 0.05) under deficit irrigation regimes, confirming the viability of water-saving strategies in winter wheat production. Complementing these findings, Xing et al. (2025) conducted a factorial experiment evaluating five irrigation–fertilization regimes, revealing that wheat productivity was highest when drip irrigation was combined with organic fertilizer [54].

Building on this work, the present study employed a convolutional neural network (CNN)-based recognition and yield estimation approach, combined with field experimental validation, to quantitatively analyze the relationship between water–nitrogen coupling and winter wheat yield. The results provide evidence for understanding water–nitrogen coupling effects on winter wheat yield formation and advance both the theoretical knowledge and practical application of precision water–nitrogen management in cereal production systems. In terms of nitrogen management, irrigation combined with a 3:7 organic–inorganic fertilizer ratio demonstrated optimal yield-increasing effects, consistent with findings from Luan et al. (2025) [52] and Xing et al. (2025) [54]. Xing et al. (2025) also reported that both fertilizer application rates and ratios significantly influence wheat yield, a finding consistent with our experimental results [54]. Our study demonstrated that an optimal organic–inorganic fertilizer combination (3:7 ratio) could increase grain yield compared to conventional fertilization practices. These results support the view that balanced nutrient management is crucial for achieving both yield enhancement and resource efficiency in wheat production systems. By establishing a yield response surface model, this study determined the optimal combination to be deficit irrigation with a 3:7 organic–inorganic fertilizer ratio (180 kg N ha⁻¹ nitrogen application).

This study has several limitations. First, experiments were conducted only in the North China Plain, and the applicability of results to other ecological regions requires further verification. Second, the long-term effects of organic–inorganic fertilizer combinations (such as soil carbon sequestration function) have not been fully elucidated. Third, the model does not account for the impact of extreme climate events. Future research will focus on addressing the following issues: (1) conducting joint experiments across multiple ecological regions to establish regionally adaptable models; (2) combining microbiome technologies to deeply analyze the mechanisms of organic fertilizer in soil improvement; and (3) incorporating climate change scenarios to enhance model predictive capabilities. In summary, this study not only deepens the scientific understanding of water–nitrogen coupling effects but, more importantly, establishes a quantifiable and scalable precision management technology system, providing both theoretical basis and technical support for water-saving, fertilizer-reducing, high-yield winter wheat cultivation.

4.3. Model Recognition Capability and Adaptability

This study evaluates the adaptability of CNN models in winter wheat ear recognition and yield prediction, revealing critical challenges and future development directions for deep learning applications in agriculture. The current findings demonstrate that CNN-based yield estimation methods exhibit significant limitations in methodological uniformity and applicability [55,56], with pronounced performance variations across field conditions: while the models achieved relatively good predictive results in experimental fields LC1 and LC2, they displayed substantially reduced accuracy in fields LM2 and LM3. This phenomenon may be attributed to variations in field management practices, soil characteristics, and microenvironmental factors [57,58].

To enhance model adaptability and robustness, we propose continuous optimization of CNN networks through architecture adjustment, data volume expansion, and data augmentation [59,60]. Regarding network architecture, the CNN-GRU hybrid model proposed by Luo (2022) provides valuable insights by integrating temporal features [61]. This model demonstrates stable predictive performance during the mid-to-late growth stages of winter wheat, with low root mean square error and mean absolute error [61]. For data processing, we recommend focusing on image feature extraction during critical growth stages (jointing and flowering), following Jin et al. (2023a), which aligns with the optimal observation period selection strategy of the SWAP-IES assimilation system [62]. Additionally, based on the model comparison study of Jin (2023b) [63], we suggest exploring hybrid architectures combining YOLOv5 and Faster R-CNN to balance detection accuracy and efficiency. Notably, differentiated processing strategies may be required for different growth stages: prioritizing canopy structural features during the jointing stage, focusing on ear morphological characteristics during the heading stage, and emphasizing grain number statistics during the filling stage. These findings have important implications for deep learning applications in agriculture: (1) establishing more targeted network architecture selection criteria; (2) developing transfer learning strategies adaptable to different ecological regions; and (3) constructing enhanced learning frameworks integrating multi-source data. Future research should prioritize model generalization in cross-regional applications and real-time processing efficiency, while exploring coupling methods between crop growth mechanism models and deep learning to further improve prediction stability and interpretability.
These discoveries not only provide technical references for winter wheat yield prediction, but also offer methodological guidance for intelligent monitoring research on other crops. The research findings hold significant importance for promoting green and high-quality agricultural development.

5. Conclusions

This study proposes a winter wheat spike identification and yield prediction model based on an improved CNN architecture. The main innovations include (1) designing a network architecture that incorporates ResNet50 feature extraction modules and spatial attention mechanisms to focus more precisely on key spike feature regions, and (2) developing a density map-based spike counting method and establishing a standardized winter wheat spike identification dataset from experimental images, achieving 15% higher accuracy than traditional bounding box approaches. Experimental results demonstrate that, under identical training conditions, the model achieves spike identification accuracy ranging from 89.0% to 92.1%, representing a 3 percentage point improvement over YOLO series models, with particularly superior performance in dense spike scenarios. In terms of model application, the yield prediction error for the conventionally managed plots LC1 and LC2 remained below 15%, consistent with Section 3.3 and supporting the model's reliability. Under deficit irrigation conditions, the 3:7 organic–inorganic fertilizer ratio treatment produced the highest yield, indicating the yield-increasing effect of moderate water deficit and confirming that yield levels can be maintained with 40% water savings (450 versus 750 m³ ha⁻¹). It is noteworthy that in the LM5 experimental group (no fertilizer application), although the model prediction error was small (<10%), the actual yield was too low to be of practical value, reflecting the current model's limitations under extreme management conditions. These research outcomes provide a reliable intelligent monitoring tool for precision winter wheat cultivation, and the methodological framework can be extended to phenotypic analysis of other staple crops. Future work will focus on optimizing the model's generalization capability through multi-ecological zone collaborative trials.

Author Contributions

D.W. and Y.C. conceived the study, led the research, and wrote the paper. L.S. and H.Y. carried out data analysis and created the figures. D.W., Y.C., and G.Y. contributed to the writing and editing. H.Y. and S.L. also carried out data analysis and contributed to the writing and editing. Q.D. and J.G. contributed to the development of the study and to the writing and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Henan Province Key R&D and Promotion Special Project (Science and Technology Targeted) (252102110352, 232102321101), the Natural Science Foundation of Henan Province (242300420035), and the National Key Research and Development Program of China (2022YFD1900402).

Data Availability Statement

Data are contained within the article.

Acknowledgments

We appreciate the technical help from Yuetao Liao, Foshan University, and Shiren Li, Sun Yat-sen University. We thank the anonymous reviewers for their valuable reviews and comments on the manuscript.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Xie, J.Y.; Zhang, D.Y.; Jin, N.; Cheng, T.; Zhao, G.; Han, D.; Niu, Z.; Li, W.F. Coupling crop growth models and machine learning for scalable winter wheat yield estimation across major wheat regions in China. Agric. For. Meteorol. 2025, 372, 110687. [Google Scholar] [CrossRef]
Huang, J.X.; Ma, H.Y. Evaluation of regional estimates of winter wheat yield by assimilating three remotely sensed reflectance datasets into the coupled WOFOST-PROSAIL model. Eur. J. Agron. 2019, 102, 1–13. [Google Scholar] [CrossRef]
Zhang, L.; Pang, J.X.; Chen, X.P.; Lu, Z. Carbon emissions, energy consumption and economic growth: Evidence from the agricultural sector of China’s main grain-producing areas. Sci. Total Environ. 2019, 665, 1017–1025. [Google Scholar] [CrossRef] [PubMed]
Ou, J.J.; Ding, B.B.; Feng, P.Y.; Chen, Y.; Yu, L.L.; Liu, D.L.; Srinivasan, R.; Zhang, X.L. How to stop groundwater drawdown in North China Plain? Combining agricultural management strategies and climate change. J. Hydrol. 2025, 647, 132352. [Google Scholar] [CrossRef]
Hu, S.Y.; Qiao, B.W.; Yang, Y.H.; Rees, R.M.; Huang, W.H.; Zou, J.; Zhang, L.; Zheng, H.Y.; Liu, S.Y.; Shen, S.J.; et al. Optimizing nitrogen rates for synergistically achieving high yield and high nitrogen use efficiency with low environmental risks in wheat production—Evidences from a long-term experiment in the North China Plain. Eur. J. Agron. 2023, 142, 126681. [Google Scholar] [CrossRef]
Wang, D.L.; Liu, S.B.; Guo, M.J.; Cheng, Y.H.; Shi, L.F.; Li, J.P.; Yu, Y.J.; Wu, S.Y.; Dong, Q.G.; Ge, J.K.; et al. Optimizing Nitrogen Fertilization and Irrigation Practices for Enhanced Winter Wheat Productivity in the North China Plain: A Meta-Analysis. Plants 2025, 14, 1686. [Google Scholar] [CrossRef] [PubMed]
Cui, Z.K.; Yu, Z.W.; Shi, Y.; Zhang, Y.L.; Zhang, Z. Effects of water and nitrogen management on photosynthetic matter production and yield of wheat. Chin. J. Appl. Ecol. 2024, 35, 1564–1572. [Google Scholar] [CrossRef]
Zhu, Y.G.; Liu, J.; Li, J.Q.; Xian, L.S.; Chu, J.P.; Liu, H.; Song, J.; Sun, Y.H.; Dai, Z.M. Delayed sowing increased dry matter accumulation during stem elongation in winter wheat by improving photosynthetic yield and nitrogen accumulation. Eur. J. Agron. 2023, 151, 127004. [Google Scholar] [CrossRef]
Wang, Y.X.; Xu, Y.R.; Guo, Q.; Li, H.; Zhang, P.; Cai, T.; Jia, Z.K. Increasing winter wheat yield and nitrogen utilization, and reducing residual soil nitrogen in semi-humid areas: A study matching deep fertilizer application with regional water scenarios. Eur. J. Agron. 2024, 153, 127065. [Google Scholar] [CrossRef]
Xing, S.L.; Zhang, G.L. The Current Application Status and Prospect of Agricultural Remote Sensing in China. Trans. CSAE 2003, 19, 174–178. [Google Scholar] [CrossRef]
Weiss, M.; Jacob, F.; Duveiller, G. Remote sensing for agricultural applications: A meta-review. Remote Sens. Environ. 2020, 236, 111402. [Google Scholar] [CrossRef]
Zhao, L.C.; Li, F.L.; Chang, Q.R. A Review of Remote Sensing Identification and Yield Estimation of Crops. Trans. Chin. Soc. Agric. Mach. 2023, 54, 1–19. [Google Scholar] [CrossRef]
Wu, B.F.; Zhang, M.; Zeng, H.W. 20 years of Global Agricultural Situation Remote Sensing Rapid Reporting System. Natl. Remote Sens. Bull. 2019, 23, 1053–1063. [Google Scholar] [CrossRef]
Chen, Z.X.; Ren, J.Q.; Tang, H.J. Progress and Prospect of Agricultural Remote Sensing Research and Application. Natl. Remote Sens. Bull. 2016, 20, 748–767. [Google Scholar] [CrossRef]
Ma, Z.L.; Wen, F.; Zhou, Y.J. Regional Winter-wheat Yield Estimation Based on Coupling of Machine Learning Algorithm and Crop Growth Model. Trans. Chin. Soc. Agric. Mach. 2023, 54, 136–147. [Google Scholar] [CrossRef]
Zhu, Z.C.; Chen, L.J.; Zhang, J.S. Winter wheat yield model based on information diffusion and key remote sensing data. Trans. CSAE 2011, 27, 187–193. [Google Scholar] [CrossRef]
Ren, J.Q.; Chen, Z.X.; Tang, H.J. Regional crop yield simulation based on remote sensing information and crop growth models. Trans. CSAE 2011, 27, 257–264. [Google Scholar] [CrossRef]
Li, R.; Li, C.J.; Xv, X.G. Estimation of winter wheat yield based on Support Vector Regression (SVR) and multi-temporal remote sensing data. Trans. CSAE 2009, 25, 114–117. [Google Scholar] [CrossRef]
Wang, J.J.; Li, C.S.; Zhuo, Y. Estimation of winter wheat yield based on the optimal growth period of multi-temporal unmanned aerial vehicle remote sensing. Trans. Chin. Soc. Agric. Mach. 2022, 53, 197–206. [Google Scholar] [CrossRef]
Wang, P.X.; Tian, H.R.; Zhang, Y. Research Progress on Crop Growth Monitoring and Yield Estimation Based on Deep Learning. Trans. Chin. Soc. Agric. Mach. 2022, 53, 1–14. [Google Scholar] [CrossRef]
Ren, S.; He, K.; Girshick, R. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
Redmon, J.; Divvala, S.; Girshick, R. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
Xu, L.; Wang, Y.; Shi, X. Real-time and accurate detection of citrus in complex scenes based on HPL-YOLOv4. Comput. Electron. Agric. 2023, 205, 107590. [Google Scholar] [CrossRef]
Wang, D.Y.; Fu, Y.Y.; Yang, G.J. Combined Use of FCN and Harris Corner Detection for Counting Wheat Ears in Field Conditions. IEEE Access 2019, 7, 178930–178941. [Google Scholar] [CrossRef]
Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 39, 640–651. [Google Scholar] [CrossRef] [PubMed]
Zhang, L.X.; Chen, Y.Q.; Li, Y.X. Winter wheat ear Detection and Counting System based on convolutional Neural Network. Trans. Chin. Soc. Agric. Mach. 2019, 50, 144–150. [Google Scholar] [CrossRef]
Neubeck, A.; Van Gool, L. Efficient non-maximum suppression. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 22–24 August 2006; pp. 850–855. [Google Scholar] [CrossRef]
Patel, J.; Ruparelia, A.; Tanwar, S. Deep Learning-Based Model for Detection of Brinjal Weed in the Era of Precision Agriculture; Tech Science Press: Henderson, NV, USA, 2023. [Google Scholar] [CrossRef]
Chen, S.; Xv, W.F.; Wang, H.T. Wheat ear detection algorithm based on the improved YOLO v7. J. Jilin Univ. 2024, 62, 0886–09. [Google Scholar] [CrossRef]
Quan, Y.; Zhang, D.; Zhang, L.Y. Centralized Feature Pyramid for Object Detection. IEEE Trans. Image Process. 2023, 32, 4341–4354. [Google Scholar] [CrossRef] [PubMed]
Liu, S.; Qi, L.; Qin, H.F. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 8759–8768. [Google Scholar] [CrossRef]
Wang, X.; Lu, H.; Li, J. Influence of Time-Series Length and Hyperparameters on Temporal Convolutional Neural Network Training in Low-Power Battery SOC Estimation. Appl. Sci. 2023, 13, 134–147. [Google Scholar] [CrossRef]
Milioto, A.; Lottes, P.; Stachniss, C. Real-time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018. [Google Scholar] [CrossRef]
Zhou, L.; Mu, H.W.; Ma, H.J. Dry yield estimation of winter wheat in northern China based on convolutional neural networks. Trans. CSAE 2019, 35, 119–128. [Google Scholar]
Liu, L.B.; Wang, T.; Zhang, P. Hyperspectral image yield estimation method of Ningxia Wolfberry based on CNN-S-GPR. Trans. Chin. Soc. Agric. Mach. 2022, 53, 1000–1298. [Google Scholar] [CrossRef]
Theckedath, D.; Sedamkar, R.R. Detecting Affect States Using VGG16, ResNet50 and SE-ResNet50 Networks. SN Comput. Sci. 2020, 1, 79. [Google Scholar] [CrossRef]
Qassim, H.; Verma, A.; Feinzimer, D. Compressed residual-VGG16 CNN model for big data places image recognition. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 169–175. [Google Scholar] [CrossRef]
Zhang, Y.L.; Lu, H. Zero-watermarking algorithm for vector maps based on the ResNet50 model. Geogr. Geo-Inf. Sci. 2024, 40, 1672-0504. [Google Scholar] [CrossRef]
Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166. [Google Scholar] [CrossRef] [PubMed]
Gong, R.; Liu, X.; Jiang, S. Differentiable soft quantization: Bridging full-precision and low-bit neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4852–4861. [Google Scholar] [CrossRef]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
Lin, Y.H.; Lv, Z.L.; Yang, C.C. Identification and Experimentation of Overlapping Pomelos in Natural Scene Images. Trans. CSAE 2021, 37, 158–167. [Google Scholar] [CrossRef]
Mebatsion, H.K.; Paliwal, J. A Fourier analysis based algorithm to separate touching kernels in digital images. Biosyst. Eng. 2011, 108, 66–74. [Google Scholar] [CrossRef]
Xun, Y.; Bao, G.J.; Yang, Q.H. Automatic segmentation method for images of adhered corn kernels. Trans. Chin. Soc. Agric. Mach. 2010, 41, 163–167. [Google Scholar] [CrossRef]
Visen, N.S.; Shashidhar, N.S.; Paliwal, J. AE—Automation and Emerging Technologies: Identification and Segmentation of Occluding Groups of Grain Kernels in a Grain Sample Image. J. Agric. Eng. Res. 2001, 79, 159–166. [Google Scholar] [CrossRef]
Cao, T.C.; He, X.H.; Dong, D.L. Wheat imperfect grain recognition based on CNN deep model. Mod. Comput. 2017, 36, 9–14. [Google Scholar] [CrossRef]
Li, Y.X.; Ma, J.C.; Liu, H.J. Field growth parameter estimation system of winter wheat using RGB digital images and deep learning. Trans. CSAE 2021, 37, 189–198. [Google Scholar] [CrossRef]
Zhu, S.P.; Zhuo, J.X.; Huang, H. Wheat Grain Integrity Image Detection System Based on CNN. Trans. Chin. Soc. Agric. Mach. 2020, 51, 36–42. [Google Scholar] [CrossRef]
Liu, L.; Yang, Z.H.; Zhao, W.Q.; Ding, C.H.; Hu, W.; Du, K.; Wang, S.S.; Zhou, Z.G. Partial root-zone drying enhances cotton fiber elongation by boosting the production of root-source jasmonates to counter drought stress. Ind. Crops Prod. 2024, 222, 120088. [Google Scholar] [CrossRef]
Mehrabi, F.; Sepaskhah, A.R. Partial root zone drying irrigation, planting methods and nitrogen fertilization influence on physiologic and agronomic parameters of winter wheat. Agric. Water Manag. 2019, 223, 105688. [Google Scholar] [CrossRef]
Chu, G.H.; Yang, L.X. Research on the Effects of Insufficient Irrigation on Winter Wheat Yield and Water Use Efficiency. Water Sav. Irrig. 2016, 2016, 1007–4929. [Google Scholar] [CrossRef]
Luan, Q.H.; Bi, H.K.; Zhang, C.H. Research on Optimization of Irrigation Nodes for Winter Wheat in Irrigation Areas and Simulation of Yield Effects[J/OL]. South-to-North Water Transfers and Water Science & Technology. Available online: https://link.cnki.net/urlid/13.1430.TV.20250124.1628.006 (accessed on 28 May 2025).
Wang, Z.Y.; Li, X.Q.; Zhang, J.Q. Effects of water-saving irrigation on photosynthetic and cell protection system parameters and yield traits of winter wheat/summer maize double cropping crops. Highlights Sci. Online 2023, 9, 343–354. [Google Scholar] [CrossRef]
Xing, S.L.; Wang, J.X.; Yang, J.F. Effects of fertilizer management on wheat yield, carbon and nitrogen footprint under drip irrigation. Trans. CSAE 2025, 41, 103–111. [Google Scholar] [CrossRef]
Bao, W.X.; Yang, X.H.; Liang, D. Lightweight convolutional neural network model for field wheat ear disease identification. Comput. Electron. Agric. 2021, 189, 106367. [Google Scholar] [CrossRef]
Li, L.; Hassan, M.A.; Yang, S.R.; Jing, F.R. Development of image-based wheat spike counter through a Faster R-CNN algorithm and application for genetic studies. Crop J. 2022, 10, 1303–1311. [Google Scholar] [CrossRef]
Mahalakshmi, S.; Anand, A.J.; Partheeban, P. Soil and crop interaction analysis for yield prediction with satellite imagery and deep learning techniques for the coastal regions. J. Environ. Manag. 2025, 380, 125095. [Google Scholar] [CrossRef] [PubMed]
Zhu, G.C.; Zhao, C.X.; Zhou, L.L.; Li, Z.H.; Zhu, H.C. Winter wheat yield prediction at a county scale using time series variation features of remote sensing spectra and machine learning. Eur. J. Agron. 2025, 170, 127751. [Google Scholar] [CrossRef]
Zaji, A.; Liu, Z.; Xiao, G.Z. Wheat spike localization and counting via hybrid UNet architectures. Comput. Electron. Agric. 2022, 203, 107439. [Google Scholar] [CrossRef]
Fan, M.Y.; Ma, Q.; Liu, J.M. Wheat ear counting method in field environment based on machine vision. Trans. Chin. Soc. Agric. Mach. 2015, 46 (Suppl. S1), 234–239. [Google Scholar] [CrossRef]
Luo, H.T. Research on Winter Wheat Yield Prediction Method Based on Hybrid Neural Network. Master’s Thesis, Zhengzhou University, Zhengzhou, China, 2022. [Google Scholar]
Jin, J.X.; Ding, Y.M.; Sun, Z.Y. Numerical simulation of spring wheat growth and yield in arid areas based on SWAP-IES. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2023, 39, 66–76. [Google Scholar] [CrossRef]
Jin, H. Research on Recognition and Counting of Mature Wheat Ears Based on Deep Learning. Doctoral Dissertation, Shandong Agricultural University, Tai’an, China, 2023. [Google Scholar]

Figure 1. Distribution of winter wheat planting area and location of experimental stations.

Figure 2. Preparation of images for spike detection. (a) Experimental image labeled with red lines; (b) Training image marked with LabelMe 3.16.7.

Figure 3. CNN network training effect.

Figure 4. ResNet50 model embedded in the convolutional neural network.

Figure 5. Network structure design diagram.

Figure 6. The loss values of YOLO and CNN.

Figure 7. Image samples of wheat spikes at different growth stages recorded by a UAV.

Figure 8. Number of wheat ears identified by the CNN and the corresponding density maps under different treatments (the darker the color, the greater the density).

Figure 9. Comparison of the number of wheat ears and yield identified by the CNN model under different water and fertilizer treatments with the actual experimental results. (a) Histogram of CNN-identified versus actual panicle number. (b) Histogram of CNN-identified versus actual yield.

Figure 10. Comparison of the number of wheat ears and yield identified by the CNN model under different water and fertilizer treatments with the actual experimental results.

Table 1. Physico-chemical properties of soil layers in the test site.

Soil Depth (cm)	Bulk Density (g/cm³)	Field Water Capacity (cm³/cm³)	Nitrate Nitrogen (mg/cm³)	Ammonium Nitrogen (mg/cm³)	Organic Matter (g/kg)	Total Nitrogen (g/kg)
0–20	1.35	32	0.0368	0.0104	9.16	0.5665
20–40	1.56	34	0.0204	0.0033	6.67	0.3635
40–60	1.41	34	0.0132	0.0018	2.79	0.1945

Table 2. Design of the two-factor field trial conducted during the 2024–2025 growing season, combining two irrigation levels with five organic–inorganic fertilizer blend ratios.

Fertilizer Blend Ratios	Single Organic Fertilizer (1)	Organic Fertilizer: Inorganic Fertilizer 7:3 (2)	Organic Fertilizer: Inorganic Fertilizer 3:7 (3)	Full Chemical Fertilizer (4)	No Fertilizer (5)
Sufficient irrigation (C)	LC1	LC2	LC3	LC4	LC5
Deficit irrigation (M)	LM1	LM2	LM3	LM4	LM5

Table 3. Object detection performance of the YOLOv8 and CNN networks (GPU: RTX 2070 SUPER).

Model	Training Dataset	Precision (mAP)	Batch Size	Inference GFLOPs	Training Time (min)	Epochs	Training GFLOPs	Loss
YOLOv8	wheat-detection	89.1 ± 0.015 ab	4	0.7	53	100	0.9	0.629
CNN	wheat-detection	92.1 ± 0.012 a	4	1.7	714	100	6.5	0.630

Note: Significant differences between YOLOv8 and CNN are indicated by lowercase letters at p < 0.05 (LSD).

Table 4. Experimental data of winter wheat under different water and nitrogen treatments.

Treatment	Grains Per Spike n (grain)	Effective Panicles N (pieces)	Thousand Seed Weight G (g)	Actual Yield M (kg/ha)	Number of Samples
LC1	35 ± 2 a	580 ± 29 a	42.33 ± 2.11 a	7976.23 ± 398.81 a	18
LC2	26 ± 1 ab	547 ± 27 ab	43.10 ± 2.16 a	8185.54 ± 409.28 ab	18
LC3	26 ± 1 ab	660 ± 33 a	45.37 ± 2.27 a	9363.38 ± 468.17 a	18
LC4	31 ± 2 a	650 ± 33 a	41.13 ± 2.06 ab	9057.62 ± 452.88 a	18
LC5	20 ± 1 c	378 ± 19 c	42.23 ± 2.11 a	4767.23 ± 238.36 b	18
LM1	29 ± 2 a	566 ± 28 bc	42.53 ± 2.13 a	8357.53 ± 417.88 ab	18
LM2	28 ± 1 ab	724 ± 36 ab	44.00 ± 2.20 ab	10,146.77 ± 507.34 ab	18
LM3	31 ± 2 a	760 ± 38 a	42.07 ± 2.10 a	9811.28 ± 490.56 a	18
LM4	28 ± 1 ab	739 ± 37 ab	42.20 ± 2.11 a	10,485.51 ± 524.28 a	18
LM5	21 ± 1 c	407 ± 20 c	41.13 ± 2.06 a	4077.69 ± 203.88 b	18

Note: Significant differences among the treatments are indicated by lowercase letters at p < 0.05 (LSD).
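
As a rough cross-check on the yield components in Table 4, the standard yield-component identity (spikes per unit area × grains per spike × thousand-grain weight) can be evaluated for each treatment. The sketch below is illustrative only and is not the estimation procedure used in this study. It assumes that the effective panicle count N is expressed per square metre, which Table 4 does not state, and the function name `component_yield_kg_per_ha` is hypothetical. Because the reported values are measured yields, the component-based estimate is not expected to reproduce them exactly.

```python
def component_yield_kg_per_ha(spikes_per_m2, grains_per_spike, tgw_g):
    """Theoretical yield from yield components.

    ASSUMPTION: spikes_per_m2 is a count per square metre
    (Table 4 gives N in "pieces" without an area unit).
    """
    # grains per m² times mass per grain (thousand-grain weight / 1000) gives g/m²
    grams_per_m2 = spikes_per_m2 * grains_per_spike * tgw_g / 1000.0
    # 1 g/m² equals 10 kg/ha
    return grams_per_m2 * 10.0


# Treatment means copied from Table 4: (n, N, G, reported yield M in kg/ha)
table4 = {
    "LC1": (35, 580, 42.33, 7976.23),
    "LC2": (26, 547, 43.10, 8185.54),
    "LC3": (26, 660, 45.37, 9363.38),
    "LC4": (31, 650, 41.13, 9057.62),
    "LC5": (20, 378, 42.23, 4767.23),
    "LM1": (29, 566, 42.53, 8357.53),
    "LM2": (28, 724, 44.00, 10146.77),
    "LM3": (31, 760, 42.07, 9811.28),
    "LM4": (28, 739, 42.20, 10485.51),
    "LM5": (21, 407, 41.13, 4077.69),
}

for name, (n, N, G, M) in table4.items():
    estimate = component_yield_kg_per_ha(N, n, G)
    print(f"{name}: component estimate {estimate:9.1f} kg/ha, "
          f"reported {M:9.1f} kg/ha, ratio {M / estimate:.2f}")
```

The ratio column gives a quick indication of how far the measured yield departs from the component product under the stated per-square-metre assumption; a different sampling area for N would rescale every estimate by the same factor.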


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
