Remote Sensing and Data-Driven Optimization of Water and Fertilizer Use: A Case Study of Maize Yield Estimation and Sustainable Agriculture in the Hexi Corridor
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Data Sources and Preprocessing
2.2.1. Remote Sensing Data Processing Workflow
2.2.2. Feedback Mechanism Diagram
2.2.3. Classification Mask Integration
2.2.4. Statistical Data Processing
2.2.5. Feature Engineering
2.2.6. Final Dataset Construction
3. Construction and Validation of a County-Level Maize Yield Estimation Model
3.1. Modeling Approach and Iterative Process
3.1.1. Phase One: Benchmark Model Construction with Ordinary Least Squares (OLS)
3.1.2. Phase Two: Advanced Model Exploration with Random Forest (RF)
3.1.3. Phase Three: Final Model Selection and Optimization with Weighted Least Squares (WLS)
3.1.4. Model Validation Strategy
- ① Goodness-of-fit evaluation: The coefficient of determination (R²), adjusted R², root mean square error (RMSE), and mean absolute percentage error (MAPE) were used to quantify the model’s explanatory power for yield variance and the magnitude of its prediction errors.
- ② Cross-validation: This study employed 5-fold cross-validation. The dataset, comprising 1000 samples with five key features (NDVI, EVI, soil moisture, temperature, and precipitation), was randomly divided into five subsets of approximately 200 samples each. In each iteration, four subsets were used for training and the remaining subset was reserved for testing. The stability and generalization ability of the model were evaluated from the mean and standard deviation of the five test results, which reduces the risk that overfitting goes undetected.
- ③ Temporal hold-out validation: Data from 2019 to 2022 served as the training set, and data from 2023 served as the independent test set. This approach directly evaluates the model’s predictive capability for a future year and thereby assesses its practical utility.
- ④ Statistical diagnostic tests: Comprehensive diagnostics were conducted on the residuals of the final model to check that the fundamental assumptions of linear regression are satisfied.
- ⑤ Linearity: examined through residual plots.
- ⑥ Independence: the Durbin–Watson test was used to determine whether the residuals exhibit autocorrelation.
- ⑦ Normality: the Jarque–Bera test was used to determine whether the residuals follow a normal distribution.
- ⑧ Homoscedasticity: the Breusch–Pagan test was used to determine whether the variance of the residuals remains constant. Ultimately, the weighted least squares (WLS) model passed all diagnostic tests, indicating that the model is statistically robust, unbiased, and reliable.
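The cross-validation and temporal hold-out procedures described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the data are synthetic placeholders, the single-feature linear model stands in for the actual yield model, and all function names, the random seeds, and the record layout (`year`, `ndvi`, `yield`) are assumptions made for illustration.

```python
import math
import random
import statistics

def fit_line(x, y):
    """Closed-form simple linear regression y = a + b * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def metrics(y_true, y_pred, n_features=1):
    """Return R2, adjusted R2, RMSE and MAPE (%) for one test set."""
    n = len(y_true)
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    rmse = math.sqrt(ss_res / n)
    mape = 100 * statistics.fmean(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return {"r2": r2, "adj_r2": adj_r2, "rmse": rmse, "mape": mape}

def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(x, y, k=5, seed=0):
    """Train on k-1 folds, test on the remaining fold, once per fold."""
    scores = []
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        pred = [a + b * x[i] for i in test_idx]
        scores.append(metrics([y[i] for i in test_idx], pred))
    return scores

def temporal_holdout(records, test_year=2023):
    """Train on all years before test_year and test on test_year only."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    a, b = fit_line([r["ndvi"] for r in train], [r["yield"] for r in train])
    pred = [a + b * r["ndvi"] for r in test]
    return metrics([r["yield"] for r in test], pred)

# Synthetic placeholder data (NOT the study data): 1000 samples, 2019-2023.
rng = random.Random(42)
records = []
for i in range(1000):
    ndvi = rng.uniform(0.4, 0.9)
    records.append({"year": 2019 + i % 5, "ndvi": ndvi,
                    "yield": 2000 + 7000 * ndvi + rng.gauss(0, 300)})

x = [r["ndvi"] for r in records]
y = [r["yield"] for r in records]
cv_scores = cross_validate(x, y)
rmse_values = [s["rmse"] for s in cv_scores]
holdout = temporal_holdout(records)
print("CV RMSE mean:", statistics.fmean(rmse_values),
      "std:", statistics.stdev(rmse_values))
print("2023 hold-out:", holdout)
```

Reporting the mean together with the standard deviation of the fold scores, as the text describes, shows how stable the model is across splits. The year-based split checks the same model against data it could not have seen at training time.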
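The residual diagnostics listed above can likewise be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' procedure: the residuals are synthetic, the Breusch–Pagan function handles a single regressor only and uses the studentized (Koenker) form LM = n·R², and all function and variable names are hypothetical.

```python
import math
import random

def durbin_watson(resid):
    """DW statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e ** 2 for e in resid)

def jarque_bera(resid):
    """JB statistic and p-value (chi-square with 2 df, so p = exp(-JB / 2))."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)

def breusch_pagan(resid, x):
    """Studentized BP test for one regressor: LM = n * R2 of resid^2 on x."""
    n = len(resid)
    e2 = [e ** 2 for e in resid]
    mx, me = sum(x) / n, sum(e2) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (ei - me) for xi, ei in zip(x, e2))
    see = sum((ei - me) ** 2 for ei in e2)
    lm = n * sxy ** 2 / (sxx * see)
    # Survival function of chi-square with 1 df: erfc(sqrt(LM / 2)).
    return lm, math.erfc(math.sqrt(lm / 2))

# Synthetic residuals (NOT the study residuals).
rng = random.Random(1)
x = [rng.uniform(0.4, 0.9) for _ in range(500)]
resid_ok = [rng.gauss(0, 1) for _ in x]                        # constant variance
resid_het = [rng.gauss(0, 1) * 10 * (xi - 0.35) for xi in x]   # variance grows with x

dw = durbin_watson(resid_ok)
jb_stat, jb_p = jarque_bera(resid_ok)
bp_ok = breusch_pagan(resid_ok, x)
bp_het = breusch_pagan(resid_het, x)
print("Durbin-Watson:", dw)
print("Jarque-Bera:", jb_stat, jb_p)
print("Breusch-Pagan (homoscedastic):", bp_ok)
print("Breusch-Pagan (heteroscedastic):", bp_het)
```

A small Breusch–Pagan p-value, as produced by the second residual series, is the kind of result that motivates replacing OLS with a weighted estimator. In practice, library routines such as `durbin_watson`, `jarque_bera`, and `het_breuschpagan` in statsmodels cover the same tests for multiple regressors.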
4. Results and Analysis
4.1. Model Performance Comparison
4.2. Feature Importance Analysis
4.3. Final Model Performance Analysis
Performance Comparison Between Baseline Model and Optimized Model
4.4. Visualization Analysis of Results
4.4.1. Model Performance Comparison and Prediction Effectiveness Analysis
4.4.2. Predicted Values and Actual Values
4.4.3. Model Residual Spatial Distribution
4.4.4. Comprehensive Model Diagnostics and Validation Analysis
5. Water and Fertilizer Optimization and Sustainable Agriculture
6. Discussion
6.1. Key Findings
6.1.1. The Dominant Role of NDVI During the Vigorous Growth Period
6.1.2. The Importance of Nonlinear Relationships
6.2. Reflection and Argumentation on Methodology
6.3. Comparison with Related Studies
6.4. Limitations of the Study and Future Prospects
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Data Type | Source | Product/Description | Spatial Resolution | Temporal Resolution | Time Coverage |
---|---|---|---|---|---|
Remote sensing data | Copernicus Programme/Google Earth Engine | Sentinel-2 L2A (atmospherically corrected surface reflectance) | 10–20 m | 5 days | 2019–2023 (May–September) |
Production statistics | Provincial and municipal statistical yearbooks | County-level total maize production (tons) and sown area (hectares) | County level | Annual | 2019–2023 |
Model | Key Features | Validation R² | Validation RMSE (%) | Passed Statistical Diagnostics |
---|---|---|---|---|
Benchmark OLS model | Linear vegetation index | 0.78 | 18.5% | No |
Random Forest (RF) | All vegetation indices, automatic nonlinear fitting | Approximately 0.85 (random CV)/<0 (temporal CV) | High/extremely high | No (unstable) |
WLS model | NDVI_Aug, NDVI_Aug² | 0.89 | 12.8% | Yes |
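To make the WLS row of the table above more concrete, the following sketch fits a weighted least squares model with a linear and a quadratic NDVI term. It is offered under several assumptions rather than as the authors' implementation: the second feature is taken to be the square of August NDVI, the weights are a hypothetical sown-area column, the data and coefficients are synthetic, and the helper names `solve` and `wls_quadratic` are invented for illustration.

```python
import random

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def wls_quadratic(ndvi, y, weights):
    """Fit y = b0 + b1*NDVI + b2*NDVI^2 by minimizing sum(w * residual^2)."""
    rows = [[1.0, v, v * v] for v in ndvi]
    # Weighted normal equations: (X' W X) beta = X' W y.
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(3)] for i in range(3)]
    xtwy = [sum(w * r[i] * yi for r, w, yi in zip(rows, weights, y))
            for i in range(3)]
    return solve(xtwx, xtwy)

# Synthetic placeholder data (NOT the study data).
rng = random.Random(7)
ndvi = [rng.uniform(0.4, 0.9) for _ in range(200)]
area = [rng.uniform(5_000, 50_000) for _ in ndvi]   # hypothetical sown area (ha)
y = [1000 + 12000 * v - 4000 * v * v + rng.gauss(0, 40000 / a ** 0.5)
     for v, a in zip(ndvi, area)]                   # noisier where area is small

# Weighting by area down-weights the noisier small-area observations.
beta = wls_quadratic(ndvi, y, area)
print("b0, b1, b2 =", beta)
```

Choosing weights proportional to the inverse of each observation's error variance is the standard rationale for WLS. Which quantity plays that role for county-level yields is a modeling decision that this sketch does not settle.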
Sample Site | Year | Water Usage Change | Fertilizer Usage Change | Yield Change | Water/Fertilizer Savings |
---|---|---|---|---|---|
Zhangye | 2023 | 215 → 193 m³/acre | 5.5 → 5.2 tons/acre | 6853 → 6776 kg/ha | 10.23%/5.5% |
Zhangye | 2024 | 210 → 188 m³/acre | 5.0 → 4.7 tons/acre | 6780 → 6690 kg/ha | 11.43%/6.0% |
Wuwei | 2023 | 210 → 190 m³/acre | 5.2 → 4.8 tons/acre | 7217 → 6990 kg/ha | 14.76%/8.5% |
Wuwei | 2024 | 205 → 179 m³/acre | 5.0 → 4.6 tons/acre | 7150 → 7111 kg/ha | 12.68%/7.8% |
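The savings column in the table above can be read as a relative reduction against the pre-optimization value. The short sketch below works through the Zhangye 2023 row under that assumption. The formula and the function name `relative_saving` are not stated in the source, and rows derived from a different baseline or rounding rule may not follow it exactly.

```python
def relative_saving(before, after):
    """Percentage reduction relative to the baseline (before) value."""
    return (before - after) / before * 100

# Zhangye 2023 row, values copied from the table.
water = relative_saving(215, 193)           # water use, m3/acre
fertilizer = relative_saving(5.5, 5.2)      # fertilizer use, tons/acre
yield_loss = relative_saving(6853, 6776)    # yield, kg/ha

print(f"water {water:.2f}%, fertilizer {fertilizer:.2f}%, yield loss {yield_loss:.2f}%")
# prints: water 10.23%, fertilizer 5.45%, yield loss 1.12%
```

Under this reading, the row pairs an input reduction of roughly 10% (water) and 5.5% (fertilizer) with a yield decline of about 1%.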
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, G.; Wang, J.; Qi, Z. Remote Sensing and Data-Driven Optimization of Water and Fertilizer Use: A Case Study of Maize Yield Estimation and Sustainable Agriculture in the Hexi Corridor. Sustainability 2025, 17, 8182. https://doi.org/10.3390/su17188182