Development and Comparison of Artificial Neural Networks and Gradient Boosting Regressors for Predicting Topsoil Moisture Using Forecast Data
Abstract
1. Introduction
2. Related Works
2.1. Neural Networks
2.2. Gradient Boosting Models and Ensemble Techniques
3. Materials and Methods
3.1. Background
3.1.1. Probes’ Data
3.1.2. Weather Stations Data
3.1.3. Meteorological Models’ Forecasts
- Historical data: these data are retrieved from multiple sources, such as weather stations, satellites, and radars. Historical records for a specific latitude and longitude are normally built from three weather stations within 50 km of that point [68,69]. Because of this low spatial resolution, the retrieved data are less accurate and reliable than those obtained from the weather stations described in Section 3.1.2.
- Weather forecast: current data are combined with mathematical and physical algorithms in meteorological models to produce weather forecasts. The models used here are ECMWF (European Centre for Medium-Range Weather Forecasts), GFS (Global Forecast System), ICON (Icosahedral Nonhydrostatic Model from the German Meteorological Office), NAM (North American Mesoscale Model), and HRRR (High-Resolution Rapid Refresh). The models are continuously adjusted with new data, and their output is calibrated and validated against real observations so that it matches actual conditions as closely as possible [69,70].
3.1.4. Evapotranspiration
3.1.5. Artificial Neural Networks
3.1.6. Gradient Boosting Regressors
3.2. Proposed Solution
3.2.1. Area of Study
- Vertex 1: Latitude: 38°19′58.65″ N—Longitude: 2°26′12.26″ W;
- Vertex 2: Latitude: 37°33′52.96″ N—Longitude: 1°23′4.54″ W.
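The two vertices above are given in degrees, minutes, and seconds. As a minimal sketch, the conversion to signed decimal degrees can be written as follows. The helper names `dms_to_decimal` and `in_study_area` are hypothetical, and treating the two vertices as opposite corners of a rectangular bounding box is an assumption made only for illustration.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # By convention, southern latitudes and western longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


# Vertices exactly as listed in the text: (latitude, longitude).
vertex_1 = (dms_to_decimal(38, 19, 58.65, "N"), dms_to_decimal(2, 26, 12.26, "W"))
vertex_2 = (dms_to_decimal(37, 33, 52.96, "N"), dms_to_decimal(1, 23, 4.54, "W"))


def in_study_area(lat: float, lon: float) -> bool:
    """Assumption: the two vertices are opposite corners of a rectangle."""
    lat_min, lat_max = sorted((vertex_1[0], vertex_2[0]))
    lon_min, lon_max = sorted((vertex_1[1], vertex_2[1]))
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


print(vertex_1, vertex_2)
print(in_study_area(38.0, -2.0))
```

A check of this kind could be used to confirm that a probe or weather station falls inside the study area before it is paired with a data source.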
3.2.2. Dataset Construction
- Real data from the closest weather stations to each of the probes: from each weather station, daily data were collected for nine variables: minimum temperature, average temperature, maximum temperature, minimum relative humidity, average relative humidity, maximum relative humidity, average wind speed, accumulated precipitation, and average solar radiation.
- Historical data from Visual Crossing: using the limited free-access API, daily data for seven variables were obtained: minimum temperature, maximum temperature, minimum relative humidity, maximum relative humidity, average wind speed, accumulated precipitation, and average solar radiation.
- The first dataset covers data from May 2022 to June 2024 and takes into account all the environmental variables (minimum temperature, average temperature, maximum temperature, minimum relative humidity, average relative humidity, maximum relative humidity, average wind speed, accumulated precipitation, and average solar radiation), either from Visual Crossing or from weather stations.
- The second dataset covers the entire year 2023 in 12-day intervals and, in addition to the environmental variables, also includes satellite moisture.
- The third dataset also covers the year 2023, but with records for each day of the year. In this case, only the environmental variables are included, without the satellite moisture. This third dataset serves as an intermediate case between the previous two.
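The relationship between the three datasets can be sketched with pandas. This is an illustration only: the values are synthetic, the column names (`tmin`, `precip`, `probe_moisture`, `sat_moisture`) are hypothetical, and the exact start and end days within May 2022 and June 2024, as well as the dates of the satellite records, are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic daily records standing in for the merged weather and probe data.
days = pd.date_range("2022-05-01", "2024-06-30", freq="D")
daily = pd.DataFrame(
    {
        "tmin": rng.normal(12.0, 5.0, len(days)),
        "precip": rng.gamma(0.3, 4.0, len(days)),
        "probe_moisture": rng.uniform(10.0, 40.0, len(days)),
    },
    index=days,
)

# Hypothetical satellite series: one value every 12 days during 2023.
sat_days = pd.date_range("2023-01-01", "2023-12-31", freq="12D")
satellite = pd.Series(
    rng.uniform(0.05, 0.35, len(sat_days)), index=sat_days, name="sat_moisture"
)

# Dataset 1: the full period, environmental variables only.
dataset_1 = daily
# Dataset 3: daily records for 2023, environmental variables only.
dataset_3 = daily.loc["2023-01-01":"2023-12-31"]
# Dataset 2: 2023 restricted to days that have a satellite record.
dataset_2 = dataset_3.join(satellite, how="inner")

print(len(dataset_1), len(dataset_2), len(dataset_3))
```

The inner join makes the trade-off visible: adding the satellite variable shrinks a year of daily records to a few dozen rows.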
3.2.3. Statistical Analysis and Comparison Between Visual Crossing Data and Weather Stations Data
- First, an exploratory data analysis was conducted. The mean, median, standard deviation, and interquartile range of the data were calculated, and outliers were identified using box plots. For both types of data, and generally for all probes, the results were very similar. Next, the distributions of the different variables for each probe were analysed; these were again very similar, both across probes and between the VC data and the station data. Notably, none of the variables, in any dataset or probe, followed a normal distribution, as shown by the Shapiro–Wilk test. This test, which is suited to small or medium sample sizes such as those used here, assesses whether a variable is normally distributed: normality is rejected when the p-value is 0.05 or lower and is not rejected when the p-value is greater than 0.05.
- The next phase involved conducting a correlation analysis. Given that the previous step showed that the variables did not follow normal distributions, Spearman’s correlation was employed. This method measures the statistical dependence between the ranks of two variables and is particularly useful when the relationships between variables are not linear but follow a monotonic trend. In most cases, a significant correlation was observed between the different variables and the moisture recorded by the probes, notably the inverse relationship between humidity and either ETo or solar radiation. It is also noteworthy that precipitation data obtained from Visual Crossing did not show a clear correlation with the dependent variable, unlike the data recorded by most of the stations. This discrepancy is attributed to the low resolution of the VC data, which average data from three stations within a 50 km range, whereas the manually selected weather station is, at worst, less than 10 km away from its respective probe. These results are partially illustrated in Figure 4 and Figure 5. These figures show average correlation matrices, that is, the mean across all probes of the correlations between each independent variable and the dependent variable. The values therefore give a general picture of the individual correlation matrices and may be influenced by outlier correlation values from some of the probes.
- The final phase consisted of detecting systematic errors in the data, in three steps. First, a bias analysis was conducted by calculating the difference between the means of the station data and the means of the VC data. This showed that VC tended to slightly overestimate ETo, precipitation, and minimum temperature, and to notably overestimate average wind speed and minimum relative humidity. Conversely, it perceptibly underestimated maximum relative humidity. Second, a residual analysis was performed on the differences between the VC values and the station values, which confirmed the results of the bias analysis. Finally, Bland–Altman plots were used to compare the two data sources visually and to assess the agreement between them.
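The three phases above can be sketched with SciPy on synthetic data. This is a minimal illustration, not the study's code: the variable names, the skewed wind-speed distribution, and the 0.8 offset added to the simulated VC series are all assumptions chosen so that each test has something to detect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 365

# Synthetic stand-ins: a skewed station variable, a biased and noisier
# "VC" copy of it, and a probe moisture series that falls as wind rises.
station_wind = rng.gamma(shape=2.0, scale=1.5, size=n)
vc_wind = station_wind + 0.8 + rng.normal(0.0, 0.5, size=n)
probe_moisture = 30.0 - 2.0 * station_wind + rng.normal(0.0, 1.0, size=n)

# 1) Shapiro-Wilk: normality is rejected when p <= 0.05.
_, p_normal = stats.shapiro(station_wind)
is_normal = p_normal > 0.05

# 2) Spearman: rank-based correlation, suitable for monotonic,
#    non-normal relationships.
rho, p_rho = stats.spearmanr(station_wind, probe_moisture)

# 3) Bias and Bland-Altman limits of agreement (VC minus station).
diff = vc_wind - station_wind
bias = diff.mean()
sd = diff.std(ddof=1)
limits_of_agreement = (bias - 1.96 * sd, bias + 1.96 * sd)

print(f"normal: {is_normal}, rho: {rho:.2f}, bias: {bias:.2f}")
print("limits of agreement:", limits_of_agreement)
```

A positive `bias` corresponds to the VC source overestimating the variable relative to the station. A Bland–Altman plot would draw `diff` against the mean of the two sources, with horizontal lines at `bias` and at the two limits of agreement.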
3.2.4. Statistical Analysis of the Influence of the Satellite Moisture on the Dependent Variable
- First, a Spearman correlation test was conducted (it had previously been verified that satellite moisture did not follow a normal distribution either). The data covering 2022 to 2024 showed a slightly higher correlation with the dependent variable than the daily data for 2023, which in turn showed a higher correlation than the datasets covering 2023 in 12-day intervals. Additionally, the correlation of satellite moisture with the moisture measured by the probes was null or even significantly inverse. Nevertheless, satellite moisture was retained in the second type of dataset described in Section 3.2.2 in order to check whether models trained with this additional variable performed worse than those trained on the other two types of datasets. These results are shown in Figure 4 and Figure 5.
- Next, a regression analysis was performed. A good way to understand how the independent variables affect the dependent variable is to fit a multiple regression model and interpret its coefficients, evaluating their statistical significance through p-values. The most notable observation is that, as the time series is shortened, the p-value associated with each variable increases significantly, so the variables become less significant in the predictions. The average p-value associated with satellite moisture across the regression models trained, for both VC and station data, is above 0.3. However, as the time series is shortened, the R² coefficient of the models increases. A possible explanation is that linear models are used here only to obtain an initial idea of how the independent variables influence the probe moisture, whereas the relationships between these variables need not be linear. These models are too simple for such complex relationships, and the mismatch becomes more apparent as the volume of data increases.
- Lastly, and especially because ETo was calculated from the other variables using the Penman–Monteith formula, a collinearity analysis was conducted using the Variance Inflation Factor (VIF). All datasets contained variables with very high collinearity indices. To address this, Principal Component Analysis (PCA) was employed to extract the principal components that explained 95% of the variance of the independent variables. However, when the linear models were refitted on these components, the R² coefficient decreased significantly in all of them, especially in those trained on the 2023 dataset with satellite moisture. Therefore, all variables (except for satellite moisture) were kept in the final datasets.
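The collinearity check and the PCA step can be sketched as follows. The data are synthetic and the feature names are hypothetical. The VIF is computed directly from its definition, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing feature j on the remaining features, and scikit-learn's `PCA` with `n_components=0.95` keeps the smallest number of components whose cumulative explained variance reaches 95%.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 300

# Synthetic predictors: three strongly related temperatures plus an
# independent wind speed.
tmin = rng.normal(12.0, 4.0, n)
tmax = tmin + rng.normal(10.0, 1.0, n)
tavg = (tmin + tmax) / 2 + rng.normal(0.0, 0.2, n)
wind = rng.normal(3.0, 1.0, n)
X = np.column_stack([tmin, tmax, tavg, wind])
names = ["tmin", "tmax", "tavg", "wind"]


def vif(X: np.ndarray) -> np.ndarray:
    """Variance Inflation Factor of each column, from its definition."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # add intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


vifs = vif(X)
for name, value in zip(names, vifs):
    print(f"{name}: VIF = {value:.1f}")

# PCA on standardised predictors, keeping 95% of their variance.
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X_std)
print(X_pca.shape, pca.explained_variance_ratio_.sum())
```

In this toy example the temperature columns have very large VIFs while the independent wind column stays close to 1, and PCA reduces the four predictors to fewer components.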
3.2.5. Development and Cross-Validation of the ANNs
3.2.6. Development and Cross-Validation of the GBRs
4. Results and Discussion
4.1. Main Results from the Statistical Analysis of the Data
- The data collected by the weather stations showed a higher correlation with the SSM recorded by the probes, especially for precipitation and average wind speed. This is attributed to the higher spatial resolution of the station data and the closer proximity of the stations to the probes. The VC data are an average of the data collected by three weather stations within 50 km of the probe location, so they are more biased and less accurate, as changes in the environmental variables can be lost in the averaging process. In contrast, the manually selected weather stations are never more than 10 km from the probe, which improves the accuracy of the data.
- Regarding the influence of the satellite data on the dependent variable, the correlation was null or even inverse; this seems to be due to differences in depth and spatial resolution. Satellite data cover wide regions of one square kilometre, whereas probe data are collected locally and are highly accurate but less general. Moreover, satellite data represent the soil moisture in a vertical column 5 cm deep, while probes measure the soil moisture at a specific point at a depth of 10 cm. In addition, including this variable imposes temporal restrictions that significantly shorten the time series, since records are available only every 12 days (due to spatial coverage restrictions) rather than daily, and this reduces the models’ ability to extract a behaviour pattern in topsoil moisture. Furthermore, the data showed a higher correlation with the dependent variable the longer the chosen time series was.
4.2. Model Performance Comparison
4.2.1. Implications of Feature Importance
- For the ANN, feature importance is derived using permutation importance: each feature is randomly shuffled in turn, and the resulting degradation (or improvement) in model performance is measured. This method can yield both positive and negative importance values. Positive values, as for evapotranspiration, minimum and maximum temperature, precipitation, and maximum, minimum, and average humidity, indicate that the feature contributes to the model’s accuracy (i.e., performance degrades when the feature is shuffled), while negative values (for average temperature, wind speed, and solar radiation) indicate that permuting the feature paradoxically improves performance. These negative values likely arise because the feature is noisy or irrelevant (especially average wind speed, whose correlation with the soil moisture was null, as shown in Figure 4 and Figure 5) or because the model is overfitting on that feature.
- In contrast, the GBR calculates feature importance from the reduction in the loss function achieved when a feature is used to split the data in its decision trees. This method inherently produces only non-negative values, as it aggregates the contributions of each feature to reducing the overall error, and it therefore primarily reflects direct contributions to prediction. Figure 8 shows that some features that were not as relevant for the ANN, such as maximum temperature or average humidity, are of great importance in the GBR. This is due to the way the GBR prioritises features that create effective splits in the data, which captures linear and monotonic relationships effectively, whereas the ANN is better at detecting complex, non-linear interactions.
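The two importance measures described above can be compared side by side with scikit-learn. This is a sketch under stated assumptions rather than the models used in the study: the data are synthetic, the three feature names are hypothetical, the network is a small `MLPRegressor` with min-max scaling, and the GBR importances are scikit-learn's impurity-based `feature_importances_`, which are non-negative and normalised to sum to 1.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
n = 600
names = ["eto", "precip", "noise"]  # hypothetical feature names

# Synthetic target: depends on the first two features, not on the third.
X = rng.uniform(0.0, 1.0, size=(n, 3))
y = -3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 0.1, n)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# ANN: permutation importance on held-out data. Values can be negative
# when shuffling a feature happens to improve the score.
ann = make_pipeline(
    MinMaxScaler(),
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
ann.fit(X_train, y_train)
perm = permutation_importance(ann, X_test, y_test, n_repeats=10, random_state=0)

# GBR: split-based importance accumulated over all trees.
gbr = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

for i, name in enumerate(names):
    print(
        f"{name}: permutation = {perm.importances_mean[i]:+.3f}, "
        f"GBR = {gbr.feature_importances_[i]:.3f}"
    )
```

Because the two measures are on different scales and answer slightly different questions, their rankings are more comparable than their raw values.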
4.2.2. Data Source Evaluation: Weather Station Data vs. Visual Crossing Data
4.3. Surface Soil Moisture Prediction: Comparison of Forecasted and Actual Data
4.4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lianos, T.P. Population and Environment. In Capitalism, Degrowth and the Steady State Economy: Debating Future Economic Models; Springer: Berlin/Heidelberg, Germany, 2024; pp. 45–56.
- Wang, J.; Azam, W. Natural resource scarcity, fossil fuel energy consumption, and total greenhouse gas emissions in top emitting countries. Geosci. Front. 2024, 15, 101757.
- da Silveira, L.H.M.; Cataldi, M.; de Farias, W.C.M. Development of multi-scale indices of human mobility restriction during the COVID-19 based on air quality from local and global NO2 concentration. iScience 2023, 26, 107599.
- Yunus, A.P.; Masago, Y.; Hijioka, Y. COVID-19 and surface water quality: Improved lake water quality during the lockdown. Sci. Total Environ. 2020, 731, 139012.
- UNESCO World Water Assessment Programme; Koncagül, E.; Connor, R.; Abete, V. The United Nations World Water Development Report 2024: Water for Prosperity and Peace; Facts, Figures and Action Examples; UNESCO Publishing: Paris, France, 2024.
- Tolba, M.K. Recursos de agua dulce y calidad del agua. In Salvemos El Planeta: Problemas y Esperanzas; Springer: Dordrecht, The Netherlands, 1992; pp. 45–56.
- Eze, J.N.; Salihu, B.Z.; Isong, A.; Aliyu, U.; Ibrahim, P.A.; Gbanguba, A.U.; Ayanniyi, N.N.; Alfa, N.; Alfa, M.; Aremu, P.A.; et al. Climate Change Impact on Agriculture and Water Resources - A Review. Badeggi J. Agric. Res. Environ. 2022, 4, 43–53.
- Getirana, A.; Libonati, R.; Cataldi, M. Brazil is in water crisis - It needs a drought plan. Nature 2021, 600, 218–220.
- Srivastav, A.L.; Dhyani, R.; Ranjan, M.; Madhav, S.; Sillanpää, M. Climate-resilient strategies for sustainable management of water resources and agriculture. Environ. Sci. Pollut. Res. 2021, 28, 41576–41595.
- Singh, V.P.; Mishra, A.K.; Chowdhary, H.; Khedun, C.P. Climate change and its impact on water resources. In Modern Water Resources Engineering; Humana Press: Totowa, NJ, USA, 2013; pp. 525–569.
- Kanae, S. Global warming and the water crisis. J. Health Sci. 2009, 55, 860–864.
- Lorenzo-Lacruz, J.; Garcia, C.; Morán-Tejeda, E. Groundwater level responses to precipitation variability in Mediterranean insular aquifers. J. Hydrol. 2017, 552, 516–531.
- Guerrero-Baena, M.; Gómez-Limón, J. Insuring Water Supply in Irrigated Agriculture: A Proposal for Hydrological Drought Index-Based Insurance in Spain. Water 2019, 11, 686.
- O’Neill, M.P.; Michael, D.P. Water and agriculture in a changing climate. HortScience 2011, 46, 155–157.
- Sraïri, M. Water uses in sustainable agriculture practices: Reconsidering the priorities in water scarce areas. Adv. Plants Agric. Res. 2018, 8, 333–334.
- Du Plessis, A. Current and future water scarcity and stress. In Water as an Inescapable Risk: Current Global Water Availability, Quality and Risks with a Specific Focus on South Africa; Springer: Cham, Switzerland, 2018; pp. 13–25.
- Kotze, H.C.; Qotoyi, M.S.; Bahta, Y.T.; Jordaan, H.; Monteiro, M.A. A Systematic Review and Meta-Analysis of Factors Influencing Water Use Behaviour and the Efficiency of Agricultural Production in South Africa. Resources 2024, 13, 94.
- Martínez-Alvarez, V.; González-Ortega, M.; Martin-Gorriz, B.; Soto-García, M.; Maestre-Valero, J. The use of desalinated seawater for crop irrigation in the Segura River Basin (south-eastern Spain). Desalination 2017, 422, 153–164.
- Mehta, P.; Siebert, S.; Kummu, M.; Deng, Q.; Ali, T.; Marston, L.; Xie, W.; Davis, K.F. Half of twenty-first century global irrigation expansion has been in water-stressed regions. Nat. Water 2024, 2, 254–261.
- Gu, L.; Chen, J.; Yin, J.; Sullivan, S.C.; Wang, H.; Guo, S.; Zhang, L.; Kim, J. Projected increases in magnitude and socioeconomic exposure of global droughts in 1.5 and 2 °C warmer climates. Hydrol. Earth Syst. Sci. 2020, 24, 451–472.
- Schleussner, C.; Lissner, T.K.; Fischer, E.M.; Wohland, J.; Perrette, M.; Golly, A.; Rogelj, J.; Childers, K.; Schewe, J.; Frieler, K.; et al. Differential climate impacts for policy-relevant limits to global warming: The case of 1.5 °C and 2 °C. Earth Syst. Dyn. 2016, 7, 327–351.
- Cramer, W.; Guiot, J.; Fader, M.; Garrabou, J.; Gattuso, J.; Iglesias, A.; Lange, M.A.; Lionello, P.; Llasat, M.C.; Paz, S.; et al. Climate change and interconnected risks to sustainable development in the Mediterranean. Nat. Clim. Change 2018, 8, 972–980.
- Wagner, W.; Dorigo, W.; de Jeu, R.; Fernandez, D.; Benveniste, J.; Haas, E.; Ertl, M. Fusion of Active and Passive Microwave Observations to Create an Essential Climate Variable Data Record on Soil Moisture. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, I-7, 315–321.
- Stańczyk, T.; Kasperska-Wołowicz, W.; Szatyłowicz, J.; Gnatowski, T.; Papierowska, E. Surface Soil Moisture Determination of Irrigated and Drained Agricultural Lands with the OPTRAM Method and Sentinel-2 Observations. Remote Sens. 2023, 15, 5576.
- World Meteorological Organization. State of Global Water Resources 2022; United Nations: New York, NY, USA, 2023; pp. 14–15.
- de Fraiture, C.; Smakhtin, V.; Bossio, D.; McCornick, P.; Hoanh, C.; Noble, A.; Molden, D.; Gichuki, F.; Giordano, M.; Finlayson, M.; et al. Facing climate change by securing water for food, livelihoods and ecosystems. J. Semi-Arid Trop. Agric. Res. 2007, 4, 21.
- ElSaadani, M.; Habib, E.; Abdelhameed, A.M.; Bayoumi, M. Assessment of a spatiotemporal deep learning approach for soil moisture prediction and filling the gaps in between soil moisture observations. Front. Artif. Intell. 2021, 4, 636234.
- Jiang, Y.; Zhang, R.; Sun, B.; Wang, T.; Zhang, B.; Tu, J.; Nie, S.; Jiang, H.; Chen, K. GNSS-IR Soil Moisture Retrieval Using Multi-Satellite Data Fusion Based on Random Forest. Remote Sens. 2024, 16, 3428.
- Wilson, M.; Datta, R.; Savarimuthu, S.; Moller, D.; Ruf, C. Prediction of Soil Moisture From Near-Global Cygnss Gnss-Reflectometry Using a Random Forest Machine Learning Model. In Proceedings of the IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 4465–4471.
- Taihuttu, H.Y.; Sitanggang, I.S.; Syaufina, L. Soil Moisture Prediction Model in Peatland Using Random Forest Regressor. BAREKENG J. Ilmu Mat. Dan Terap. 2024, 18, 2505–2516.
- Brakhasi, F.; Walker, J.P.; Judge, J.; Liu, P.W.; Shen, X.; Ye, N.; Wu, X.; Yeo, I.Y.; Prajapati, R.; Kim, E.; et al. Multi-Layer Soil Moisture Estimation Using Combined L-and P-Band Radiometry: An Application of Machine Learning Algorithms. In Proceedings of the IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 4411–4415.
- Li, Q.; Zhu, Y.; Shangguan, W.; Wang, X.; Li, L.; Yu, F. An attention-aware LSTM model for soil moisture and soil temperature prediction. Geoderma 2022, 409, 115651.
- Li, X.; Zhang, Z.; Li, Q.; Zhu, J. Enhancing Soil Moisture Forecasting Accuracy with REDF-LSTM: Integrating Residual En-Decoding and Feature Attention Mechanisms. Water 2024, 16, 1376.
- Jayasinghe, W.; Deo, R.C.; Raj, N.; Ghimire, S.; Yaseen, Z.M.; Nguyen-Huy, T.; Ghahramani, A. Forecasting Multi-Step Soil Moisture with Three-Phase Hybrid Wavelet-Least Absolute Shrinkage Selection Operator-Long Short-Term Memory Network (moDWT-Lasso-LSTM) Model. Water 2024, 16, 3133.
- Zhou, G.; Li, G. Forecasting Soil Moisture Using PSO-CNN-LSTM Model. In Proceedings of the 2024 IEEE Congress on Evolutionary Computation (CEC), Yokohama, Japan, 30 June–5 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7.
- Martínez, M.Z.; Da Silveira, L.H.M.; Marin-Perez, R.; Gomez, A.F.S. Development of a Neural Network System for Predicting Topsoil Moisture Using Remote Sensing and Rainfall Forecast Data. In Proceedings of the 2024 4th International Conference on Embedded & Distributed Systems (EDiS), Bechar, Algeria, 3–5 November 2024; pp. 249–254.
- Hrushikesh, R.; Pathak, A.A.; Punithraj, G. Quantifying Surface Soil Moisture Variability Through Synergistic Applications of SAR and Machine Learning Techniques. In Proceedings of the 2023 IEEE India Geoscience and Remote Sensing Symposium (InGARSS), Bangalore, India, 10–13 December 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–4.
- Singh, A.; Gaurav, K. A physics-informed machine learning approach to estimate surface soil moisture. In Proceedings of the EGU General Assembly Conference Abstracts, Vienna, Austria, 24–28 April 2023.
- Guan, J.; Bragdon, S.; Clausen, J. Predicting Soil Moisture Content Using Physics-Informed Neural Networks (PINNs); Engineer Research and Development Center: Vicksburg, MS, USA, 2024.
- Singh, A.; Gaurav, K. Deep learning and data fusion to estimate surface soil moisture from multi-sensor satellite images. Sci. Rep. 2023, 13, 2251.
- Hassan-Esfahani, L.; Torres-Rua, A.; Jensen, A.; McKee, M. Assessment of surface soil moisture using high-resolution multi-spectral imagery and artificial neural networks. Remote Sens. 2015, 7, 2627–2646.
- Liu, J.; Shen, C.; Rahmani, F.; Lawson, K. A multiscale deep learning model integrating satellite-based and in-situ data for high-resolution soil moisture predictions. In Proceedings of the EGU General Assembly Conference Abstracts, Vienna, Austria, 23–28 April 2023.
- Ma, Z.; Wu, B.; Chang, S.; Yan, N.; Zhu, W. Developing a physics-guided neural network to predict soil moisture with remote sensing evapotranspiration and weather forecasting. In Proceedings of the EGU General Assembly Conference Abstracts, Vienna, Austria, 24–28 April 2023.
- Meenakshi, M.; Naresh, R. Prediction of soil moisture root zone health in Artificial Neural Network. In Proceedings of the 2021 4th International Conference on Recent Trends in Computer Science and Technology (ICRTCST), Jamshedpur, India, 11–12 February 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–5.
- Bischof, B.; Zehe, E.; Loritz, R. Using neural networks for predicting soil water storage based in situ soil moisture observations. In Proceedings of the Copernicus Meetings, Vienna, Austria, 14–19 April 2024.
- Wang, Y.; Shi, L.; Hu, Y.; Hu, X.; Song, W.; Wang, L. A comprehensive study of deep learning for soil moisture prediction. Hydrol. Earth Syst. Sci. 2024, 28, 917–943.
- Li, L.; Dai, Y.; Shangguan, W.; Wei, N.; Wei, Z.; Gupta, S. Multistep forecasting of soil moisture using spatiotemporal deep encoder–decoder networks. J. Hydrometeorol. 2022, 23, 337–350.
- Grubišić, V.; Vasić, D.; Ljubić, H.; Rozić, R.; Volarić, T. Soil Moisture Prediction with Attention-Enhanced Models: A Deep Learning Approach. Authorea Prepr. 2024.
- Wang, G.; Wei, C.; Yan, L.; Li, J. Soil Moisture Prediction Model Based on Improved GRU Recurrent Neural Network. Strateg. Plan. Energy Environ. 2024, 43, 381–400.
- Islam, M.N.; Logofatu, D.; Haque, M.Z. A Comparative Study on Machine Learning Methods Through Evaluating the Impact of Contributing Factors on The Accuracy of Soil Moisture Prediction. In Proceedings of the 2023 International Conference on Innovations in Intelligent Systems and Applications (INISTA), Hammamet, Tunisia, 20–23 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
- Divya, A.; Josphineleela, R.; Sheela, L.J. A machine learning based approach for prediction and interpretation of soil properties from soil spectral data. J. Environ. Biol. 2024, 45, 96–105.
- Ren, Y.; Ling, F.; Wang, Y. Research on Provincial-Level Soil Moisture Prediction Based on Extreme Gradient Boosting Model. Agriculture 2023, 13, 927.
- Zhu, Y.; Jing, X.; Ding, A. Prediction of soil moisture in Inner Mongolia’s League based on machine learning. In Proceedings of the Fourth International Conference on Signal Processing and Computer Science (SPCS 2023), Guilin, China, 25–27 August 2023; SPIE: Bellingham, WA, USA, 2023; Volume 12970, pp. 17–20.
- Nguyen, T.T.; Ngo, H.H.; Guo, W.; Chang, S.W.; Nguyen, D.D.; Nguyen, C.T.; Zhang, J.; Liang, S.; Bui, X.T.; Hoang, N.B. A low-cost approach for soil moisture prediction using multi-sensor data and machine learning algorithm. Sci. Total Environ. 2022, 833, 155066.
- Jamei, M.; Ali, M.; Karbasi, M.; Sharma, E.; Jamei, M.; Chu, X.; Yaseen, Z.M. A high dimensional features-based cascaded forward neural network coupled with MVMD and Boruta-GBDT for multi-step ahead forecasting of surface soil moisture. Eng. Appl. Artif. Intell. 2023, 120, 105895.
- Cheng, Y.; Li, Y.; Wu, H.; Li, F.; Li, Y.; He, L. Soil Moisture Retrieval Using Stacked Generalization: An Ensemble Machine Learning Method. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 6984–6987.
- Han, Q.; Zeng, Y.; Zhang, L.; Cira, C.I.; Prikaziuk, E.; Duan, T.; Wang, C.; Szabó, B.; Manfreda, S.; Zhuang, R.; et al. Ensemble of optimised machine learning algorithms for predicting surface soil moisture content at a global scale. Geosci. Model Dev. 2023, 16, 5825–5845.
- Li, X.; Wu, J.; Yu, J.; Zhou, Z.; Wang, Q.; Zhao, W.; Hu, L. Inversion of Soil Moisture Content in Cotton Fields Using GBR-RF Algorithm Combined with Sentinel-2 Satellite Spectral Data. Agronomy 2024, 14, 784.
- Sharma, S.; Singh, G. Cultivating Precision: Integrating XGBoost Imputation with Random Forest Regression for Accurate Soil Moisture Prediction. In Proceedings of the 2024 10th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 14–15 March 2024; IEEE: Piscataway, NJ, USA, 2024; Volume 1, pp. 156–161.
- Kumar, A.; Kaushik, K.; Singh, G. Predicting Soil Moisture Levels Using Ensemble Machine Learning Methods. In Proceedings of the 2024 10th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 14–15 March 2024; IEEE: Piscataway, NJ, USA, 2024; Volume 1, pp. 127–132.
- Acharya, U.; Daigh, A.L.; Oduor, P.G. Machine learning for predicting field soil moisture using soil, crop, and nearby weather station data in the Red River Valley of the North. Soil Syst. 2021, 5, 57.
- Sentek Technologies. Sentek SDI-12 Series II Manual (Ver 1.1). 2022. Available online: https://www.sentektechnologies.com/manuals/sdi-12-series-ii (accessed on 27 March 2024).
- IMIDA. Informe Agrometeorológico Personalizado. 2024. Available online: http://siam.imida.es/apex/f?p=101:46:344545242704833::::: (accessed on 25 March 2024).
- Department of Geography of the University of Murcia. SUREMET. 2024. Available online: https://suremet.es/index.php (accessed on 25 March 2024).
- Department of Geography of the University of Murcia. Proyecto FrostSE. 2021. Available online: https://frostsureste.wordpress.com/proyecto (accessed on 25 March 2024).
- IFAPA. Datos de la Estación|Instituto de Investigación y Formación Agraria y Pesquera (IFAPA). 2024. Available online: https://www.juntadeandalucia.es/agriculturaypesca/ifapa/riaweb/web/estacion/18/2 (accessed on 25 March 2024).
- Visual Crossing. Visual Crossing Weather API. 2020. Available online: https://www.visualcrossing.com/weather-api (accessed on 16 February 2025).
- Visual Crossing. FAQs for Visual Crossing Weather Data. 2020. Available online: https://www.visualcrossing.com/resources/documentation/weather-data/frequently-asked-questions-faq-for-visual-crossing-weather-data/ (accessed on 16 February 2025).
- Visual Crossing. Available Data for Visual Crossing Weather. 2020. Available online: https://www.visualcrossing.com/resources/documentation/weather-data/available-data-for-visual-crossing-weather-updated-january-2020/ (accessed on 16 February 2025).
- Visual Crossing. Weather Data Documentation for Visual Crossing. 2023. Available online: https://www.visualcrossing.com/resources/documentation/weather-data/weather-data-documentation/ (accessed on 16 February 2025).
- Han, Y.; Calabrese, S.; Du, H.; Yin, J. Evaluating biases in Penman and Penman–Monteith evapotranspiration rates at different timescales. J. Hydrol. 2024, 638, 131534.
- Cárdenas, O.L.; Gastélum, R.D.E.; Campos, M.N.; Galaviz, R.E.P.; Serrano, L.A.G.; Montoya, J.M. Penman–Monteith Reference Evapotranspiration Estimation Models, Using Latitude–Temperature Data, in the State of Sinaloa, Mexico. Preprints 2024.
- Wang, S.C. Artificial neural network. Interdiscip. Comput. Java Program. 2003, 743, 81–100.
- Rau, K.; Eggensperger, K.; Schneider, F.; Hennig, P.; Scholten, T. How can we quantify, explain, and apply the uncertainty of complex soil maps predicted with neural networks? Sci. Total Environ. 2024, 944, 173720.
- Acharjee, P.; Souliman, M.; Isied, M. Artificial neural network-based prediction model for soil-water characteristics curve coefficients from soil index properties. In Bituminous Mixtures and Pavements VIII; CRC Press: Boca Raton, FL, USA, 2024; p. 129.
- Pacci, S.; Dengiz, O.; Alaboz, P.; Saygın, F. Artificial neural networks in soil quality prediction: Significance for sustainable tea cultivation. Sci. Total Environ. 2024, 947, 174447.
- Uzer, A.U. Efficient prediction of compressive strength in geotechnical engineering using artificial neural networks. Turk. J. Eng. 2024, 8, 457–468.
- Elakiya, N.; Keerthana, G. Application of Artificial Neural Networks in Soil Science Research. Arch. Curr. Res. Int. 2024, 24, 1–15.
- Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21.
- Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
- Hu, H.; Sun, W.; Venkatraman, A.; Hebert, M.; Bagnell, A. Gradient boosting on stochastic data streams. In Proceedings of the Artificial Intelligence and Statistics, Ft. Lauderdale, FL, USA, 20–22 April 2017; Singh, A., Zhu, J., Eds.; PMLR, Proceedings of Machine Learning Research. pp. 595–603. [Google Scholar]
- CARM. Data of the Region of Murcia. 2024. Available online: https://www.carm.es/web/pagina?IDCONTENIDO=1619&IDTIPO=100&RASTRO=c$m25987,127,1604 (accessed on 21 February 2024).
- CARM. Geographic Data of the Region of Murcia. 2024. Available online: https://www.carm.es/web/pagina?IDCONTENIDO=1613&IDTIPO=100&RASTRO=c$m25987,127,1604 (accessed on 17 February 2024).
- topographic-map.com. Topographic Map of the Region of Murcia, Altitude, Relief. 2024. Available online: https://es-es.topographic-map.com/map-7lkf3/Regi%C3%B3n-de-Murcia (accessed on 9 October 2024).
- Chepino, B.G.; Yacoub, R.R.; Aula, A.; Saleh, M.; Sanjaya, B.W. Effect of MinMax Normalization on ORB Data for Improved ANN Accuracy. J. Electr. Eng. Energy Inf. Technol. (J3EIT) 2023, 11, 29–35.
- Shantal, M.; Othman, Z.; Bakar, A.A. A novel approach for data feature weighting using correlation coefficients and min–max normalization. Symmetry 2023, 15, 2185.
- Huang, L. Motivation and Overview of Normalization in DNNs. In Normalization Techniques in Deep Learning; Springer: Cham, Switzerland, 2022; pp. 11–18.
- Nawi, N.M.; Atomi, W.H.; Rehman, M.Z. The effect of data pre-processing on optimized training of artificial neural networks. Procedia Technol. 2013, 11, 32–39.
- Liew, S.S.; Khalil-Hani, M.; Bakhteri, R. Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems. Neurocomputing 2016, 216, 718–734.
- Upadhyay, D.; Manero, J.; Zaman, M.; Sampalli, S. Gradient boosting feature selection with machine learning classifiers for intrusion detection on power grids. IEEE Trans. Netw. Serv. Manag. 2020, 18, 1104–1116.
- Shekar, B.; Dagnew, G. Grid search-based hyperparameter tuning and classification of microarray cancer data. In Proceedings of the 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP), Gangtok, India, 25–28 February 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–8.
- Abas, M.A.H.; Ismail, N.; Ali, N.; Tajuddin, S.; Tahir, N.M. Agarwood oil quality classification using support vector classifier and grid search cross validation hyperparameter tuning. Int. J. Emerg. Trends Eng. Res. 2020, 8, 2551–2556.
- Sah, S.; Surendiran, B.; Dhanalakshmi, R.; Yamin, M. COVID-19 cases prediction using SARIMAX Model by tuning hyperparameter through grid search cross-validation approach. Expert Syst. 2023, 40, e13086.
- Prakash, S.; Sharma, A.; Sahu, S.S. Soil moisture prediction using machine learning. In Proceedings of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 20–21 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6.
- Adab, H.; Morbidelli, R.; Saltalippi, C.; Moradian, M.; Ghalhari, G.A.F. Machine learning to estimate surface soil moisture from remote sensing data. Water 2020, 12, 3223.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zambudio Martínez, M.; Silveira, L.H.M.d.; Marin-Perez, R.; Gomez, A.F.S. Development and Comparison of Artificial Neural Networks and Gradient Boosting Regressors for Predicting Topsoil Moisture Using Forecast Data. AI 2025, 6, 41. https://doi.org/10.3390/ai6020041