# Performance of Machine Learning Techniques for Meteorological Drought Forecasting in the Wadi Mina Basin, Algeria


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Description of the Study Area

^{2} and is located between 00°22′59″ and 01°09′02″ E and between 34°41′57″ and 35°35′27″ N (Figure 1). It has four significant tributaries: Wadi Haddad, Wadi Abd, Wadi Mina, and Wadi Taht. Elevation ranges from 164 to 1327 m, and the topography of the basin is complex and uneven. The study region has a continental climate with pronounced seasonal temperature contrasts: cold winters and hot summers. Annual precipitation averages between 200 and 500 mm, most of it falling between November and March, and the mean annual temperature is between 16 and 19.5 °C. Over half of the basin is covered by vegetation, including 32% scrub, 35.8% woodland, and cereal crops [21]. Monthly rainfall and runoff records are available for five rainfall and hydrometric stations over 40 years (1974–2009) (Figure 1 and Table 1).

#### 2.2. Standardized Precipitation Index (SPI)

#### 2.3. Machine Learning Models

#### 2.3.1. Support Vector Machine (SVM)

$X_{i}$ and $Y_{i}$ are the independent and dependent variables, respectively; ${\beta}_{0}$ is the constant; ${\beta}_{n}$ is the slope coefficient of each $X_{i}$; and $\epsilon$ is the model error term (residual).
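
Assuming these symbols belong to a standard multiple linear regression form (an assumption made here for illustration, with $n$ input variables), the relationship they describe can be written as:

$$
Y_{i} = {\beta}_{0} + {\beta}_{1} X_{1} + {\beta}_{2} X_{2} + \cdots + {\beta}_{n} X_{n} + \epsilon
$$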

#### 2.3.2. Additive Regression (AR)

^{2}). Furthermore, a sum of $m$ regression trees is used to estimate $f(x)$, i.e., $f(x) = \sum_{j=1}^{m} g(x; T_{j}, M_{j})$. BART is expressed in Equation (11):
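
In the general BART literature, the sum-of-trees model is commonly written in the form below. It is given here as an assumption about the standard formulation, using the symbols defined above, with $\sigma^{2}$ denoting the variance of the error term:

$$
Y = \sum_{j=1}^{m} g(x; T_{j}, M_{j}) + \epsilon, \qquad \epsilon \sim N(0, \sigma^{2})
$$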

#### 2.3.3. Bagging

_{m}, m = 1, …, M, which were then merged into one class. As a result, the weight of this class was derived from the combined weight of the individual predictor classes, as per Equation (12):

_{m} was determined as m = 1, …, M, then more accurate classifications would have a greater effect than less accurate classifications. As the $H_{m}$ classification was only slightly more accurate than random classification [29], it was referred to as the weak $H_{m}$ classification. The input datasets were also modeled using regression trees. Each tree was distinguished by its ability to forecast changes in the training dataset. As a final step, the weighted average of the regression trees' predictions was calculated.
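
The resample-and-average idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the configuration used in the study: each base learner is a one-split regression tree on a single input, the weights default to equal values, and the helper names (`fit_stump`, `bagging_fit`, `bagging_predict`) and the data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on a single input variable."""
    best = None
    # Candidate thresholds start at the second unique value so both sides are non-empty.
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x in the sample is identical: predict the mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x, weights=None):
    """Weighted average of the individual tree predictions (equal weights by default)."""
    weights = weights or [1.0] * len(models)
    return sum(w * m(x) for w, m in zip(weights, models)) / sum(weights)


# Illustrative values only (not basin data): previous-month SPI -> current SPI.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [-1.8, -1.6, -0.9, -0.6, 0.1, 0.4, 1.1, 1.4, 2.1]
models = bagging_fit(xs, ys)
low = bagging_predict(models, -1.8)
high = bagging_predict(models, 1.8)
```

Averaging over resampled learners reduces the variance of any single tree, which is the motivation usually given for bagging; passing unequal `weights` would let more accurate learners count for more.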

#### 2.3.4. Random Subspace (RSS)

#### 2.3.5. Random Forest (RF)

## 3. Model Evaluation

## 4. Results and Discussion

#### 4.1. Input Variables Selection

#### 4.2. Comparison Models of SPI Drought Index

## 5. Conclusions and Recommendations

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Dai, A. Increasing drought under global warming in observations and models. Nat. Clim. Chang. **2013**, 3, 52–58.
- Kim, T.W.; Jehanzaib, M. Drought risk analysis, forecasting and assessment under climate change. Water **2020**, 12, 1862.
- Zhao, L.; Lyu, A.; Wu, J.; Hayes, M.; Tang, Z.; He, B.; Liu, J.; Liu, M. Impact of meteorological drought on streamflow drought in Jinghe River Basin of China. Chin. Geogr. Sci. **2014**, 24, 694–705.
- Crutchfield, S. USDA Economic Research Service-US Drought 2012: Farm and Food Impacts. 2012. Available online: https://drought.unl.edu/archive/assessments/USDA-ERS-2012-farm-food-impacts.pdf (accessed on 8 December 2022).
- Mishra, A.K.; Singh, V.P. Drought modelling—A review. J. Hydrol. **2011**, 403, 157–175.
- Cancelliere, A.; Mauro, G.D.; Bonaccorso, B.; Rossi, G. Drought forecasting using the standardized precipitation index. Water Resour. Manag. **2007**, 21, 801–819.
- Heim, R.R., Jr. A review of twentieth-century drought indices used in the United States. Bull. Am. Meteorol. Soc. **2002**, 83, 1149–1166.
- Jehanzaib, M.; Shah, S.A.; Kim, J.E.; Kim, T.W. Exploring spatio-temporal variation of drought characteristics and propagation under climate change using multi-model ensemble projections. Nat. Hazards **2022**, 1–21.
- Jehanzaib, M.; Sattar, M.N.; Lee, J.H.; Kim, T.W. Investigating effect of climate change on drought propagation from meteorological to hydrological drought using multi-model ensemble projections. Stoch. Environ. Res. Risk Assess. **2020**, 34, 7–21.
- Zhao, J.; Xu, J.; Xie, X.; Lu, H. Drought monitoring based on TIGGE and distributed hydrological model in Huaihe River Basin, China. Sci. Total Environ. **2016**, 553, 358–365.
- Durbach, I.; Merven, B.; McCall, B. Expert elicitation of autocorrelated time series with application to e3 (energy-environment-economic) forecasting models. Environ. Model. Softw. **2017**, 88, 93–105.
- Jehanzaib, M.; Bilal Idrees, M.; Kim, D.; Kim, T.W. Comprehensive evaluation of machine learning techniques for hydrological drought forecasting. J. Irrig. Drain. Eng. **2021**, 147, 04021022.
- Anshuka, A.; van Ogtrop, F.F.; Willem Vervoort, R. Drought forecasting through statistical models using standardised precipitation index: A systematic review and meta-regression analysis. Nat. Hazards **2019**, 97, 955–977.
- Solomatine, D.P.; Ostfeld, A. Data-driven modelling: Some past experiences and new approaches. J. Hydroinformatics **2008**, 10, 3–22.
- Abrahart, R.J.; See, L.M.; Solomatine, D.P. (Eds.) Practical Hydroinformatics: Computational Intelligence and Technological Developments in Water Applications; Springer Science & Business Media: Berlin, Germany, 2008; Volume 68.
- Achite, M.; Jehanzaib, M.; Elshaboury, N.; Kim, T.W. Evaluation of machine learning techniques for hydrological drought modeling: A case study of the Wadi Ouahrane basin in Algeria. Water **2022**, 14, 431.
- Maca, P.; Pech, P. Forecasting SPEI and SPI drought indices using the integrated artificial neural networks. Comput. Intell. Neurosci. **2016**, 2016, 14.
- Mokhtarzad, M.; Eskandari, F.; Jamshidi Vanjani, N.; Arabasadi, A. Drought forecasting by ANN, ANFIS, and SVM and comparison of the models. Environ. Earth Sci. **2017**, 76, 729.
- Sattar, M.N.; Jehanzaib, M.; Kim, J.E.; Kwon, H.H.; Kim, T.W. Application of the hidden Markov bayesian classifier and propagation concept for probabilistic assessment of meteorological and hydrological droughts in South Korea. Atmosphere **2020**, 11, 1000.
- Adnan, R.M.; Mostafa, R.R.; Islam, A.R.M.T.; Gorgij, A.D.; Kuriqi, A.; Kisi, O. Improving drought modeling using hybrid random vector functional link methods. Water **2021**, 13, 3379.
- Achite, M.; Ouillon, S. Suspended sediment transport in a semiarid watershed, Wadi Abd, Algeria (1973–1995). J. Hydrol. **2007**, 343, 187–202.
- Awange, J.L.; Mpelasoka, F.; Goncalves, R.M. When every drop counts: Analysis of droughts in Brazil for the 1901–2013 period. Sci. Total Environ. **2016**, 566, 1472–1488.
- Sain, S.R. The nature of statistical learning theory. Technometrics **1996**, 38, 409.
- Kushwaha, N.L.; Rajput, J.; Elbeltagi, A.; Elnaggar, A.Y.; Sena, D.R.; Vishwakarma, D.K.; Mani, I.; Hussein, E.E. Data intelligence model and meta-heuristic algorithms-based pan evaporation modelling in two different agro-climatic zones: A case study from Northern India. Atmosphere **2021**, 12, 1654.
- Friedman, J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. **2002**, 38, 367–378.
- Tan, Y.V.; Roy, J. Bayesian additive regression trees and the General BART model. Stat. Med. **2019**, 38, 5048–5069.
- Sparapani, R.; Logan, B.; Laud, P. MCW Biostatistics Technical Report 72 Nonparametric Failure Time: Time-to-event Machine Learning with Heteroskedastic Bayesian Additive Regression Trees and Low Information Omnibus Dirichlet Process Mixtures. 2021. Available online: https://www.mcw.edu/-/media/MCW/Departments/Biostatistics/tr72.pdf?la=en (accessed on 8 December 2022).
- Breiman, L. Bagging predictors. Mach. Learn. **1996**, 24, 123–140.
- Breiman, L. Arcing the Edge; Technical Report 486; Statistics Department, University of California at Berkeley: Berkeley, CA, USA, 1997.
- Vishwakarma, D.K.; Ali, R.; Bhat, S.A.; Elbeltagi, A.; Kushwaha, N.L.; Kumar, R.; Rajput, J.; Heddam, S.; Kuriqi, A. Pre-and post-dam river water temperature alteration prediction using advanced machine learning models. Environ. Sci. Pollut. Res. **2022**, 29, 83321–83346.
- Al-rimy, B.A.S.; Maarof, M.A.; Shaid, S.Z.M. Crypto-ransomware early detection model using novel incremental bagging with enhanced semi-random subspace selection. Future Gener. Comput. Syst. **2019**, 101, 476–491.
- Plumpton, C.O.; Kuncheva, L.I.; Oosterhof, N.N.; Johnston, S.J. Naive random subspace ensemble with linear classifiers for real-time classification of fMRI data. Pattern Recognit. **2012**, 45, 2101–2108.
- Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
- Deo, R.C.; Tiwari, M.K.; Adamowski, J.F.; Quilty, J.M. Forecasting effective drought index using a wavelet extreme learning machine (W-ELM) model. Stoch. Environ. Res. Risk Assess. **2017**, 31, 1211–1240.
- Mohamadi, S.; Sammen, S.S.; Panahi, F.; Ehteram, M.; Kisi, O.; Mosavi, A.; Ahmed, A.N.; El-Shafie, A.; Al-Ansari, N. Zoning map for drought prediction using integrated machine learning models with a nomadic people optimization algorithm. Nat. Hazards **2020**, 104, 537–579.

**Figure 7.** Plots showing the predicted and observed SPI values for 3-, 6-, 9-, and 12-month timeframes during the training and testing periods at sub-basin 1.

**Figure 9.** Plots showing the predicted and observed SPI values for 3-, 6-, 9-, and 12-month timeframes during the cross-validation periods at sub-basin 1.

**Figure 10.** Plots showing the predicted and observed SPI values for 3-, 6-, 9-, and 12-month timeframes during validation at sub-basin 2.

ID | | Name | Longitude | Latitude | Elevation (m) |
---|---|---|---|---|---|
S1 | 013306 | Oued Abtal | 0°40′33.97″ E | 35°28′03.59″ N | 354 |
S2 | 013401 | Sidi Abdelkader Djillali | 0°34′08.35″ E | 35°29′20.71″ N | 225 |

SPI Values | Drought Category | Probability (%) |
---|---|---|
Greater than or equal to 2.0 | Extremely wet | 2.3 |
Greater than or equal to 1.5 and less than 2.0 | Very wet | 4.4 |
Greater than or equal to 1.0 and less than 1.5 | Moderately wet | 9.2 |
Greater than −1.0 and less than 1.0 | Near normal | 68.2 |
Less than or equal to −1.0 and greater than −1.5 | Moderately dry | 9.2 |
Less than or equal to −1.5 and greater than −2.0 | Severely dry | 4.4 |
Less than or equal to −2.0 | Extremely dry | 2.3 |
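
The classification in the table above amounts to a simple threshold lookup, sketched below. The function name `classify_spi` is hypothetical, and the handling of values that fall exactly on a boundary (for example, assigning −1.0 to "Moderately dry") follows the conventional SPI classes and is an assumption here.

```python
def classify_spi(spi: float) -> str:
    """Map an SPI value to a wet/dry category using the tabulated thresholds."""
    if spi >= 2.0:
        return "Extremely wet"
    if spi >= 1.5:
        return "Very wet"
    if spi >= 1.0:
        return "Moderately wet"
    if spi > -1.0:
        return "Near normal"
    if spi > -1.5:
        return "Moderately dry"
    if spi > -2.0:
        return "Severely dry"
    return "Extremely dry"


# Illustrative values only (not basin data).
categories = [classify_spi(v) for v in (2.3, 0.0, -1.2, -1.75, -2.0)]
```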

Performance Indices | Formula | Range | Ideal Level | Description |
---|---|---|---|---|
Correlation coefficient | $\mathrm{CC}=\frac{\sum_{i=1}^{N}\left(\mathrm{SPI}_{Obs}-\overline{\mathrm{SPI}_{Obs}}\right)\left(\mathrm{SPI}_{Pre}-\overline{\mathrm{SPI}_{Pre}}\right)}{\sqrt{\sum_{i=1}^{N}\left(\mathrm{SPI}_{Obs}-\overline{\mathrm{SPI}_{Obs}}\right)^{2}}\sqrt{\sum_{i=1}^{N}\left(\mathrm{SPI}_{Pre}-\overline{\mathrm{SPI}_{Pre}}\right)^{2}}}$ | (−1 to +1) | +1 | Measures the strength of the linear association between observed and predicted values. |
Mean absolute error | $\mathrm{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left\lvert\mathrm{SPI}_{Pre}-\mathrm{SPI}_{Obs}\right\rvert$ | (0 to ∞) | 0 | Gives the average magnitude of the prediction errors. |
Root mean square error | $\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{SPI}_{Pre}-\mathrm{SPI}_{Obs}\right)^{2}}$ | (0 to ∞) | 0 | Indicates how far predicted values deviate from observed values, penalizing large errors more heavily. |
Relative absolute error | $\mathrm{RAE}=\frac{\sum_{i=1}^{N}\left\lvert\mathrm{SPI}_{Pre}-\mathrm{SPI}_{Obs}\right\rvert}{\sum_{i=1}^{N}\left\lvert\overline{\mathrm{SPI}_{Obs}}-\mathrm{SPI}_{Obs}\right\rvert}$ | (0 to ∞) | 0 | Expresses the total absolute error relative to that of a predictor that always outputs the observed mean. |
Root relative squared error | $\mathrm{RRSE}=\sqrt{\frac{\sum_{i=1}^{N}\left(\mathrm{SPI}_{Pre}-\mathrm{SPI}_{Obs}\right)^{2}}{\sum_{i=1}^{N}\left(\overline{\mathrm{SPI}_{Obs}}-\mathrm{SPI}_{Obs}\right)^{2}}}$ | (0 to ∞) | 0 | Square root of the relative squared error (RSE); in contrast to RMSE, it allows the comparison of models with errors expressed in various units. |
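
The five indices in the table above can be computed directly from paired observed and predicted series, as in the sketch below. The function name `evaluate` and the sample values are hypothetical. RRSE is taken here as the square root of the relative squared error, as its name implies, and RAE and RRSE are returned as fractions (multiplying by 100 gives a percentage form).

```python
import math


def evaluate(obs, pre):
    """Return CC, MAE, RMSE, RAE and RRSE for observed vs. predicted SPI values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pre = sum(pre) / n
    cov = sum((o - mean_obs) * (p - mean_pre) for o, p in zip(obs, pre))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)   # squared deviations of observations
    ss_pre = sum((p - mean_pre) ** 2 for p in pre)   # squared deviations of predictions
    abs_err = sum(abs(p - o) for o, p in zip(obs, pre))
    sq_err = sum((p - o) ** 2 for o, p in zip(obs, pre))
    abs_dev = sum(abs(mean_obs - o) for o in obs)    # error of a mean-only predictor
    return {
        "CC": cov / math.sqrt(ss_obs * ss_pre),
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,
        "RRSE": math.sqrt(sq_err / ss_obs),
    }


# Illustrative values only (not basin data).
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.5, 2.5]
metrics = evaluate(observed, predicted)
```

Because RAE and RRSE are normalized by the error of a predictor that always returns the observed mean, values below 1 (or below 100 in percentage form) indicate a model that does better than that baseline.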

Sub-Basin Name | Input Variables | Target Variable |
---|---|---|
Sub-basin 1 | SPI-3 (t-1); SPI-3 (t-2) | SPI-3 |
Sub-basin 1 | SPI-6 (t-1); SPI-6 (t-2); SPI-6 (t-5); SPI-6 (t-7) | SPI-6 |
Sub-basin 1 | SPI-9 (t-1) | SPI-9 |
Sub-basin 1 | SPI-12 (t-1) | SPI-12 |
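
The lagged inputs listed in the table above can be assembled from a single SPI series as sketched below. The helper name `make_lagged_dataset` and the sample series are hypothetical; the lags (1, 2, 5, 7) are those listed for SPI-6.

```python
def make_lagged_dataset(series, lags):
    """Build (inputs, targets): each input row holds series[t - lag] for every lag."""
    max_lag = max(lags)
    inputs, targets = [], []
    # Start at max_lag so that every requested lag is available for time step t.
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets


# Illustrative SPI-6 values only (not basin data).
spi6 = [0.1, -0.4, 0.3, 1.2, 0.8, -0.9, -1.6, -0.2, 0.5, 0.0]
X, y = make_lagged_dataset(spi6, lags=[1, 2, 5, 7])
```

The first seven values serve only as history, so a series of ten months yields three usable samples with this lag set.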

**Table 5.** Evaluation metrics of the model outputs for various timelines during the training and testing phases.

Model | Training CC | Training MAE | Training RMSE | Training RAE | Training RRSE | Testing CC | Testing MAE | Testing RMSE | Testing RAE | Testing RRSE |
---|---|---|---|---|---|---|---|---|---|---|
SPI-3 | | | | | | | | | | |
Support vector machine | 0.684 | 0.557 | 0.725 | 70.225 | 73.247 | 0.701 | 0.453 | 0.579 | 68.038 | 71.463 |
Additive regression | 0.711 | 0.544 | 0.696 | 68.638 | 70.366 | 0.677 | 0.475 | 0.596 | 71.313 | 73.552 |
Bagging | 0.796 | 0.466 | 0.606 | 58.722 | 61.195 | 0.645 | 0.478 | 0.622 | 71.721 | 76.792 |
Random subspace | 0.667 | 0.608 | 0.765 | 76.724 | 77.278 | 0.542 | 0.553 | 0.684 | 83.019 | 84.405 |
Random forest | 0.960 | 0.230 | 0.304 | 29.056 | 30.683 | 0.592 | 0.526 | 0.682 | 78.926 | 84.088 |
SPI-6 | | | | | | | | | | |
Support vector machine | 0.824 | 0.447 | 0.596 | 56.157 | 56.729 | 0.811 | 0.355 | 0.452 | 58.505 | 57.447 |
Additive regression | 0.833 | 0.442 | 0.581 | 55.454 | 55.309 | 0.770 | 0.402 | 0.492 | 66.219 | 62.530 |
Bagging | 0.864 | 0.399 | 0.529 | 50.116 | 50.396 | 0.800 | 0.360 | 0.461 | 59.361 | 58.638 |
Random subspace | 0.864 | 0.403 | 0.530 | 50.560 | 50.434 | 0.811 | 0.359 | 0.450 | 59.243 | 57.137 |
Random forest | 0.925 | 0.303 | 0.401 | 38.083 | 38.161 | 0.735 | 0.420 | 0.545 | 69.188 | 69.239 |
SPI-9 | | | | | | | | | | |
Support vector machine | 0.882 | 0.359 | 0.472 | 46.288 | 47.242 | 0.866 | 0.306 | 0.381 | 45.599 | 46.795 |
Additive regression | 0.878 | 0.371 | 0.479 | 47.823 | 47.953 | 0.822 | 0.362 | 0.440 | 53.940 | 53.993 |
Bagging | 0.909 | 0.321 | 0.415 | 41.398 | 41.605 | 0.845 | 0.339 | 0.414 | 50.576 | 50.830 |
Random subspace | 0.897 | 0.336 | 0.441 | 43.272 | 44.152 | 0.863 | 0.315 | 0.388 | 46.932 | 47.606 |
Random forest | 0.947 | 0.233 | 0.320 | 30.030 | 32.041 | 0.806 | 0.357 | 0.463 | 53.275 | 56.859 |
SPI-12 | | | | | | | | | | |
Support vector machine | 0.908 | 0.305 | 0.431 | 37.412 | 41.963 | 0.880 | 0.283 | 0.371 | 38.061 | 41.520 |
Additive regression | 0.898 | 0.341 | 0.454 | 41.772 | 44.268 | 0.823 | 0.343 | 0.453 | 46.134 | 50.702 |
Bagging | 0.927 | 0.278 | 0.386 | 34.138 | 37.596 | 0.866 | 0.305 | 0.394 | 41.092 | 44.043 |
Random subspace | 0.924 | 0.283 | 0.394 | 34.728 | 38.386 | 0.874 | 0.297 | 0.380 | 39.914 | 42.553 |
Random forest | 0.956 | 0.212 | 0.303 | 25.977 | 29.514 | 0.808 | 0.381 | 0.483 | 51.226 | 54.028 |
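
One way to read the SPI-3 rows of Table 5 is to compare each model's training and testing CC: a large drop from training to testing suggests that the model fits the training data much better than it generalizes. The short sketch below tabulates that gap using the CC values copied from the table; the variable names are hypothetical.

```python
# (training CC, testing CC) for SPI-3, copied from Table 5.
spi3_cc = {
    "Support vector machine": (0.684, 0.701),
    "Additive regression": (0.711, 0.677),
    "Bagging": (0.796, 0.645),
    "Random subspace": (0.667, 0.542),
    "Random forest": (0.960, 0.592),
}

# Training-minus-testing gap per model; larger values mean a bigger loss on unseen data.
gaps = {model: round(train - test, 3) for model, (train, test) in spi3_cc.items()}

largest_gap = max(gaps, key=gaps.get)
best_test = max(spi3_cc, key=lambda model: spi3_cc[model][1])
```

For these rows, the random forest shows the largest gap (0.960 in training against 0.592 in testing), while the support vector machine has the highest testing CC.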

Model | CC | MAE | RMSE | RAE | RRSE |
---|---|---|---|---|---|
SPI-3 | | | | | |
Support vector machine | 0.674 | 0.569 | 0.734 | 71.44 | 73.94 |
Additive regression | 0.645 | 0.600 | 0.758 | 75.3839 | 76.384 |
Bagging | 0.638 | 0.600 | 0.764 | 75.325 | 76.964 |
Random subspace | 0.586 | 0.642 | 0.811 | 80.603 | 81.653 |
Random forest | 0.615 | 0.611 | 0.797 | 76.788 | 80.255 |
SPI-6 | | | | | |
Support vector machine | 0.818 | 0.455 | 0.606 | 56.654 | 57.369 |
Additive regression | 0.765 | 0.518 | 0.683 | 64.604 | 64.637 |
Bagging | 0.799 | 0.481 | 0.634 | 59.940 | 60.016 |
Random subspace | 0.799 | 0.481 | 0.632 | 59.915 | 59.885 |
Random forest | 0.749 | 0.565 | 0.724 | 70.457 | 68.512 |
SPI-9 | | | | | |
Support vector machine | 0.877 | 0.368 | 0.480 | 46.883 | 47.479 |
Additive regression | 0.840 | 0.420 | 0.543 | 53.613 | 53.726 |
Bagging | 0.856 | 0.396 | 0.517 | 50.529 | 51.186 |
Random subspace | 0.856 | 0.395 | 0.517 | 50.355 | 51.152 |
Random forest | 0.839 | 0.434 | 0.555 | 55.339 | 54.907 |
SPI-12 | | | | | |
Support vector machine | 0.908 | 0.305 | 0.431 | 37.412 | 41.963 |
Additive regression | 0.898 | 0.341 | 0.454 | 41.772 | 44.268 |
Bagging | 0.927 | 0.278 | 0.386 | 34.138 | 37.596 |
Random subspace | 0.924 | 0.283 | 0.394 | 34.728 | 38.386 |
Random forest | 0.956 | 0.212 | 0.303 | 25.977 | 29.514 |

Timeframe | CC | MAE | RMSE | RAE | RRSE |
---|---|---|---|---|---|
SPI-3 | 0.642 | 0.575 | 0.751 | 72.927 | 77.370 |
SPI-6 | 0.811 | 0.445 | 0.580 | 58.958 | 58.548 |
SPI-9 | 0.865 | 0.379 | 0.493 | 51.304 | 50.172 |
SPI-12 | 0.885 | 0.331 | 0.466 | 42.81 | 46.639 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Achite, M.; Elshaboury, N.; Jehanzaib, M.; Vishwakarma, D.K.; Pham, Q.B.; Anh, D.T.; Abdelkader, E.M.; Elbeltagi, A.
Performance of Machine Learning Techniques for Meteorological Drought Forecasting in the Wadi Mina Basin, Algeria. *Water* **2023**, *15*, 765.
https://doi.org/10.3390/w15040765
