Flash-flood forecasting has gained importance worldwide because of the catastrophic socio-economic impacts this hazard can cause and the expected increase in its frequency. In mountain catchments, precipitation-runoff forecasts are limited by the intrinsic complexity of the processes involved, particularly the high variability of rainfall. While process-based models are hard to implement, the random forest algorithm offers a promising alternative owing to its simplicity, robustness and capacity to deal with complex data structures. Here, a step-wise methodology is proposed to derive parsimonious models that account for both the hydrological functioning of the catchment (e.g., input data, representation of antecedent moisture conditions) and random forest procedures (e.g., sensitivity analyses, dimension reduction, optimal input composition). The methodology was applied to develop short-term prediction models for lead times of 4, 8, 12, 18 and 24 h for a catchment representative of the Ecuadorian Andes. Results show that the derived parsimonious models reach validation efficiencies (Nash-Sutcliffe coefficient) ranging from 0.761 (4-h) to 0.384 (24-h) when the optimal inputs are composed only of features accounting for 80% of the model’s outcome variance. An extreme value analysis demonstrated that including precipitation information improves the prediction of extreme peak flows compared with purely autoregressive models.
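As an illustration only, the two quantities named in the abstract can be sketched in a few lines of Python: the Nash-Sutcliffe efficiency used to score the models, and an input-selection rule that keeps the highest-ranked features until a cumulative share of importance is reached. The function names, the feature labels and the reading of "80% of the model's outcome variance" as a cumulative importance threshold are assumptions made for this sketch, not the procedure implemented in the study.

```python
from typing import Dict, List, Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors of the simulation.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series is constant; the efficiency is undefined")
    return 1.0 - residual / spread


def select_by_cumulative_importance(
    importances: Dict[str, float], threshold: float = 0.80
) -> List[str]:
    """Keep the top-ranked features until their normalized importances sum to the threshold.

    Assumption: `importances` holds non-negative scores (e.g. from a fitted random
    forest); the 0.80 default mirrors the 80% figure quoted in the abstract.
    """
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected


if __name__ == "__main__":
    # Hypothetical lagged runoff (q) and precipitation (p) features with made-up scores.
    scores = {"q_lag1": 50.0, "p_lag1": 25.0, "p_lag2": 15.0, "q_lag2": 10.0}
    print(select_by_cumulative_importance(scores))
    print(nash_sutcliffe([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

In practice the importance scores would come from a fitted random forest and the efficiency would be computed on a held-out validation period; both inputs are replaced by toy values here.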
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.