Building a Generalized Pre-Training Model to Predict River Water-Level from Radar Rainfall

Ueda, Futo; Tanouchi, Hiroto; Egusa, Nobuyuki; Yoshihiro, Takuya

doi:10.3390/w17243449


by Futo Ueda ¹, Hiroto Tanouchi ², Nobuyuki Egusa ² and Takuya Yoshihiro ²,*

¹ Graduate School of Systems Engineering, Wakayama University, Wakayama 640-8510, Japan

² Faculty of Systems Engineering, Wakayama University, Wakayama 640-8510, Japan

* Author to whom correspondence should be addressed.

Water 2025, 17(24), 3449; https://doi.org/10.3390/w17243449

Submission received: 27 October 2025 / Revised: 18 November 2025 / Accepted: 28 November 2025 / Published: 5 December 2025

(This article belongs to the Special Issue Application of Big Data and Machine Learning in Hydrological Forecasting and Water Resource Management)


Abstract

In our previous work, we proposed a river water-level prediction method using deep learning, incorporating radar rainfall data in place of water-level and rainfall stations upstream of the prediction point. By introducing a newly defined flow distance matrix, transfer learning becomes available, i.e., even when data at the prediction point is scarce, accurate water-level predictions are made using inundation data from other rivers. However, this approach requires pre-selecting rivers that behave similarly to the prediction point for training, making it laborious to build prediction models for multiple rivers. Furthermore, the previous study only performed predictions for a single river, raising uncertainty about whether the method is applicable to water-level prediction for other rivers with different conditions. In this paper, we construct a generalized river water-level prediction model commonly applicable to multiple Japanese rivers by using inundation data from all Japanese Class-A rivers (the major river systems managed by the government) for pre-training, rather than only the rivers similar to the prediction site. Through evaluation, we showed that pre-training using all Class-A rivers yields higher prediction accuracy than pre-training using similar rivers across multiple rivers with varying conditions. This demonstrates that using all Class-A rivers for pre-training enables the construction of a generalized river water-level prediction model applicable to a wide range of rivers.

Keywords: water-level prediction; flood; transfer learning; CNN-LSTM; radar rainfall

1. Introduction

A large portion of Japan’s land area consists of mountainous terrain, resulting in numerous steep rivers that can experience rapid water-level rises during heavy rainfall. Therefore, swift evacuation is critically important in the event of river flooding. Furthermore, recent climate change has increased the frequency of intense short-duration rainfall events. In Japan, the number of rainfall events exceeding 50 mm per hour has increased by approximately 1.5 times [1]. As a result, the risk of flooding damage, including large-scale floods, has increased dramatically. Sudden heavy rainfall causes river water levels to rise rapidly, shortening the time until flooding occurs and increasing the likelihood of severe inundation damage. In fact, in recent years, the number of rivers exceeding the Flood Dangerous Water Levels (indicating a high risk of overflow) has been on the rise [2]. As a countermeasure against such disasters, extensive research has been conducted on predicting river water levels. Accurate prediction of water levels during flooding is essential for residents to make appropriate evacuation decisions. For instance, in Japan, issuing evacuation advisories based on flood forecasts made several hours in advance would save many lives.

Conventional water-level prediction methods have employed several physical models to simulate rainfall-runoff processes [3,4]. In recent years, deep learning-based models have been reported to achieve even higher prediction accuracy [5,6,7,8,9,10,11,12]. These studies primarily train prediction models using observed rainfall and water level data from upstream observation stations to forecast downstream water levels. However, many rivers lack the observation stations necessary for accurate forecasting. In Japan’s small and medium-sized rivers, upstream observation stations are often absent. For example, during Typhoon No. 10 in 2016, flooding with human casualties occurred in Iwate Prefecture’s Ōmoto River, which had only one water-level observation station [13]. This highlights the need for high-precision water-level forecasting even in small rivers. Reliable predictions made without upstream observation data would enable appropriate evacuation in such river basins. Some large rivers also lack sufficient water-level and rainfall observation stations, so a data source that can substitute for upstream observation station data is needed for water-level prediction.

As an alternative to rainfall data from rain measurement stations, radar precipitation data observed and published by the Japan Meteorological Agency is available [14]. Radar precipitation data is observed nationwide on a 1 km grid and is derived from radar reflectivity. Unlike point-based rain gauge data, radar precipitation data provides spatially estimated precipitation amounts at high spatial resolution, enabling the capture of precipitation distribution across entire watersheds. Therefore, it possesses the potential to enable water-level forecasting even in river basins lacking upstream rainfall or water-level observation stations.

Furthermore, many small and medium-sized rivers, and even some large rivers, lack sufficient historical flood records, making it difficult to secure the large data volumes required for deep learning-based prediction models. To address this constraint, transfer learning has been presented as an effective method [15]. Transfer learning, which reuses knowledge acquired from one task to learn another related task, is widely used across various research fields [15]. By applying transfer learning, it becomes possible to construct water-level prediction models for target locations with limited flood records by utilizing data from other rivers with similar characteristics.

For this purpose, we previously proposed a river water-level prediction method using radar rainfall data and transfer learning [16]. This method achieved high prediction accuracy by using radar rainfall data instead of upstream water levels or rainfall observations. In this method, we first select water-level stations from rivers exhibiting similar trends to the target location and perform pre-training using past flood data from these rivers. Subsequently, we fine-tune the model using flood data from the target location. A key finding of that study is that introducing the newly defined “flow distance” feature enables this transfer learning approach to function effectively; the evaluation confirmed that the feature plays a crucial role in the transfer. Furthermore, we showed that using radar rainfall data instead of upstream observation data enables water-level prediction several hours ahead with accuracy comparable to conventional models based on upstream observation data.

However, this method requires extracting similar rivers for each target river and constructing individual prediction models for each water-level observation station, complicating model development and management. Furthermore, our previous study [16] conducted performance evaluations on only a single target river, excluding elements such as dams that could potentially reduce prediction accuracy. Consequently, the prediction accuracy of the model for rivers with varying conditions, such as the presence or absence of dams and elevation differences, has not been evaluated.

In this study, instead of using only similar rivers for pre-training, we use historical flood data from all Class-A rivers nationwide to construct a river water-level prediction model applicable to all Class-A rivers throughout Japan, where Class-A rivers are the designated major rivers managed by the Japanese government. We then evaluate the prediction accuracy of this model and show that the model pre-trained on all Class-A river data can serve as a generalized water-level prediction model.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the prediction model applied in this work. Section 4 presents the evaluation results using actual river observation data. Section 5 describes the limitations of this study. Finally, Section 6 summarizes this study.

2. Previous Predictive Models

Hitokoto et al. [5] proposed a river water-level prediction model based on deep learning and demonstrated that deep learning has the potential to predict river water levels more accurately than traditional physics-based models. Their method employed a Multi-Layer Perceptron (MLP) to construct the prediction model, which used rainfall and water-level data from upstream observation stations to predict the downstream water level. Yamada et al. [6] showed that higher prediction accuracy than MLP could be achieved by using Long Short-Term Memory (LSTM), which is a recurrent neural network suited for time-series data. By utilizing LSTM, the model can learn the temporal characteristics of water level and rainfall observations.

Chen et al. [8] proposed a prediction model using a Convolutional LSTM (ConvLSTM), which combines a Convolutional Neural Network (CNN) and an LSTM to handle two-dimensional time-series data with spatial information. They utilized rainfall data from 50 observation stations in Xixian County, Henan Province, China, to predict water levels, and showed that ConvLSTM was effective for water-level forecasting. Li et al. [9] also proposed a convolutional model, a CNN-LSTM, and applied it to the Hun River in China, obtaining high prediction accuracy for river water levels.

Xie et al. [10] introduced an ensemble learning model that combines one-dimensional (1D) and two-dimensional (2D) CNNs and also demonstrated high prediction performance. As more advanced water-level prediction approaches, Alizadeh et al. [11] and Wang et al. [12] introduced models that incorporate attention mechanisms [17]. Although these models obtained high predictive performance, they relied heavily on data from numerous observation stations located upstream of the forecast location. In Japan, few rivers have many rainfall and water-level observation stations, making it difficult to apply these methods to high-precision water-level prediction.

On the other hand, several studies have proposed water-level prediction models using radar rainfall. Baek et al. [18] proposed a CNN-based prediction model that utilized radar rainfall and past water-level data at the prediction site in Korea and demonstrated good prediction accuracy. Li et al. [19] introduced a CNN-LSTM model that forecasts downstream water levels in Germany from radar rainfall and upstream observation data, also obtaining high accuracy. However, no study has compared the prediction accuracy of models using upstream observation data with that of models using only radar rainfall and past water levels at the prediction site. Therefore, it remains unclear whether radar rainfall can fully substitute for upstream gauge observations and achieve comparably high predictive performance.

Regarding river water-level prediction using transfer learning, Kimura et al. [20] developed a CNN-based prediction model and constructed a transfer learning model for predicting water levels one hour ahead using rainfall and water-level observation data from other river basins in Japan. This study demonstrated the effectiveness of transfer learning for short-term (one-hour) forecasting. However, constructing the training data requires selecting rivers exhibiting similar rainfall and water-level variation patterns to the target river. This data generation process is labor-intensive, making it difficult to apply the method to rivers with fewer observation points. Furthermore, this approach does not utilize radar rainfall data and requires upstream observation points for forecasting.

3. Method

3.1. Overview

The prediction model used in this study is built by combining CNN and LSTM. Unlike our previous study, we use the inundation data of all Japanese Class-A rivers rather than selected rivers with similar trends to the prediction point. As shown later, this pre-trained model can be commonly applied for water-level prediction at all stations on all Class-A rivers in Japan. By applying transfer learning that incorporates radar rainfall data, it is possible to predict water levels several hours in advance without relying on upstream gauge stations.

3.2. River Water-Level Prediction Model with CNN and LSTM

This study utilizes a river water-level prediction model that combines CNN and LSTM architectures. First, convolution and pooling operations are applied to radar rainfall and the flow distance matrix to compress spatial information. The compressed features are then concatenated with rainfall and water-level observations from both upstream and target-site gauging stations, and the resulting feature set is fed into the LSTM. The output of the LSTM is passed through a fully connected layer to generate the final water-level prediction. Because the water-level data used in this study are recorded at hourly intervals, the LSTM time steps are aligned to a 1 h resolution. Radar rainfall data, which are provided at 10 min intervals, are grouped by hour, and the six frames within each hour are stacked as input channels for the CNN. Thus, the CNN input comprises seven channels: six channels of radar rainfall and one channel of flow distance. After the CNN performs convolution and pooling operations on this input, the extracted features are combined with rainfall and water-level observations, which are then input to the LSTM at each time step. Finally, the output of the last LSTM step is passed through a fully connected layer, and its output is used as the predicted water level at the target site. In this study, because we consider situations with limited training data, we employ a simplified architecture where each of the CNN, LSTM, and fully connected layers consists of only a single layer.

A schematic of the proposed prediction model is shown in Figure 1. Each row in the figure corresponds to one time step of the LSTM, aligned with a 1 h interval as mentioned above. Arrows indicate the paths of information flow. As shown on the left side of each row, the CNN block consists of a convolution layer and a pooling layer. Its output is used as the spatial feature input for that time step. On the right side, the LSTM block receives the CNN output as well as the observed values from water level and rainfall gauges as input at each time step. The output from the final LSTM time step is fed into a fully connected layer, and the resulting value is used as the predicted water level for the target location k hours later, where k is a predetermined value. In short, at each time step of the LSTM, we input the flow distance matrix and the six 10 min interval radar rainfall images of the most recent hour and output the predicted water level k hours later. By repeating this process, we obtain a k-hour-ahead water-level prediction. Because the computation for each LSTM step takes less than 1 s, the prediction can operate in real time.
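The seven-channel input layout described above can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: the function name `build_cnn_inputs` and the nested-list grid representation are assumptions made for the example, and the CNN, LSTM, and fully connected layers themselves are not shown.

```python
from typing import List

Grid = List[List[float]]  # one 2-D map (rows x cols); hypothetical representation


def build_cnn_inputs(radar_frames: List[Grid], flow_distance: Grid,
                     frames_per_hour: int = 6) -> List[List[Grid]]:
    """Group 10 min radar frames by hour and append the flow distance map.

    Returns one 7-channel input per hourly LSTM time step:
    six radar rainfall channels followed by one flow distance channel.
    """
    if len(radar_frames) % frames_per_hour != 0:
        raise ValueError("radar frames must cover whole hours")
    steps = []
    for start in range(0, len(radar_frames), frames_per_hour):
        channels = radar_frames[start:start + frames_per_hour]
        # The same static flow distance map is attached at every time step.
        steps.append(channels + [flow_distance])
    return steps


# Two hours of toy 2 x 2 radar frames (frame t is filled with the value t).
frames = [[[float(t)] * 2 for _ in range(2)] for t in range(12)]
flow = [[0.0, 1.0], [1.0, 2.0]]
inputs = build_cnn_inputs(frames, flow)
```

Here `inputs` holds two time steps, each with seven channels, so the CNN sees the rainfall of the most recent hour together with the terrain-derived flow distance at every LSTM step.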

3.3. Transfer Learning Methodology

3.3.1. Flow Distance Matrix

This study uses transfer learning to predict river water levels. A key challenge in this setting is that the terrain and flow direction of the river used for pre-training may differ significantly from the target river. Consequently, the spatial distribution pattern of rainfall at the prediction site and its resulting impact on water-level elevation vary for each river. To address this issue, we introduce the “flow distance” feature, which quantifies the distance that water travels from each radar rainfall cell to the prediction location. By incorporating flow distance data into the model input, we aim to capture the temporal lag between rainfall at each location and its impact on water levels at the target location. This approach enables the model to accommodate basin-specific variations, thereby promoting transfer learning between rivers with heterogeneous topography more effectively.

The procedure for generating flow distance data is described below. We utilize a surface flow direction dataset [21], which provides high-resolution information (1 arc-second, written as 1 s below, in both latitude and longitude), indicating the direction of surface runoff on each grid cell among the eight possible neighboring directions. For each grid cell, we trace the water flow path according to its assigned direction in the dataset, accumulating the number of grid cells traversed until reaching the cell corresponding to the prediction location. This cumulative count is defined as the flow distance. If the water path does not eventually reach the prediction location, the flow distance for that cell is marked as null. After calculating the flow distance for all 1 s cells, the data are transformed to match the resolution of the radar rainfall grid (30 s latitude and 45 s longitude, corresponding to approximately 1 km by 1 km cells). For each radar rainfall cell, the average flow distance of the 1 s resolution cells contained within that area is calculated. The resulting matrix provides a spatially consistent flow distance map that can be used as input to our prediction model, in combination with the radar rainfall map.

A specific example of the flow distance data generation process is illustrated in Figure 2. This figure demonstrates how flow distance data at an approximate 1 km resolution are generated from surface flow direction data with a resolution of 1 s in both latitude and longitude. In the figure, the shaded red cells with diagonal patterns represent the water-level prediction points, and the arrows indicate the flow direction for each cell provided in the surface flow direction dataset. First, based on the surface flow direction matrix (Figure 2a), we compute the flow distance at 1 s resolution (Figure 2b). The numeric values within each cell denote the number of grid cells traversed along the flow path from that cell to the prediction location, that is, the flow distance. Next, we perform spatial averaging over 30 cells in the latitude direction and 45 cells in the longitude direction to resample the data to the approximate 1 km grid, which matches the radar rainfall dataset. This results in the flow distance map at 1 km resolution shown in Figure 2c, where each cell contains the mean flow distance of the corresponding 1 s resolution cells within that area. The resulting flow distance values provide features aligned with the radar rainfall mesh, enabling the prediction model to consider the spatial delay of rainfall impact based on a terrain-derived hydrological structure.
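The tracing and averaging steps can be sketched as follows. This is a minimal sketch under stated assumptions: the direction codes in `D8` are hypothetical (the real dataset's encoding may differ), the loop guard is added for safety, and ignoring null cells when averaging is an assumption, since the text does not say how nulls are treated during resampling.

```python
from typing import List, Optional, Tuple

# Hypothetical direction codes -> (row step, col step).
D8 = {"N": (-1, 0), "NE": (-1, 1), "E": (0, 1), "SE": (1, 1),
      "S": (1, 0), "SW": (1, -1), "W": (0, -1), "NW": (-1, -1)}


def flow_distance(direction: List[List[str]],
                  target: Tuple[int, int]) -> List[List[Optional[int]]]:
    """Count the cells traversed from every cell to the target (None if unreachable)."""
    rows, cols = len(direction), len(direction[0])
    dist: List[List[Optional[int]]] = [[None] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            r, c, steps = r0, c0, 0
            # A path longer than rows * cols steps must contain a loop.
            while steps <= rows * cols:
                if (r, c) == target:
                    dist[r0][c0] = steps
                    break
                dr, dc = D8[direction[r][c]]
                r, c = r + dr, c + dc
                if not (0 <= r < rows and 0 <= c < cols):
                    break  # the path leaves the grid without reaching the target
                steps += 1
    return dist


def block_mean(dist: List[List[Optional[int]]], block_rows: int,
               block_cols: int) -> List[List[Optional[float]]]:
    """Average the non-null distances in each block (assumption: nulls are skipped)."""
    out = []
    for r in range(0, len(dist), block_rows):
        row = []
        for c in range(0, len(dist[0]), block_cols):
            vals = [dist[i][j]
                    for i in range(r, min(r + block_rows, len(dist)))
                    for j in range(c, min(c + block_cols, len(dist[0])))
                    if dist[i][j] is not None]
            row.append(sum(vals) / len(vals) if vals else None)
        out.append(row)
    return out


directions = [["E", "E", "E", "S"],
              ["E", "E", "E", "E"]]
dist = flow_distance(directions, target=(1, 3))   # [[4, 3, 2, 1], [3, 2, 1, 0]]
coarse = block_mean(dist, block_rows=2, block_cols=2)  # [[3.0, 1.0]]
```

For the resolutions given in the text, the averaging call would use `block_rows=30` and `block_cols=45`; the toy 2 × 2 block above only keeps the example readable.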

3.3.2. Transfer Learning Method

The procedure for the transfer learning approach used in this study is as follows. First, we train the prediction model described in Section 3.2 using the pre-training dataset explained in Section 3.3.3, which includes all Class-A rivers in Japan. Next, this pre-trained model is fine-tuned using the limited amount of flood data available at the target location. Specifically, we use the pre-trained weights as initial parameters and re-train the entire model’s weights using the limited data from the target prediction location. By leveraging this two-stage training strategy, we aim to mitigate the shortage of training data specific to the prediction location. Because the prediction model used in this study is relatively simple (with only 1–2 layers for each of CNN, LSTM, and fully connected layers), we chose to update all weights during fine-tuning. To prevent overfitting and ensure stable convergence during this fine-tuning phase, a smaller learning rate is employed than in the pre-training phase, allowing for gradual adjustment of the weights. Furthermore, we posit that the differences in terrain between the pre-training rivers and the target river (such as variations in topography that affect rainfall-runoff dynamics) can be effectively accounted for by incorporating the flow distance information described earlier into the model input. This enables the model to generalize across rivers with varying hydrological and geographical characteristics.

The transfer learning framework applied in this study is illustrated in Figure 3. Figure 3a depicts the pre-training phase, while Figure 3b shows the fine-tuning phase. The prediction model comprises a CNN, an LSTM, and a fully connected layer, utilizing the same model architecture for both pre-training and fine-tuning. In the pre-training phase shown in Figure 3a, the model is trained using the pre-training dataset, which consists of flood events from all Class-A rivers in Japan. After the pre-training, we go to the fine-tuning phase, where all the weights learned in the pre-training are used as the initial weights, as shown in Figure 3b. During fine-tuning, the model is re-trained using training data from the target location. In this phase, all model weights are updated, enabling the model to adapt to the specific characteristics of the target river.
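The two-stage strategy (pre-train, then update all weights from the pre-trained initialization with a smaller learning rate) can be illustrated with a deliberately tiny stand-in. The sketch below uses a one-variable linear model and synthetic data instead of the CNN-LSTM and flood records; the data, learning rates, and epoch counts are all hypothetical and chosen only to make the mechanics visible.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, observed output)


def train(data: List[Sample], w: float, b: float,
          lr: float, epochs: int) -> Tuple[float, float]:
    """Full-batch gradient descent on the MSE of y = w * x + b; updates all weights."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def loss(data: List[Sample], w: float, b: float) -> float:
    """Mean squared error of the linear model on the given samples."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Stage 1: pre-training on plentiful synthetic "other river" data (y = 2x + 1).
pretrain_data = [(x / 10, 2 * (x / 10) + 1) for x in range(11)]
w0, b0 = train(pretrain_data, 0.0, 0.0, lr=0.1, epochs=2000)

# Stage 2: fine-tuning on scarce synthetic "target site" data (y = 2x + 1.2),
# starting from the pre-trained weights and using a smaller learning rate.
target_data = [(0.2, 1.6), (0.8, 2.8)]
w1, b1 = train(target_data, w0, b0, lr=0.01, epochs=200)
```

In this toy setting the fine-tuned weights stay close to the pre-trained ones while reducing the error on the target samples, which is the behavior the smaller fine-tuning learning rate is meant to encourage.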

3.3.3. Rivers in the Pre-Training Dataset

Our previous work [16] utilizes radar rainfall and transfer learning, but the dataset for pre-training is selected based on similarity to the target river. Specifically, for all water-level observation stations along Class-A rivers in Japan, we first extract the period during which the highest peak water level was recorded between 2006 and 2021. Then, we calculate the Pearson Correlation Coefficient of the water levels between each station and the target station using the 120 h interval of the extracted flood periods. Observation stations with a correlation coefficient of 0.75 or higher are selected as similar river stations. Here, each 120 h interval is defined as the period starting 72 h before the peak water level and ending 48 h after. However, this method requires extracting similar river stations in advance for each target location and conducting pre-training individually for each prediction location. As a result, the prediction model must be built separately for each river, making it difficult to generalize the operation to multiple rivers.
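The similarity-based station selection of the previous method can be sketched as below. This is an illustration only: the function names, the dictionary of candidate series, and the exact sample alignment of the 120 h window (72 samples before the peak, the peak itself, and 47 after) are assumptions, and a zero-variance series is treated as uncorrelated for simplicity.

```python
from math import sqrt
from typing import Dict, List, Sequence


def peak_window(levels: Sequence[float], before: int = 72,
                after: int = 48) -> List[float]:
    """Return the 120 hourly values from 72 h before the peak to 48 h after it."""
    peak = max(range(len(levels)), key=levels.__getitem__)
    start, end = peak - before, peak + after
    if start < 0 or end > len(levels):
        raise ValueError("series too short around the peak")
    return list(levels[start:end])


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant series counts as uncorrelated
    return cov / (sx * sy)


def select_similar(target: Sequence[float],
                   candidates: Dict[str, Sequence[float]],
                   threshold: float = 0.75) -> List[str]:
    """Keep stations whose peak-window correlation with the target meets the threshold."""
    target_window = peak_window(target)
    return [name for name, series in candidates.items()
            if pearson(target_window, peak_window(series)) >= threshold]


# Synthetic hourly series: a triangular flood wave peaking at hour 100.
target = [100.0 - abs(i - 100) for i in range(200)]
candidates = {
    "station_a": [2 * v + 1 for v in target],                   # same shape
    "station_b": [1.0 if i == 100 else 0.0 for i in range(200)]  # isolated spike
}
similar = select_similar(target, candidates)
```

Only `station_a` passes the 0.75 threshold here. Repeating this selection for every target station is the per-river effort that the generalized pre-training dataset is intended to remove.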

In this study, to overcome this limitation, we build a water-level prediction model that is commonly applicable to all Class-A rivers in Japan. To achieve this, we build a large dataset that includes all the inundation records in Class-A rivers in Japan from 2006 to 2021. Specifically, we extract all Class-A rivers in Japan for which a Flood Dangerous Water Level has been exceeded at least once between 2006 and 2021, based on the officially designated Flood Dangerous Water Levels [22,23]. After excluding the target sites, a total of 247 locations are identified. For each of these sites, we then extract all flood events from 2006 to 2021 during which the water level exceeded the Warning Water Level. After excluding periods with missing radar rainfall data, we obtain a total of 2180 flood periods to use as the pre-training data. This pre-training using the comprehensive dataset demonstrates that it is possible to construct a generalized river water-level prediction model that does not require reconstruction for each individual river. This enables the development of pre-trained models that unify the model construction process across diverse rivers, significantly reducing the required effort.

3.4. Evaluation Method

This study proposes utilizing all Class-A rivers in Japan as pre-training data sources and demonstrates the feasibility of constructing a generalized river water-level prediction model applicable to multiple rivers with varying conditions. Specifically, we compare the performance of the following two cases:

Without transfer learning: The CNN–LSTM model is trained solely using water-level, radar rainfall, and flow distance data from the target site, without any pre-training.
With transfer learning: The CNN–LSTM model is pre-trained using data from all Class-A rivers in Japan and then fine-tuned using water-level, radar rainfall, and flow distance data from the target location.

For evaluation, the model was trained using data up to 11 h prior to the reference time and configured to predict water levels from 1 h to 12 h after the reference time. Prediction performance was evaluated using leave-one-out cross-validation, and the evaluation metrics were the Mean Squared Error (MSE) and the Nash–Sutcliffe efficiency (NSE). The formulations of MSE and NSE are given in Equations (1) and (2).

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(Q_{o i} - Q_{s i})}^{2}, (1)

NSE = 1 - \frac{\sum_{i = 1}^{n} {(Q_{o i} - Q_{s i})}^{2}}{\sum_{i = 1}^{n} {(Q_{o i} - μ_{o})}^{2}}, (2)

where Q_{o i} and Q_{s i} denote the i-th observed and predicted water levels, respectively, n is the total number of data points, and μ_{o} is the average of the measurements.
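As a worked illustration of Equations (1) and (2), the two metrics can be computed as below. The function names and the toy water levels are assumptions made for the example; raising an error when the observations have zero variance is also an illustrative choice.

```python
from typing import Sequence


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Squared Error, Equation (1)."""
    return sum((o - s) ** 2 for o, s in zip(observed, predicted)) / len(observed)


def nse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency, Equation (2)."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, predicted))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    if denominator == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1 - numerator / denominator


observed = [1.0, 2.0, 3.0, 4.0]   # toy observed water levels
predicted = [1.0, 2.0, 3.0, 5.0]  # toy predictions
print(mse(observed, predicted))   # 0.25
print(nse(observed, predicted))   # 0.8
```

A perfect prediction gives an NSE of 1, while predicting the observed mean at every step gives an NSE of 0, so lower MSE and higher NSE both indicate better accuracy.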

3.5. Study Basin

The water-level prediction sites used in this study are listed in Table 1. These sites were selected from Japan’s Class-A rivers based on the criterion of diversity in physical characteristics such as the presence of upstream dams, reference elevation, and watershed area. Here, reference elevation denotes the height above mean sea level in Tokyo Bay. Furthermore, to obtain proper results for the fair evaluation of the generalized prediction model, these four sites were excluded from the pre-training dataset. The prediction period is defined as the duration during which water levels exceeding the designated Flood Control Standby Water Level for each observation site occurred between 2006 and 2021 [22,23]. Each period consists of 120 h, which includes the 72 h preceding and the 48 h following the peak water level observed during that period. The pre-training dataset consists of 247 observation points and 2180 flood periods, as described in Section 3.3.3.

3.6. Dataset

The water-level and rainfall data used in this study were obtained from the Hydrological and Water Quality Database [24], with a temporal resolution of 1 h. Radar rainfall data were acquired from the Japan Meteorological Business Support Center [25], covering the entire country on an approximately 1 km mesh (30 s latitude × 45 s longitude) at 10 min intervals. Surface flow direction data were obtained from the Japan Surface Flow Direction Map [21]. The spatial resolution is approximately 30 m (1 s latitude × 1 s longitude), covering the entire Japanese archipelago. These data represent the surface flow direction for each grid cell. For both the target prediction location and the pre-training locations, radar rainfall inputs are extracted over a 60 × 60 km area centered at each measurement location, using the surface flow direction data as spatial context. The surface flow direction data are converted into a flow distance matrix with a 1 km mesh resolution using the procedure described in Section 3.3.1. The flow distance matrix is then aligned spatially with the radar rainfall data so that both cover the same 60 × 60 km region with the same resolution for each of the prediction and pre-training locations.

3.7. The Detailed Prediction Model

It is well known that biased training data can negatively affect model learning. Therefore, prior to training, the input data were normalized such that all values fall within the range of [0, 1]. The normalization was performed using the formula shown in Equation (3).

x_{i}^{'} = \frac{x_{i} - \min (x)}{\max (x) - \min (x)}, (3)

where x_{i}^{'} denotes the normalized value, x_{i} is the original (pre-normalized) value, \min (x) represents the minimum value in the training data, and \max (x) represents the maximum value in the training data.

The hyperparameter settings used for training the prediction model without transfer learning are listed in Table 2, while those used for the transfer learning-based model are summarized in Table 3. The configuration of the CNN layers used in both models is shown in Table 4. During pre-training, the number of training epochs was set to 1000 to ensure sufficient convergence. For fine-tuning, AdamW [26,27] was employed with a small learning rate to perform fine adjustments of the model weights.
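The min-max normalization of Equation (3) can be sketched as follows, with the minimum and maximum taken from the training data only. The function names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Return the minimum and maximum of the training data."""
    return min(train), max(train)


def normalize(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Apply Equation (3) using the training-data minimum and maximum."""
    if hi == lo:
        raise ValueError("training data must contain at least two distinct values")
    return [(v - lo) / (hi - lo) for v in values]


train_levels = [2.0, 4.0, 6.0]            # toy training water levels
lo, hi = fit_min_max(train_levels)
scaled_train = normalize(train_levels, lo, hi)  # [0.0, 0.5, 1.0]
scaled_new = normalize([8.0], lo, hi)           # [1.5]
```

Because the scaling constants come from the training data, a held-out value above the training maximum maps to a number greater than 1, as the last line shows.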

In this study, for each prediction site, we determined the optimal parameters for the two prediction models used in evaluation by conducting a preliminary comparison of prediction accuracy under different parameter settings. Specifically, we varied the number of LSTM layers between 1 and 2, and the early stopping patience between 50 and 100 epochs. When using a two-layer LSTM, the dropout rate was set to 0.3. The combination of parameter values that achieved the highest prediction accuracy for each prediction model and site is summarized in Table 5. These parameter settings were then used for all subsequent evaluations and accuracy comparisons.

4. Results and Discussion

4.1. Prediction Accuracy

Figure 4 presents the prediction accuracy of water levels 1 to 12 h ahead at each prediction location using MSE and NSE. The horizontal axis represents the predicted time after the reference time, while the vertical axis represents prediction accuracy in MSE or NSE. The subfigures correspond to the following prediction sites: (a1,a2) Hiwatashi, (b1,b2) Takatsuno, (c1,c2) Maoroshi, and (d1,d2) Momoyama.

From Figure 4, it can be observed that the application of transfer learning improved prediction accuracy at all sites, despite differences in river conditions. Particularly at Hiwatashi and Maoroshi, where the number of training samples was relatively small, prediction accuracy improved significantly. Conversely, as shown in Figure 4(b1,b2), while transfer learning improved the MSE at Takatsuno, the magnitude of this improvement was comparatively small. This is likely because the non-transfer model had already achieved high accuracy at this site in terms of MSE, leaving limited room for improvement through transfer learning. These results demonstrate that the transfer learning approach, which utilizes flow distance as input and performs pre-training using all Class-A rivers in Japan, can consistently enhance prediction performance across rivers with diverse conditions. This indicates that constructing a generalized river water-level prediction model applicable to multiple river systems is achievable.

Figures 5–8 show hydrographs of 2 h ahead predictions during the top-three peak-water-level flood periods at each prediction site. Specifically, Figure 5 corresponds to the Hiwatashi site, Figure 6 to the Takatsuno site, Figure 7 to the Maoroshi site, and Figure 8 to the Momoyama site. In each figure, labels (a1)–(a3) display the hydrographs generated by the prediction model without transfer learning, while labels (b1)–(b3) show those produced with transfer learning. The horizontal axis indicates time (in days), and the vertical axis shows water level (in meters). From Figures 5–8, it is evident that the models without transfer learning tend to overestimate water levels during periods of rapid increase. In contrast, the transfer learning-based models successfully mitigate this effect, yielding more accurate water-level predictions. Moreover, as shown in Figure 7, even in cases where the non-transfer model underestimates the water level, the transfer learning approach corrects this discrepancy, resulting in improved accuracy. In addition to peak periods, the hydrographs indicate that low-water-level periods are also predicted with higher accuracy when using transfer learning. This demonstrates the model’s robustness not only during floods but also under moderate conditions, thereby contributing to its overall predictive reliability.

These results collectively suggest that transfer learning with pre-training on all Class-A rivers in Japan effectively reduces prediction errors across multiple rivers with differing hydrological and topographical characteristics. The flow distance-based transfer learning approach performed well for a variety of Japanese Class-A rivers, which also indicates that radar rainfall data can substitute for upstream observation station data when predicting water levels several hours ahead.

4.2. Effect of Data Reduction

To further evaluate the effectiveness of the transfer learning approach, we conducted experiments in which the number of training periods at each prediction site was intentionally reduced, and the resulting prediction accuracy was compared. Specifically, from the thirteen flood periods used in the original evaluation, we extracted the top-four and top-seven periods based on peak water levels and performed leave-one-out cross-validation on these subsets. In other words, prediction models were trained using three, six, and twelve periods of training data, and their accuracy was compared. To ensure a consistent basis for comparison, predictions were evaluated using the same top-four flood periods across all cases.
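
The data-reduction protocol above can be sketched as follows. This is one reading of the procedure under stated assumptions, not the authors' implementation: the helper names, the period identifiers, and the peak values are hypothetical, and the held-out periods are assumed to be the top-four periods for every subset size.

```python
def top_k_periods(peak_levels, k):
    """Return the ids of the k flood periods with the highest peak water level."""
    ranked = sorted(peak_levels, key=peak_levels.get, reverse=True)
    return ranked[:k]


def reduced_data_splits(peak_levels, subset_size, n_eval=4):
    """Build (training periods, held-out period) pairs for one subset size.

    The held-out periods are always the top n_eval periods, so every
    subset size is scored on the same floods.
    """
    subset = top_k_periods(peak_levels, subset_size)
    eval_periods = top_k_periods(peak_levels, n_eval)
    return [([p for p in subset if p != held_out], held_out)
            for held_out in eval_periods]


# Hypothetical peak water levels (m) for thirteen flood periods.
peak_levels = {f"period_{i:02d}": 2.0 + 0.3 * i for i in range(1, 14)}

for subset_size in (4, 7, 13):
    splits = reduced_data_splits(peak_levels, subset_size)
    train, held_out = splits[0]
    print(subset_size, "periods ->", len(train),
          "for training; first held-out:", held_out)
```

With subsets of four, seven, and thirteen periods, each split leaves three, six, and twelve periods for training, matching the three cases compared in this section.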

Figure 9 shows the prediction results for the four evaluation sites: (a1,a2) correspond to Hiwatashi, (b1,b2) to Takatsuno, (c1,c2) to Maoroshi, and (d1,d2) to Momoyama. At all sites, the accuracy of the prediction models without transfer learning (Figure 9(a1–d1)) degraded significantly as the number of training periods decreased. In contrast, the models with transfer learning (Figure 9(a2–d2)) maintained stable prediction accuracy, even when trained with only a small number of periods. These results indicate that, even when the amount of available water-level data at the target site is limited, the transfer learning approach pre-trained on all Class-A rivers in Japan can provide reliable and accurate water-level predictions across multiple rivers with differing conditions.

Next, we compare the predictive accuracy of the models using hydrographs to visualize the effect of transfer learning when training data are limited. Figure 10 shows the hydrographs at (a) Hiwatashi, (b) Takatsuno, (c) Maoroshi, and (d) Momoyama. For each site, panels (1) and (2) compare the 1 h ahead predictions for the fifth-highest peak period using six training periods, and panels (3) and (4) show the results for the fourth-highest peak period using three training periods. In all hydrographs, the horizontal axis indicates time (in days), and the vertical axis shows water level (in meters). In all panels, the model without transfer learning tends to overpredict water levels relative to the observed data. In contrast, the transfer learning-based model corrects this overestimation tendency, resulting in improved prediction accuracy. These results further indicate that, even with limited training data at the target site, transfer learning with pre-training on all Class-A rivers in Japan can correct small errors and upward bias, thereby enhancing predictive performance across rivers with varying conditions.

4.3. Effect of Pre-Training Dataset

We evaluate the impact of changing the pre-training dataset on model performance. Specifically, at the Hiwatashi site, we compare the prediction accuracy of three cases (i.e., models) shown in Table 6. The hyperparameter settings for the prediction model pre-trained on rivers similar to the target site are listed in Table 7, while those for the other models are given in Table 2 and Table 3.

Figure 11 presents the prediction accuracy of the three models described above. The model pre-trained on all Class-A rivers outperformed the model pre-trained only on rivers similar to the target site. This improvement is likely because the larger pre-training set includes rivers with similar flood behavior that the correlation-based similarity criterion adopted in [16] fails to identify. These findings suggest that it is possible to construct a generalized river water-level prediction model that does not require manually extracting similar rivers for each target site. Pre-training on water-level measurement data from all Class-A rivers in Japan thus enables consistently high accuracy across multiple river systems.
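
To make concrete the manual selection step that pre-training on all Class-A rivers avoids, the sketch below ranks candidate rivers by a correlation coefficient. It rests on assumptions: the exact similarity criterion of [16] is not restated in this section, so Pearson correlation between time-aligned water-level series is used as a stand-in, and the function name, river names, and values are hypothetical.

```python
import numpy as np


def rank_by_correlation(target_series, candidate_series):
    """Rank candidate rivers by Pearson correlation with the target series."""
    target = np.asarray(target_series, dtype=float)
    scores = {}
    for name, series in candidate_series.items():
        series = np.asarray(series, dtype=float)
        scores[name] = float(np.corrcoef(target, series)[0, 1])
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic, time-aligned water levels (m), invented for illustration only.
target = [1.0, 2.0, 4.0, 3.0, 2.0, 1.0]
candidates = {
    "river_a": [1.1, 2.2, 4.1, 3.2, 2.1, 1.0],  # same timing as the target
    "river_b": [3.0, 3.0, 2.0, 1.0, 2.0, 3.0],  # opposite behavior
    "river_c": [1.0, 1.0, 2.0, 4.0, 3.0, 2.0],  # same flood shape, delayed
}

for name, score in rank_by_correlation(target, candidates):
    print(name, round(score, 3))
```

In this example river_c has the same flood shape as the target but a delayed peak, so its correlation is only moderate. A fixed threshold could exclude such a river even though its flood behavior is informative, which is consistent with the explanation given above.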

5. Limitations

Dam: The evaluation results demonstrated that the predictive model is applicable to various river characteristics, and in particular that it achieves favorable prediction performance in river systems with dams, such as those at Maoroshi and Momoyama. The successful prediction can be explained by the typical dam operation methods prescribed by law. This operation involves reducing reservoir storage before anticipated heavy rainfall and stabilizing discharge adjustments during rainfall to suppress flow peaks. While the model successfully predicted water levels under this operation, a limitation of this study is that prediction accuracy may decrease under different dam operation regimes.

Small Rivers: The ultimate goal of our research is to achieve water-level prediction for small rivers. However, this paper demonstrates prediction only for Class-A rivers, using a generic model trained on Class-A river data. The runoff characteristics of small rivers could be fundamentally different. Because flood data for small rivers is either insufficient or of poor quality, one of the next challenges is to predict water levels in small rivers using flood data from large rivers. This remains another limitation of this study.

Radar Data Quality: Radar precipitation data inherently contains a certain degree of error, as it estimates rainfall from radar reflections. Furthermore, the data is provided on a coarse 1 km mesh, with a temporal resolution of 10 min. Another limitation of this paper is that we could not evaluate how the quality of the radar data affects prediction. As the government has provided 250 m mesh data since 2019, it may become possible to examine the effect of mesh resolution in the near future.
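
One way such a resolution comparison could be prepared is to aggregate the finer mesh to the coarser one so that both inputs cover the same cells. The sketch below is a hypothetical preprocessing step, not part of this study: it assumes the 250 m grid aligns exactly with the 1 km grid (a factor of 4 in each direction) and uses a simple block mean on synthetic values.

```python
import numpy as np


def coarsen_mesh(rain, factor):
    """Average non-overlapping factor x factor blocks of a 2D rainfall grid."""
    rain = np.asarray(rain, dtype=float)
    rows, cols = rain.shape
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of the factor")
    # Split each axis into (block index, position within block), then average
    # over the two within-block axes.
    blocks = rain.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


# Synthetic 8 x 8 grid standing in for a 2 km x 2 km area at 250 m resolution.
fine = np.arange(64, dtype=float).reshape(8, 8)
coarse = coarsen_mesh(fine, factor=4)  # 2 x 2 grid at 1 km resolution
print(coarse)
```

A block mean preserves the area-average rainfall of each 1 km cell; other aggregation rules, such as a block maximum, would emphasize localized heavy rain instead.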

6. Conclusions

In this study, we developed a generalized river water level prediction model using transfer learning that incorporates radar rainfall and flow distance as input features. The model was pre-trained on all Class-A rivers in Japan, rather than selecting only rivers similar to the target site. By applying the pre-trained model to multiple rivers with varying characteristics, we aimed to build a generalizable water-level prediction model that could be applied across diverse river basins.

For evaluation, we conducted hours-ahead predictions using historical water level records and examined the effects of varying the pre-training dataset across multiple rivers under different conditions. The results demonstrated that pre-training on all Class-A rivers in Japan consistently improved prediction accuracy across diverse target sites, even more so than when pre-training was performed only on similar rivers. Furthermore, we showed that the transfer learning approach was effective even when only a small number of training periods were available at the prediction site. The model was able to correct prediction errors and achieve accuracy comparable to cases with larger training datasets.

These findings suggest that a river water-level prediction model pre-trained on all Class-A rivers in Japan has strong potential to serve as a generalized prediction model that can be applied effectively to multiple rivers with different hydrological and topographical conditions. In summary, our results support the feasibility of constructing a generalized water-level prediction model applicable to a wide range of rivers by pre-training on a large number of rivers with diverse flow regimes.

Future research tasks could include incorporating various datasets, such as land-use data and soil characteristic data, which significantly influence runoff behavior. Another potential future research task is attempting to predict water levels for small rivers.

Author Contributions

Methodology, F.U., H.T. and T.Y.; Software, F.U.; Validation, T.Y.; Resources, H.T.; Data Curation, F.U. and H.T.; Writing—Original Draft, F.U. and T.Y.; Writing—Review and Editing, N.E. and T.Y.; Supervision, N.E. and T.Y.; Project Administration, T.Y.; Funding Acquisition, T.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Takahashi Industrial and Economic Research Foundation.

Data Availability Statement

We do not have the right to share the data publicly. As described in Section 3.6, all data used in our study can be obtained from http://www1.river.go.jp/, http://www.jmbsc.or.jp/jp/, and https://hydro.iis.u-tokyo.ac.jp/~yamadai/JapanDir/ (all accessed on 15 March 2025), either free of charge or for a fee.

Conflicts of Interest

The authors declare no conflict of interest. The authors declare that this study received funding from Takahashi Industrial and Economic Research Foundation. The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.

References

1. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). Overview of River Projects 2023. 2023. Available online: https://www.mlit.go.jp/river/pamphlet_jirei/kasen/gaiyou/panf/pdf/2023/kasengaiyou2023_all.pdf (accessed on 15 March 2025).
2. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). Measures Against Water-Related Disasters in Light of Climate Change. Available online: https://www.mlit.go.jp/river/kokusai/pdf/hurricane/pdf11.pdf (accessed on 15 March 2025).
3. Tachikawa, Y.; Sayama, T.; Takara, K.; Matsuura, H.; Yamazaki, T.; Yamaji, A.; Michihiro, Y. Development of a real-time runoff prediction system using a large-scale distributed hydrological model and its application to the Yodo River basin. J. JSNDS 2007, 26, 189–201. (In Japanese)
4. Nakamura, Y.; Koike, T.; Abe, S.; Nakamura, K.; Sayama, T.; Ikeuchi, K. Development of a river water level prediction technique using the RRI model with a particle filter. J. JSCE Ser. B1 (Hydraul. Eng.) 2018, 74, I_1381–I_1386. (In Japanese)
5. Hitokoto, M.; Sakuraba, M.; Kiyoshi, Y. Development of a river water level prediction method using deep learning. J. JSCE Ser. B1 (Hydraul. Eng.) 2016, 72, I_187–I_192. (In Japanese)
6. Yamada, K.; Kobayashi, Y.; Nakatsugawa, M.; Kishigami, J. Water level prediction of the 2016 Tokoro River flood event using a recurrent neural network. J. JSCE Ser. B1 (Hydraul. Eng.) 2018, 74, I_1369–I_1374. (In Japanese)
7. Araki, K.; Hakoishi, K.; Hitokoto, M.; Shimamoto, T.; Fusamae, K. River water level prediction using radar rainfall and convolutional neural networks. Proc. River Technol. Pap. 2019, 25, 297–302. (In Japanese)
8. Chen, C.; Jiang, J.; Liao, Z.; Zhou, Y.; Wang, H.; Pei, Q. A short-term flood prediction based on spatial deep learning network: A case study for Xi County, China. J. Hydrol. 2022, 607, 127535.
9. Li, X.; Xu, W.; Ren, M.; Jiang, Y.; Fu, G. Hybrid CNN-LSTM models for river flow prediction. Water Supply 2022, 22, 4902–4919.
10. Xie, Y.; Sun, W.; Ren, M.; Chen, S.; Huang, Z.; Pan, X. Stacking ensemble learning models for daily runoff prediction using 1D and 2D CNNs. Expert Syst. Appl. 2023, 217, 119469.
11. Alizadeh, B.; Bafti, A.G.; Kamangir, H.; Zhang, Y.; Wright, D.B.; Franz, K.J. A novel attention-based LSTM cell post-processor coupled with Bayesian optimization for streamflow prediction. J. Hydrol. 2021, 601, 126526.
12. Wang, Y.; Huang, Y.; Xiao, M.; Zhou, S.; Xiong, B.; Jin, Z. Medium-long-term prediction of water level based on an improved spatio-temporal attention mechanism for long short-term memory networks. J. Hydrol. 2023, 618, 129163.
13. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). River Information Policy of the MLIT. Available online: https://www.river.or.jp/kougi2021_3.pdf (accessed on 15 March 2025).
14. Japan Meteorological Agency (JMA). Weather Radar: Principles and Explanation. Available online: https://www.jma.go.jp/jma/kishou/know/radar/kaisetsu.html (accessed on 15 March 2025).
15. Weiss, K.; Khoshgoftaar, T.M.; Wang, D.A. A survey of transfer learning. J. Big Data 2016, 3, 1–40.
16. Ueda, F.; Tanouchi, H.; Egusa, N.; Yoshihiro, T. A Transfer Learning Approach Based on Radar Rainfall for River Water-Level Prediction. Water 2024, 16, 607.
17. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762.
18. Baek, S.; Pyo, J.; Jong, A.C. Prediction of water level and water quality using a CNN-LSTM combined deep learning approach. Water 2020, 12, 3399.
19. Li, P.; Zhang, J.; Krebs, P. Prediction of flow based on a CNN-LSTM combined deep learning approach. Water 2022, 14, 993.
20. Kimura, N.; Yoshinaga, I.; Sekijima, K.; Azechi, I.; Baba, D. Convolutional neural network coupled with a transfer-learning approach for time-series flood predictions. Water 2020, 12, 96.
21. Yamada, T. Surface Flow Direction Map of Japan. Available online: https://hydro.iis.u-tokyo.ac.jp/~yamadai/JapanDir/ (accessed on 15 March 2025).
22. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). List of Flood Prevention Water Levels for Nationally Managed Rivers. Available online: https://www.mlit.go.jp/river/toukei_chousa/kasen_db/pdf/2025/12-1-8.pdf (accessed on 15 March 2025).
23. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). Implementation Guidelines for the Review of Disaster Information Systems Related to Floods and Water Management. Available online: https://www.mlit.go.jp/river/shishin_guideline/gijutsu/saigai/tisiki/disaster_info-system/ (accessed on 15 March 2025).
24. Ministry of Land, Infrastructure, Transport and Tourism (MLIT). Hydrological and Water Quality Database. Available online: http://www1.river.go.jp/ (accessed on 15 March 2025).
25. Japan Meteorological Business Support Center (JMBSC). Available online: http://www.jmbsc.or.jp/jp/ (accessed on 15 March 2025).
26. Loshchilov, I.; Hutter, F. Fixing weight decay regularization in Adam. In Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
27. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
28. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
29. Prechelt, L. Early stopping - But when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2012; pp. 53–67.
30. Nair, V.; Hinton, G.E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML), Haifa, Israel, 21–24 June 2010; pp. 807–814.

Figure 1. River water-level prediction model with CNN and LSTM.

Figure 2. Generation of flow distance matrix from surface flow direction.

Figure 3. Transfer learning framework. (a) Pre-training using data from multiple rivers. (b) Fine-tuning using data from the target river. All model weights are updated during both phases.

Figure 4. Prediction performance at each place.

Figure 5. Hydrographs of top-3 water-level periods at Hiwatashi for predicting 2 h later. (a1) Highest period w/o transfer learning, (a2) 2nd highest period w/o transfer learning, (a3) 3rd highest period w/o transfer learning, (b1) highest period w/ transfer learning, (b2) 2nd highest period w/ transfer learning, and (b3) 3rd highest period w/ transfer learning.

Figure 6. Hydrographs of top-3 water-level periods at Takatsuno for predicting 2 h later. (a1) Highest period w/o transfer learning, (a2) 2nd highest period w/o transfer learning, (a3) 3rd highest period w/o transfer learning, (b1) highest period w/ transfer learning, (b2) 2nd highest period w/ transfer learning, and (b3) 3rd highest period w/ transfer learning.

Figure 7. Hydrographs of top-3 water-level periods at Maoroshi for predicting 2 h later. (a1) Highest period w/o transfer learning, (a2) 2nd highest period w/o transfer learning, (a3) 3rd highest period w/o transfer learning, (b1) highest period w/ transfer learning, (b2) 2nd highest period w/ transfer learning, and (b3) 3rd highest period w/ transfer learning.

Figure 8. Hydrographs of top-3 water-level periods at Momoyama for predicting 2 h later. (a1) Highest period w/o transfer learning, (a2) 2nd highest period w/o transfer learning, (a3) 3rd highest period w/o transfer learning, (b1) highest period w/ transfer learning, (b2) 2nd highest period w/ transfer learning, and (b3) 3rd highest period w/ transfer learning.

Figure 9. Performance at each place with varying data amounts.

Figure 10. Hydrographs at each site with varying amounts of training data. (a1) Hiwatashi 6 periods w/o TL. (a2) Hiwatashi 6 periods w/ TL. (a3) Hiwatashi 3 periods w/o TL. (a4) Hiwatashi 3 periods w/ TL. (b1) Takatsuno 6 periods w/o TL. (b2) Takatsuno 6 periods w/ TL. (b3) Takatsuno 3 periods w/o TL. (b4) Takatsuno 3 periods w/ TL. (c1) Maoroshi 6 periods w/o TL. (c2) Maoroshi 6 periods w/ TL. (c3) Maoroshi 3 periods w/o TL. (c4) Maoroshi 3 periods w/ TL. (d1) Momoyama 6 periods w/o TL. (d2) Momoyama 6 periods w/ TL. (d3) Momoyama 3 periods w/o TL. (d4) Momoyama 3 periods w/ TL.

Figure 11. Performance with different pre-training datasets.

Table 1. Prediction places.

Location	River Name	Prefecture	Dam	Zero-Point Height (m)	Drainage Area (km²)	Number of Periods
Hiwatashi	Oyodo River	Miyazaki	No	118	2230	13
Takatsuno	Takatsu River	Shimane	No	0.22	1080	23
Maoroshi	Agano River	Niigata	Yes	0	771	13
Momoyama	Kiso River	Hyogo	Yes	637.43	9100	31

Table 2. Parameter values for prediction model w/o transfer learning.

Items	Values
Learning Rate	Adam [28] (Initial: 0.0001)
Learning Times	Early Stopping [29] (50 or 100 times)
Loss Function	MSE (Mean Squared Error)
Batch Size	50
LSTM Units	500

Table 3. Parameter values for prediction model w/ transfer learning.

Item	Values
Learning Rate	Pre-Training: Adam [28] (Initial: 0.001)
Learning Rate	Fine-Tuning: AdamW (Initial: 0.0001)
Learning Times	Pre-Training: 1000 times
Learning Times	Fine-Tuning: Early Stopping (50 or 100 times)
Loss Function	MSE (Mean Squared Error)
Batch Size	Pre-Training: 1024
Batch Size	Fine-Tuning: 50
LSTM Units	500

Table 4. Parameter values for CNN model.

Layer	Parameter	Value
Convolutional Layer	Kernel Size	3 × 3
Convolutional Layer	Number of Filters	7
Convolutional Layer	Stride	2
Convolutional Layer	Activation Function	ReLU [30]
Pooling Layer	Type	Max Pooling
Pooling Layer	Window Size	2 × 2
Dropout	Rate	0.9
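
The effect of the kernel size, stride, and pooling window in Table 4 on the spatial size of the feature maps can be checked with simple arithmetic. The sketch below is illustrative only and rests on assumptions that Table 4 does not state: no padding, a pooling stride equal to the pooling window, and a hypothetical 64 × 64 input grid.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Output length along one axis of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_block_output(size, kernel=3, stride=2, pool=2):
    """Sizes after the convolution and after max pooling (pool stride = window)."""
    after_conv = conv_output_size(size, kernel, stride)
    after_pool = conv_output_size(after_conv, pool, pool)
    return after_conv, after_pool


# Hypothetical 64 x 64 input grid; the actual input size is not given in Table 4.
after_conv, after_pool = cnn_block_output(64)
print(after_conv, after_pool)  # 31 15
```

Under these assumptions, seven filters would turn a 64 × 64 input into seven 15 × 15 feature maps before they are passed on to the LSTM.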

Table 5. The best parameter values for each prediction place.

Prediction Models	Hiwatashi	Maoroshi	Takatsuno	Momoyama
w/o Transfer Learning	2 Layers	2 Layers	2 Layers	2 Layers
w/o Transfer Learning	ES: 50	ES: 100	ES: 50	ES: 50
w/ Transfer Learning	1 Layer	1 Layer	1 Layer	1 Layer
w/ Transfer Learning	ES: 100	ES: 100	ES: 100	ES: 100
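
The "ES" values in Table 5 can be read as an early-stopping patience. The sketch below shows one common patience-based rule and is not the authors' training loop: it assumes that "ES: 50" means training stops once the monitored loss has not improved for 50 consecutive epochs, and that the monitored quantity is a validation loss. The function name and loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops at the first epoch where the loss has failed to improve
    on the best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # The loss history ended before the patience was exhausted.
    return len(val_losses) - 1, best_epoch


# Synthetic validation losses, invented for illustration only.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(early_stopping_epoch(losses, patience=3))   # (5, 2)
print(early_stopping_epoch(losses, patience=10))  # (6, 2)
```

A larger patience, such as the value of 100 selected for the transfer learning models, lets training continue through longer plateaus before stopping, and the weights from the best epoch would typically be the ones kept.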

Table 6. Three compared models.

ID	Labels	Description
A	w/o transfer learning (TL)	Prediction without the use of any pre-training.
B	TL w/ similar rivers	Prediction using transfer learning, where pre-training is conducted on rivers similar to the target site, as we performed in [16].
C	TL w/ all Class-A rivers	Prediction using transfer learning, where pre-training is conducted on all Class-A rivers in Japan.

Table 7. Parameter values for training with similar rivers.

Items	Values
Learning Rate	Pre-Training: Adam [28] (Initial: 0.0001)
Learning Rate	Fine-Tuning: AdamW (Initial: 0.00001)
Learning Times	Pre-Training: 2000 times
Learning Times	Fine-Tuning: Early Stopping (50 times)
Loss Function	MSE (Mean Squared Error)
Batch Size	Pre-Training: 90
Batch Size	Fine-Tuning: 50

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

