1. Introduction
Road traffic is an essential part of the transport system, and despite all efforts to move towards more sustainable solutions, the number of private and commercial vehicles continues to increase. In Germany, for example, a record high of 580 cars per 1000 inhabitants was reached in 2022 [1]. Enhanced efforts to build up an intelligent transport system (ITS) could help handle the increasing demands on the road network [2]. An ITS with interconnected infrastructure, vehicles and users can be useful to manage and control traffic, helping to direct traffic flow and prevent congestion and road accidents. Furthermore, sensor networks formed by connected vehicles can deliver data to understand the factors influencing driving behavior and help in making informed decisions regarding traffic management. One particularly important factor is weather, which can significantly influence how drivers behave on the road.
Precipitation, including different forms like rain, hail, sleet or snow, is one of the most influential weather phenomena affecting road traffic, consistently leading to reductions in driving speed. Stamos et al. [3] summarized the effects of rainfall on driving speed in a review of 24 studies. The majority of these studies showed that rainfall leads to a reduction of driving speed, with the reduction being larger at higher rainfall intensities. However, the size of the reduction varies widely. While most studies indicate reductions of between 0 and 10%, others show reductions of up to 35% in the case of extreme rainfall [4]. A logarithmic regression function was found to be most suitable to describe the relationship between rainfall and driving speed [5].
Previous research has often focused on studying rainfall effects on driving speed at selected locations or road sections, where long-term station-based speed measurements are available [6,7,8,9,10,11]. A common station-based approach to measure driving speed is using inductive loops [12]. This allows assessment of the local driving characteristics for all vehicles passing a certain location. The identification and tracking of vehicles by camera-based traffic information systems allows for computation of average driving speeds for specific road sections [13]. Such long-term measurements allow for robust statistics for the selected road sections. However, it often remains unclear to what extent the results are transferable to other road sections with different characteristics.
In recent years, novel data sources have become available, which open up new opportunities for analyses of driving speed. Navigation systems are now available in many vehicles, either as on-board devices or via smartphone apps. High-resolution information about the location of vehicles, obtained through global navigation satellite systems (GNSS), can be converted into driving speeds. This makes it possible to obtain speed information independently of station-based observations. Such information, also referred to as floating car data, has been successfully used to study the effect of rainfall on driving speeds [3,14,15]. Although the driving speeds estimated from floating car data are based on only a subsample of all vehicles, they allow for an assessment of driving speeds on various roads throughout the study area.
Linking rainfall to speed data at a high spatial and temporal resolution within a large area requires appropriate rainfall measurements. Many studies use rain gauge measurements [10,16,17]. Rain gauges measure the amount of rainfall at a specific location with a high accuracy, but they are sparsely distributed in space, and measurements may not be representative for a larger area. In particular, in the case of heavy rainfall, the spatial distribution of rainfall is very heterogeneous, and extreme rainfall is often not captured by rain gauges [18]. Therefore, the traffic observation site and the weather station need to be close enough to each other to ensure a representative measurement of rainfall [8]. Other studies use qualitative information on road surface conditions, indicating dry or wet conditions [6], or infer rainfall from wiper settings [19]. Such information is not always readily available and provides only rough estimates of rainfall intensity.
Precipitation radar, on the other hand, has the advantage of large coverage with high spatial resolution. Only a few studies have used radar data to estimate the effect of rainfall on driving speed [13,20,21,22]. While radar data has the benefit of a large spatial coverage, it estimates rainfall amounts less accurately than gauge data. Some studies use data products based on an atmospheric model with assimilated radar information to assess the impact of rainfall on driving speeds [23,24]. However, such data is often only available at hourly resolution, limiting its suitability for short-term traffic analyses.
For Germany, a calibrated precipitation dataset is available that combines radar and station observations to provide precipitation estimates at a spatial resolution of 1 km and a temporal resolution of 5 min [25]. This dataset offers a unique opportunity to investigate rainfall–speed relationships at high resolution across a large and heterogeneous road network. However, it should be noted that this dataset does not allow for a direct distinction between different types of precipitation.
The aim of this paper is to combine GNSS-based floating car data with such calibrated radar-based measurements for three individual summer days with heavy rainfall to estimate the effect of rainfall on driving speed for the area of Germany. Using regression models allows us to compare the functional relationship between rainfall and driving speed for road sections with different characteristics like speed limits or the number of lanes. Cross-validation is employed to assess predictive performance and to compare models using categorical versus continuous representations of rainfall. This study seeks to provide a first step toward country-scale, rainfall-aware speed modeling that can inform future developments in traffic prediction and management. However, in the presented form, the analysis should be interpreted strictly as a descriptive, event-specific study. To support generalizable inference across time, seasons, or broader traffic conditions, larger multi-event datasets and the consideration of effects like weekday structure and diurnal demand patterns are required.
2. Materials and Methods
2.1. Driving Speed
For the analysis, speed data based on GNSS probe data was provided by HERE Europe B.V. The data contains the 5 min average driving speed v for about 1.5 million road sections in Germany (Figure 1). The average speed is computed based on all vehicles contributing a measurement within a 5 min interval. The road sections in the dataset include primary, secondary and tertiary roads. Different road section characteristics are available, including the local speed limit, free-flow velocity, number of lanes, and an indicator for an urban road environment. Speed data is analyzed for three days with strong rainfall (Mon 20 May, Mon 3 June and Wed 12 June 2019). On each of these days, there were also periods and regions without rainfall.
2.2. Radar Data
The radar-based precipitation product RADKLIM (Radarbasierte Niederschlagsklimatologie) [25,26] is used to assign rainfall amounts to the driving speed observations of each road section. RADKLIM provides 5 min precipitation sums on a grid with a spatial resolution of 1 km for the area of Germany. RADKLIM combines radar reflectivities, measured by the 16 C-band Doppler radars of the German weather radar network, and ground-based gauge measurements. Since the exact amount of precipitation on the ground cannot be directly inferred from radar reflectivity, observations from rain gauges are used to calibrate the amounts of precipitation estimated from radar reflectivity. Furthermore, a statistical clutter filtering is applied, and shadowing effects are corrected. The RADKLIM dataset thus combines the benefits of high spatial resolution of the radar network and the accuracy of gauge-based measurements.
The RADKLIM product does not distinguish between different types of precipitation. However, the analyzed events occurred during the summer season and are therefore dominated by rainfall. Therefore, we use the term rainfall for the following analyses, although it should be noted that the radar-based precipitation estimates may, in a limited area, also include contributions from other precipitation types such as hail.
2.3. Data Preparation
For the analysis, we distinguish three different road section characteristics: the speed limit (50, 70, 100 and 130 km/h), the lane category (single- and multi-lane roads), and the road environment (urban and non-urban). Of these road sections, we exclude those containing tunnels, ramps and intersections. Additionally, road sections with signs of persistent traffic congestion are excluded. Since no direct measurements of traffic congestion are available, we classify a road section as congested if, in more than 10% of the observed time steps, the driving speed falls below 50% of the local speed limit (see Appendix A for a sensitivity analysis of these thresholds). Furthermore, some road sections show unusually high observed speeds (more than twice the local speed limit). These sections are also excluded, because the speed limit information may not be valid.
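The exclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing code used for the study: the function names are hypothetical, and it is assumed here that a single observation above twice the speed limit is enough to flag a section, which the text does not specify.

```python
from typing import Sequence


def is_congested(speeds: Sequence[float], speed_limit: float,
                 speed_frac: float = 0.5, time_frac: float = 0.1) -> bool:
    """True if the speed is below speed_frac * limit in more than time_frac of the time steps."""
    if not speeds:
        return False
    slow_steps = sum(1 for v in speeds if v < speed_frac * speed_limit)
    return slow_steps / len(speeds) > time_frac


def has_implausible_speed(speeds: Sequence[float], speed_limit: float,
                          factor: float = 2.0) -> bool:
    """True if any observed speed exceeds factor * limit (assumption: one observation suffices)."""
    return any(v > factor * speed_limit for v in speeds)


def keep_section(speeds: Sequence[float], speed_limit: float) -> bool:
    """Keep a road section only if it passes both filters."""
    return not is_congested(speeds, speed_limit) and not has_implausible_speed(speeds, speed_limit)


# Synthetic 5 min speeds (km/h) on a section with a 50 km/h limit.
free_flowing = [48, 50, 45, 20, 47, 49, 51, 50, 46, 44]   # 1 of 10 steps below 25 km/h
congested = [20, 22, 48, 50, 45, 47, 49, 51, 50, 46]      # 2 of 10 steps below 25 km/h
print(keep_section(free_flowing, 50), keep_section(congested, 50))
```

The two default thresholds correspond to the 50% and 10% values stated in the text, so a sensitivity analysis amounts to re-running the filter with different `speed_frac` and `time_frac` values.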
With an average length of around 100 m, the large majority of road sections is significantly shorter than the 1 km grid size of the RADKLIM data. Each road section is assigned the 1 km RADKLIM grid cell in which it is located. Then, the 5 min rainfall amounts are assigned to the corresponding driving speed observations. This implicitly assumes a uniform distribution of rainfall within each grid cell.
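The assignment step can be illustrated with a short Python sketch. It assumes that the centre of each road section has already been projected into the coordinate system of the radar grid (in km), and it uses a hypothetical grid origin; the actual RADKLIM projection and file handling are not reproduced here. Storing only wet cells and treating missing entries as dry is likewise an assumption of the sketch.

```python
import math


def grid_cell(x_km, y_km, x0_km=0.0, y0_km=0.0, cell_km=1.0):
    """Return (row, col) of the grid cell containing a projected point.

    x0_km and y0_km are a hypothetical grid origin (lower-left corner).
    """
    col = math.floor((x_km - x0_km) / cell_km)
    row = math.floor((y_km - y0_km) / cell_km)
    return row, col


def assign_rainfall(speed_obs, section_cell, rain):
    """Attach the 5 min rainfall of a section's grid cell to each speed observation.

    speed_obs:    list of (section_id, time_step, speed_kmh)
    section_cell: {section_id: (row, col)}
    rain:         {(time_step, row, col): rainfall in L/m^2}; missing keys are treated as 0.0
    """
    joined = []
    for section_id, t, speed in speed_obs:
        row, col = section_cell[section_id]
        joined.append((section_id, t, speed, rain.get((t, row, col), 0.0)))
    return joined


# Two short sections that fall into the same 1 km cell share one rainfall value.
cells = {"A": grid_cell(12.3, 4.9), "B": grid_cell(12.8, 4.1)}
rain = {(0, 4, 12): 1.5}
obs = [("A", 0, 92.0), ("B", 0, 95.0), ("A", 1, 101.0)]
print(assign_rainfall(obs, cells, rain))
```

Because all sections inside a cell receive the same value, the sketch makes the uniform-rainfall assumption explicit: any sub-grid variability is invisible to the join.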
To assess the impact of 5 min rainfall amounts on driving speed, three days with heavy rainfall were selected. Maps of the 24 h rainfall amounts show that rainfall occurred in different parts of Germany on the three selected days (Figure 2). The event with the highest rainfall amounts (20 May 2019) led to 24 h rainfall amounts of more than 30 L/m² in 15% of the area, affecting large parts of Central and Southern Germany. The 24 h rainfall amounts of the two other cases stayed below 30 L/m². On all three days, more than 3% (18,000) of the RADKLIM grid cells contained time steps with heavy short-term rainfall, during which the 5 min rainfall exceeded 5 L/m² in different parts of the country.
2.4. Regression Models
Regression models are developed to describe the effect of rainfall on the average driving speed v, which is the average driving speed of all vehicles contributing a measurement within a 5 min interval at a particular road section.
For v, three different linear regression models are developed. We describe v as a Gaussian random variable with expectation $\mu$ and variance $\sigma^2$: $v \sim \mathcal{N}(\mu, \sigma^2)$. First, a NULL model is developed with an intercept only, predicting simply the average driving speed and serving as a reference model; second, a categorical model CAT with $\mu = \sum_{k=1}^{K} \beta_k I_k$, where the expectation depends on rainfall as a categorical variable, K is the number of rainfall categories, and $I_k$ is 1 if the rainfall is in category k and 0 otherwise; third, a model LOG with $\mu = \beta_0 + \beta_1 \ln(R)$, where rainfall R is included as a continuous variable transformed with the natural logarithm. R is taken from the RADKLIM grid point closest to the center of the particular road section.
The NULL model is fitted to all time steps, including time steps with and without rainfall. Thus, the NULL model represents a reference condition, where no weather information is available. CAT and LOG models are only fitted to time steps with rainfall larger than 0 L/m².
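The three model types can be illustrated with a small standard-library Python sketch. It is not the estimation code of the study. It relies on two textbook facts about least squares: an indicator-only model without a separate intercept is fitted by the mean speed of each rainfall category, and a regression on ln(R) has a closed-form slope and intercept. The category boundaries and all data values below are hypothetical.

```python
import bisect
import math
from statistics import fmean


def fit_null(speed):
    """NULL model: intercept only, i.e. the mean driving speed."""
    return fmean(speed)


def fit_cat(rain, speed, upper_edges):
    """CAT model: one coefficient per rainfall category (the category mean).

    Category k collects rainfall values R with upper_edges[k-1] < R <= upper_edges[k];
    the last category is open-ended. The edges are hypothetical.
    """
    groups = {}
    for r, v in zip(rain, speed):
        k = bisect.bisect_left(upper_edges, r)
        groups.setdefault(k, []).append(v)
    return {k: fmean(vs) for k, vs in sorted(groups.items())}


def fit_log(rain, speed):
    """LOG model: least-squares fit of speed = b0 + b1 * ln(R), for R > 0."""
    x = [math.log(r) for r in rain]
    x_mean, y_mean = fmean(x), fmean(speed)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, speed))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


# Synthetic wet time steps: 5 min rainfall (L/m^2) and speed (km/h).
rain = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
speed = [118.0, 116.0, 112.0, 109.0, 104.0, 97.0]
print(fit_null(speed))
print(fit_cat(rain, speed, upper_edges=[0.5, 2.0]))
print(fit_log(rain, speed))
```

The CAT fit needs one coefficient per category, whereas the LOG fit always needs two, which is why sparsely populated categories are more prone to unstable estimates.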
The relative speed difference $\Delta v = (v - \bar{v}_0)/\bar{v}_0$, where $\bar{v}_0$ denotes the average driving speed across all 5 min time steps without rainfall, is used to normalize speed changes across different types of road sections. This relative measure facilitates comparisons of rainfall impacts across roads with different speed limits and typical operating speeds. At the same time, it should be noted that similar relative speed reductions may correspond to substantially different absolute speed changes depending on the road context. For example, a given relative reduction on a high-speed motorway translates into a larger absolute decrease in speed than the same relative reduction on an urban road, potentially implying different traffic and safety-relevant conditions. Accordingly, the use of $\Delta v$ is intended to support comparability across road types, while absolute speed levels remain important for the interpretation of the results.
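The point about relative versus absolute changes can be made concrete with a brief Python example. It assumes that the relative speed difference is defined as the deviation from the dry-condition mean speed divided by that mean, so that reductions are negative; the dry-condition speeds used below are purely illustrative.

```python
def relative_speed_difference(v, v_dry):
    """Relative deviation of speed v from the dry-condition mean v_dry (negative = slower)."""
    return (v - v_dry) / v_dry


def absolute_change(rel_diff, v_dry):
    """Absolute speed change in km/h implied by a relative difference."""
    return rel_diff * v_dry


# The same 10% relative reduction on two hypothetical road types.
for label, v_dry in [("motorway", 120.0), ("urban road", 50.0)]:
    print(label, round(absolute_change(-0.10, v_dry), 1), "km/h")
```

With these illustrative values, the same relative reduction corresponds to roughly 12 km/h on the faster road but only about 5 km/h on the slower one.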
Regression models were estimated by ordinary least squares. Conventional standard errors assume independent and homoskedastic disturbances, an assumption that may be violated when multiple observations belong to the same road section. To account for potential within-section correlation, we additionally report standard errors clustered at the road-section level [27]. This approach allows for arbitrary dependence of errors within road sections while preserving independence across sections, providing more reliable inference for grouped data.
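For readers unfamiliar with clustered standard errors, the following Python sketch shows the usual sandwich construction for a regression with one predictor (for instance x = ln R). It is an illustration under simplifying assumptions, not the implementation used for the reported results: no small-sample correction is applied, and the function name and data are hypothetical.

```python
import math
from collections import defaultdict


def ols_clustered_se(x, y, cluster):
    """Fit y = b0 + b1 * x by OLS and return ((b0, b1), (se_b0, se_b1)).

    Standard errors use the cluster-robust sandwich (X'X)^-1 M (X'X)^-1, where M sums
    the outer products of the per-cluster score vectors X_g' e_g.
    Assumption: no small-sample (degrees-of-freedom) correction.
    """
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    # Inverse of X'X for the design matrix with columns [1, x].
    b = ((sxx / det, -sx / det), (-sx / det, n / det))
    b0 = b[0][0] * sy + b[0][1] * sxy
    b1 = b[1][0] * sy + b[1][1] * sxy

    # Sum residual-weighted regressors within each cluster (road section).
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, cluster):
        e = yi - (b0 + b1 * xi)
        scores[g][0] += e
        scores[g][1] += xi * e
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())

    # Diagonal of B M B, with B = (X'X)^-1 (symmetric).
    a00 = b[0][0] * m00 + b[0][1] * m01
    a01 = b[0][0] * m01 + b[0][1] * m11
    a10 = b[1][0] * m00 + b[1][1] * m01
    a11 = b[1][0] * m01 + b[1][1] * m11
    var_b0 = a00 * b[0][0] + a01 * b[1][0]
    var_b1 = a10 * b[0][1] + a11 * b[1][1]
    return (b0, b1), (math.sqrt(max(var_b0, 0.0)), math.sqrt(max(var_b1, 0.0)))


# Synthetic example: four observations, each road section forming its own cluster.
x = [0.0, 1.0, 2.0, 3.0]
y = [100.0, 96.0, 95.0, 90.0]
print(ols_clustered_se(x, y, cluster=[0, 1, 2, 3]))
```

A useful property of this construction is that duplicating every observation within its own cluster leaves the clustered standard errors unchanged, whereas conventional standard errors would shrink as if new independent information had been added.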
2.5. Assessing Model Performance
The mean squared error $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(\hat{v}_i - v_i)^2$ is a common metric to evaluate model performance by comparing the values predicted by the model, $\hat{v}_i$, to the observed values $v_i$ for all N observations. The squared difference leads to a strong penalization of predictions with larger errors.
A skill score is a relative measure of how a model performs compared to a reference model [28]. The mean squared error skill score $\mathrm{MSESS} = 1 - \mathrm{MSE}_{\mathrm{model}}/\mathrm{MSE}_{\mathrm{ref}}$ compares the score of the model under evaluation, $\mathrm{MSE}_{\mathrm{model}}$, to the score of the reference model, $\mathrm{MSE}_{\mathrm{ref}}$. Positive values of the MSESS indicate an improvement compared to the reference model. Here, we use the NULL model as the reference. Thus, the MSESS for CAT and LOG quantifies the benefit of having rainfall as a categorical and continuous predictor, respectively.
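Both scores are straightforward to compute, as the following Python sketch shows. It uses the standard definitions (mean of squared prediction errors, and one minus the ratio of model error to reference error); the speed values are synthetic.

```python
def mse(predicted, observed):
    """Mean squared error between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def msess(mse_model, mse_ref):
    """Mean squared error skill score; positive means better than the reference."""
    return 1.0 - mse_model / mse_ref


# Synthetic speeds (km/h): a constant reference prediction versus a rainfall-aware one.
observed = [90.0, 100.0, 110.0]
reference = [100.0, 100.0, 100.0]    # intercept-only prediction
model = [92.0, 100.0, 108.0]
print(msess(mse(model, observed), mse(reference, observed)))
```

A score of 0 means no improvement over the reference, 1 would be a perfect prediction, and negative values indicate that the model performs worse than the reference.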
Overfitting is an undesirable behavior in statistical modeling that occurs when the model provides accurate predictions for training data, but not for new data it has not seen during the training. This may occur, for example, if the model’s number of degrees of freedom is large compared to the number of available data points in the training data set. To detect potential overfitting in our models, cross-validation is applied by estimating model coefficients using a training data set and evaluating the performance (computing scores) on an independent testing data set. Here, we split the data based on the three available days: parameters are estimated using data of two days and the score is calculated for the remaining day. This is repeated three times such that each day serves once as the testing data set. The resulting scores are then averaged and used for model comparison. By comparing the scores computed with and without cross-validation, we can estimate the effect of potential overfitting. The comparison of models fitted to individual days also indicates the uncertainty that arises from building the models based on a limited number of days only, instead of using long continuous time series.
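The leave-one-day-out scheme can be sketched in Python as follows. This is a simplified illustration, not the evaluation code of the study: only the logarithmic model is evaluated, the reference is assumed to be the mean speed of the training days, and the rainfall and speed values are synthetic.

```python
import math
from statistics import fmean


def fit_log(rain, speed):
    """Least-squares fit of speed = b0 + b1 * ln(R), for R > 0."""
    x = [math.log(r) for r in rain]
    x_mean, y_mean = fmean(x), fmean(speed)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, speed))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def mse(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def leave_one_day_out(data):
    """data: {day: [(rain, speed), ...]} with rain > 0.

    Returns the skill score per held-out day and the average over all days.
    """
    scores = {}
    for test_day, test_rows in data.items():
        # Train on all other days, test on the held-out day.
        train = [row for day, rows in data.items() if day != test_day for row in rows]
        b0, b1 = fit_log([r for r, _ in train], [v for _, v in train])
        ref_mean = fmean(v for _, v in train)   # assumed reference: training mean speed
        observed = [v for _, v in test_rows]
        mse_model = mse([b0 + b1 * math.log(r) for r, _ in test_rows], observed)
        mse_ref = mse([ref_mean] * len(observed), observed)
        scores[test_day] = 1.0 - mse_model / mse_ref
    return scores, fmean(scores.values())


# Synthetic observations for the three analysed days.
data = {
    "2019-05-20": [(0.1, 111.0), (0.5, 104.0), (2.0, 96.0), (6.0, 90.0)],
    "2019-06-03": [(0.2, 109.0), (1.0, 100.0), (3.0, 95.0)],
    "2019-06-12": [(0.1, 112.0), (0.8, 101.0), (4.0, 93.0)],
}
print(leave_one_day_out(data))
```

A large gap between the averaged cross-validated score and the score obtained when fitting and evaluating on the same data would point to overfitting.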
4. Discussion
In this study, we analyzed the short-term relationship between rainfall amount and driving speed using 5 min GNSS-based probe vehicle data for approximately 1.5 million road sections across Germany. The combination of floating car data with high-resolution radar-based rainfall estimates enables an assessment of rainfall-related speed reductions at a high spatial and temporal resolution. Focusing on three days with widespread heavy rainfall allowed us to capture a broad range of rainfall intensities and road characteristics. Since the results are based on three individual days, findings should be interpreted as event-based estimates rather than long-term averages.
The results indicate a non-linear, approximately logarithmic decrease in driving speed with increasing rainfall amount, which is consistent with earlier studies [5]. Speed reductions are more pronounced on road sections with higher speed limits and on multi-lane roads, while differences between urban and non-urban roads with the same speed limit are comparatively small.
The estimated magnitudes of rainfall-related speed reductions range from approximately 2–10% under light rainfall or at low driving speeds to more than 20% under heavy rainfall and high driving speeds. These values are broadly consistent with the ranges reported in an extensive literature review by Stamos et al. [3]. Direct numerical comparisons across studies are, however, difficult, as reported effects depend strongly on factors such as the spatio-temporal resolution of the data, the road types considered, and the way rainfall is represented in the analysis. For example, Hooper et al. [13], who analyse precipitation effects on driving speed using radar data but distinguish only between conditions with and without precipitation, report a speed reduction of approximately 2 km/h at a mean driving speed of 80 km/h. Salvi et al. [23] find that speed reductions during rainfall strongly depend on baseline driving speed but observe no substantial effect of rainfall amount when comparing four rainfall intensity classes. This contrasts with the results reported here and may be partly attributable to the relatively coarse spatial resolution of their radar-based rainfall data (0.125°, approximately 13 km). By contrast, Sakhare et al. [24], using rainfall data at 3 km spatial resolution, which is closer to that employed in this study, find a pronounced dependence of driving speed on rainfall intensity, with reductions of 8.4% under heavy rainfall exceeding 8 mm/h.
Incorporating rainfall as a predictor improves speed prediction, reducing cross-validated mean squared error by up to 14% on average, with substantially larger improvements under heavy rainfall conditions. This finding is consistent with previous studies employing more complex machine learning approaches, such as gradient boosting and neural network models, which also report gains in predictive performance when weather information is included. However, the magnitude of these improvements varies widely across studies. For instance, Prokhorchuk et al. [22] report an average reduction in mean absolute percentage error of 4.5% when radar-based rainfall data are incorporated. Their review of related work further indicates a broad range of reported improvements, spanning from approximately 1.5% to 25%. Again, these differences likely reflect variations in data characteristics, model structures, spatial and temporal resolution, and traffic conditions across studies. Previous studies have also shown that more complex machine learning methods often outperform traditional linear regression approaches [22]. This suggests that evaluating additional modeling approaches for Germany would be valuable. However, doing so would require the availability of larger and more diverse datasets.
Treating rainfall as a continuous variable yields more stable results than categorical approaches, which tend to overfit in regimes with sparse observations. However, due to the scarcity of heavy rainfall observations, estimates of rainfall effects at higher intensities remain uncertain, as reflected by differences in the estimated functional relationships across individual days in the cross-validation analysis.
Several aspects of the analysis point to promising directions for future research. First, traffic volume and congestion are known to be key determinants of driving speed. While direct traffic volume data were not available at the required spatial and temporal resolution, we applied a simple filtering strategy to remove road sections affected by prolonged congestion. Short-term congestion effects are likely still present, and future studies would benefit from integrating traffic volume or occupancy data to better disentangle rainfall-related speed reductions from demand-driven congestion dynamics.
Related to this, the modeling approach was intentionally kept relatively simple to ensure interpretability and robustness across a very large and heterogeneous dataset. More complex statistical frameworks, such as mixed-effects models or models with explicit temporal correlation structures, could in principle better account for repeated observations, unobserved heterogeneity, and autocorrelation. However, the available data do not consistently provide complete time series for all road sections across all analysed days, limiting the feasibility and stability of such approaches. As a result, some degree of residual clustering and temporal dependence remains in the model errors, which can lead to underestimated standard errors. Consequently, reported confidence intervals should be interpreted conservatively. As more continuous and longer-term probe vehicle datasets become available, these modeling extensions represent a natural next step to improve inference and uncertainty quantification.
Another perspective concerns the spatial resolution of rainfall data. Rainfall amounts are assumed to be homogeneous within each 1 km radar grid cell. During convective events, however, sub-grid variability may still be substantial, potentially leading to exposure misclassification at the level of individual road sections. Such measurement error would be expected to attenuate estimated rainfall effects, suggesting that the reported speed reductions may represent conservative estimates.
In this study, only the instantaneous effect of rainfall is analyzed. Lagged effects of rainfall, such as driving on wet roads after a rainfall event, are not considered. Furthermore, a distinction between different types of precipitation, such as rain, hail, sleet or snowfall, is not possible with the RADKLIM data. Future research could address this by including radar-based hydrometeor classification, e.g., based on the novel Hymec product used at the German Weather Service [29].
The GNSS-based probe data also open avenues for further refinement. While the large sample size improves representativeness at the network level, information on vehicle types is not available. Differentiating between passenger cars, trucks, and other vehicle classes could provide additional insight. Similarly, probe vehicle penetration rates may vary spatially and temporally, which should be considered when interpreting results.
While this study focuses on rainfall-induced changes in driving speed, it does not assess traffic safety outcomes such as crash occurrence, injury risk, or traffic mortality. Importantly, reductions in driving speed should not be interpreted as evidence of improved road safety. Although a speed reduction might lower crash severity compared to an unaltered driving speed, a substantial body of literature shows that rainfall in general increases crash risk and traffic casualties, reflecting complex and sometimes countervailing mechanisms such as reduced visibility, adverse road surface conditions, and altered traffic flow dynamics [30,31,32,33]. Behavioural adaptations such as speed reduction may therefore coexist with elevated accident risk. Accordingly, the results presented here describe traffic flow responses to rainfall but do not support inferences about safety impacts, which require dedicated analyses linking weather conditions to crash and injury data.
Finally, the analysis is based on three days with heavy rainfall, which limits the ability to capture seasonal effects, weekday variability, and the diurnal cycle of traffic volume. Differences in model estimates across individual days, especially at high rainfall intensities, highlight the importance of larger multi-event datasets. Other studies have analysed samples comprising 12 days [34], 42 days [24], or 43 days [21], as well as longer continuous observation periods spanning months [11,22] or years [9]. Extending the analysis to longer time periods would reduce sampling uncertainty and allow for more robust estimation of effects under extreme conditions, albeit at the cost of increased computational complexity and the potential need to focus on selected regions.
Taken together, these perspectives indicate that the present study represents only a first step toward country-scale rainfall-aware speed models for Germany. While further data and methodological advances are required for fully operational implementations, the results demonstrate the feasibility and value of combining radar-based rainfall observations with GNSS probe vehicle data to improve our understanding of weather-related driving behavior.
5. Conclusions
Previous research has shown that rainfall reduces driving speed. Road users adapt their driving behavior due to reduced visibility and lower skid resistance, for example. Most existing studies are based on speed measurements at individual sites or at a limited number of road sections combined with station-based rainfall observations. While such approaches yield accurate local estimates, their transferability to other road types and regions is often unclear.
The aim of this paper was to combine high-resolution calibrated radar-based rainfall amounts with driving speeds derived from GNSS probe data. This allowed for a more generalized estimation of the rainfall–speed relationship at a high spatial and temporal resolution. Despite some disadvantages of using GNSS probe data, which have been discussed in the previous section, the analysis has led to promising results, which would have been difficult to obtain with traditional station-based approaches.
Using data from approximately 1.5 million road sections, we show that average rainfall-related speed reductions depend systematically on road characteristics. Relative speed reductions increase with rainfall intensity and are substantially larger on road sections with higher speed limits and on multi-lane roads. For example, in the case of intense rainfall, the average speed reduction is % at speed limits of 50 km/h and % at 130 km/h. Importantly, treating rainfall as a continuous predictor allows the logarithmic functional form of the rainfall–speed relationship to be captured more parsimoniously than categorical approaches. However, estimates at very high rainfall intensities are subject to increased uncertainty due to data sparsity.
Beyond descriptive effects, we demonstrate that incorporating rainfall information yields measurable predictive benefits. Cross-validation shows that including rainfall as a predictor reduces mean squared error by approximately 14% on average, and by up to 50% during heavy rainfall conditions. This highlights the relevance of rainfall not only as an explanatory variable but also as a key input for short-term speed prediction models.
The results of this study demonstrate the potential of combining radar-based rainfall data with GNSS-derived driving speeds to characterize rainfall-related speed reductions across heterogeneous road types. However, the analysis is based on a limited number of rainfall events and does not explicitly account for traffic volume or congestion dynamics. Consequently, the models presented here should be understood only as a first step toward country-scale rainfall-aware speed prediction rather than as directly operational tools. Extending the analysis to a larger set of rainfall events, incorporating traffic volume information, and further addressing temporal dependencies will be essential to improve robustness, generalizability, and practical applicability in future work.