Exploring the Spatial-Temporal Relationship between Rainfall and Traffic Flow: A Case Study of Brisbane, Australia

The impact of inclement weather on traffic flow has been extensively studied in the literature. However, little research has unveiled how local weather conditions affect real-time traffic flows both spatially and temporally. By analysing the real-time traffic flow data of Traffic Signal Controllers (TSCs) and weather information in Brisbane, Australia, this paper aims to explore weather’s impact on traffic flow, more specifically, rainfall’s impact on traffic flow. A suite of analytic methods has been applied, including the space-time cube, time-series clustering, and regression models at three different levels (i.e., comprehensive, location-specific, and aggregate). Our results reveal that rainfall would induce a change of the traffic flow temporally (on weekdays, Saturday, and Sunday and at various periods on each day) and spatially (in the transportation network). Particularly, our results consistently show that the traffic flow would increase on wet days, especially on weekdays, and that the urban inner space, such as the central business district (CBD), is more likely to be impacted by inclement weather compared with other suburbs. Such results could be used by traffic operators to better manage traffic in response to rainfall. The findings could also help transport planners and policy analysts to identify the key transport corridors that are most susceptible to traffic shifts in different weather conditions and establish more weather-resilient transport infrastructures accordingly.


Introduction
The recent expansion of urban areas has diversified transportation modes to meet the increasing traffic demands of daily life, which in turn leads to a larger amount of traffic flow on roads [1]. To deal with the various transportation modes, a deep understanding of traffic demand and performance is critical [2]. Typically, traffic information contains many attributes such as speed, flow, and density, among which traffic flow is normally considered the essential element that can illustrate the efficiency of traffic management and control [3].
The negative influence of adverse weather on traffic conditions is becoming a major concern to transportation authorities and agencies [4]. First, external conditions, such as the roughness of the road surface, which affects vehicle skidding, could be altered, which may influence driving behaviour and road safety accordingly. Meanwhile, the traffic pattern in adverse weather might differ from that in dry weather [5]. Overall, inclement weather affects transportation in three aspects (demand, safety, and capacity, i.e., flow of traffic), which have been extensively investigated in the literature [4][5][6][7][8][9]. However, many of these studies primarily focused on fluctuations in the quantity of traffic flow under inclement weather, and very few explored the impact of wet weather conditions on the temporal and spatial patterns of traffic flow. Furthermore, although some efforts have been made to quantify the spatial-temporal patterns of real-time traffic flow [10][11][12][13], few of them associated the spatial-temporal patterns of real-time traffic flow with weather information (especially inclement weather information).
In this paper, to investigate weather's impact on the spatial-temporal feature of traffic flow, which describes how the traffic changes in the time-space domain [14], we consider one commonly used weather parameter, precipitation, which denotes the amount of rainfall and snowfall [15]. The utility of combining traffic flow with weather information is that the temporal-spatial characteristics of traffic flow under diverse weather conditions can be revealed, highlighted, and converted into distribution patterns in the spatial and temporal dimensions. To fully explore these characteristics, we adopt a spatial-temporal analytic approach, which can both qualitatively visualise and quantitatively model the spatial-temporal relationship between weather (rainfall in this study) and traffic flow.
Specifically, we first explore the distribution patterns of traffic flow in different weather conditions using space-time cube and time-series clustering methods to visualise the spatial-temporal patterns of traffic flow under dry/wet weather conditions, and then, detect weather's impact on traffic flow using statistical models, and identify the time and the location at which traffic flow was impacted by weather; finally, we quantify weather's impact on traffic flow at three different levels (i.e., comprehensive, location-specific, and aggregate). The findings gained from the research can help transport planners and policy analysts to identify key transport corridors that are most susceptible to traffic flow changes in different weather conditions and establish more weather-resilient transport infrastructures accordingly.
The rest of the paper is organized as follows. Section 2 reviews relevant studies in the literature. Section 3 introduces the study region and data sources used in this study. Section 4 describes the methodologies employed in analysing the data. Section 5 presents the distribution pattern visualisation of traffic flow, the modelling analysis, and its interpretation. Finally, in Section 6 we discuss the findings, limitations, and future work.

Literature Review
The impact of inclement weather on traffic flow has been extensively investigated in the literature. In general, previous studies consistently indicated that heavy rain and snow can reduce road capacity and traffic speed, and increase vehicle crash possibility [4][5][6][7][8][9]. Maze et al. [16] found that inclement weather's impact on traffic depends on the intensity of inclement weather and the type of travelling mode. Prevedouros and Chang [17] observed that the intersection operations can be affected by wet weather conditions in three specific aspects (flow and capacity of the roads, the effective time of green light, and the progression), which may deteriorate the level of service. Agbolosu-Amison et al. [18] studied the influence of inclement weather on saturation flow, and the results show that the saturation flow has a significant correlation with inclement weather.
Due to the diversity of weather conditions in various regions at different time periods, the localized effects of weather on traffic flow may be distinct. For example, Keay and Simmonds [19] investigated the traffic data of 1989-1996 in Melbourne, Australia, and found a negative relationship between volume and rainfall amount that was statistically significant only for winter and spring (wet and cold). Chung et al. [20] identified the effects of rainfall on traffic counts of the Tokyo Metropolitan Expressway; with the increase in rainfall, the effects on weekends were larger than on weekdays. Unrau and Andrey [21] selected 1998's traffic data on the Gardiner Expressway (a six-lane highway access to Toronto, Canada) to determine how inclement weather would affect speed-volume relationships. The results showed that under daytime inclement conditions, when the traffic volumes were typically high, there would be reductions in speeds, and due to the interaction between speed and volume, the volumes would decrease. Jaroszweski and McNamara [22] studied the effects of rainfall on road accident data of 2008 to 2011 in Manchester and London, UK, using relative accident rates (RARs) to evaluate the effects. The outcome showed that RARs would increase under inclement weather conditions in Manchester, while declining in London, which indicated that the difference might be caused by traffic volume, speed, and driver behaviour in these two cities under inclement weather conditions. Meanwhile, the results of the study case [23] in Johor and Terengganu, Malaysia, illustrated that the rainfall would have impacts on traffic flow and speed, but average traffic volume.
Various statistical techniques have been adopted to identify the influence of different weather conditions on traffic flow. A traffic-weather index [9] was used to determine how traffic flow can be affected by different weather conditions in different regions of Shanghai. To determine the distribution patterns of traffic flow, a spatial-temporal autoregressive integrated moving average model was adopted by Kamarianakis and Prastacos [24], which used a weighting matrix to estimate the spatial properties of traffic flow that were observed at specific monitoring locations within the same time interval. Hou et al. [25] used weather adjustment factors to detect the impact of inclement weather on traffic flow, in which the intensity of the rain or the intensity of the snow is considered. Tao et al. [26] applied the autoregressive integrated moving average model to quantify the relationship between weather and bus ridership. Datla and Sharma [27] indicated that the regression models (e.g., linear regression, logistic regression) were appropriate for detecting the relationship between weather and traffic flow.
In summary, on the one hand, most studies related to weather's impact on traffic flow mainly target the total traffic flow change instead of traffic flow's temporal-spatial patterns. On the other hand, although there are some efforts on quantifying the spatial and temporal patterns of the real-time traffic flow, few of them associate the real-time traffic flow with weather information both spatially and temporally. Theofilatos and Yannis [15] suggested that the joint research of combining real-time traffic flow data with weather data rather than analysing them separately was necessary. In this study, we use qualitative and quantitative approaches to fill this gap.

Study Region
The study context is Brisbane, the capital city of Queensland and the third-largest city in Australia, with an estimated population of 2.5 million in 2018. Brisbane has a humid subtropical climate, in which the summer is wet and hot, while the winter is dry and warm. Annually, the lowest average temperature is 16.6 °C and the highest average temperature is 26.6 °C [28]. The transportation network in Brisbane is comprehensive and connects regional centres and interstate destinations. A small portion (8.5%) of the total trips is completed by public transport (such as bus, ferry, and rail), whereas most trips (91.5%) are made by private vehicles [29]. The CBD of Brisbane lies on a peninsula along the Brisbane River and is the central hub for the public transportation system, which includes bus, ferry, and train services [28].

Data Collection
In this paper, traffic data were obtained from Brisbane City Council (BCC) via an Application Programming Interface (API). Due to the large volume of real-time traffic flow data, a traditional data processing approach (such as using hardware storage) cannot efficiently harvest, store, and analyse such big data [30]. Therefore, in this paper, we adopt a data processing procedure similar to the Real-time Traffic Information system, consisting of four parts: collecting, processing, analysing, and publishing real-time traffic data [31]. As shown in Figure 1, the data processing procedure includes four steps, which are capturing, storing, cleaning, and outputting data accordingly.
Sustainability 2020, 12, x FOR PEER REVIEW

Data Sources
To distinguish the distribution patterns of traffic flow under different weather conditions, we selected traffic flow data from an open database of BCC [32] and weather information archived at the University of Queensland as the principal data sources, as shown in Table 1. Note that Wei et al. [33] showed that the four weather stations in Brisbane (Figure 2) were highly correlated. Therefore, we can use the weather data at the University of Queensland (UQ) to represent the whole city. In total, we had access to 856 TSCs, and the weather station at UQ records 12 weather parameters, from which three parameters important for road traffic, namely temperature (°C), rainfall (mm), and wind speed (km/h), were selected. We obtained traffic flow data from TSCs, which record traffic flows every minute, and then aggregated the traffic flow data into 15 min intervals. The 15 min time resolution of traffic flow data is reasonable for our study because it can filter out random fluctuations in traffic flow rate but still capture meaningful changes [6]. The weather information at the weather station is updated every minute, and to make the time resolutions of these two data sources consistent, we averaged the weather parameters over every 15 min. The spatial distributions of TSCs and weather stations are illustrated in Figure 2.
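The temporal alignment described above (one-minute readings brought to a common 15 min resolution) can be sketched as follows. This is a minimal illustration rather than the processing pipeline used in the study: the function names, the sample readings, and the choice of summing traffic counts while averaging weather readings are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_15min(ts: datetime) -> datetime:
    """Return the start of the 15-minute bin that contains ts."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate_15min(records, how="sum"):
    """Group (timestamp, value) records into 15-minute bins.

    how="sum" is used here for traffic counts, how="mean" for weather readings
    (an assumption of this sketch).
    """
    bins = defaultdict(list)
    for ts, value in records:
        bins[floor_to_15min(ts)].append(value)
    if how == "sum":
        return {start: sum(vals) for start, vals in sorted(bins.items())}
    return {start: sum(vals) / len(vals) for start, vals in sorted(bins.items())}


# Hypothetical one-minute readings covering 30 minutes at a single TSC.
counts = [(datetime(2018, 9, 3, 7, m), 10 + m % 3) for m in range(30)]
rain = [(datetime(2018, 9, 3, 7, m), 0.2 if m < 15 else 0.0) for m in range(30)]

flow_15 = aggregate_15min(counts, how="sum")   # total vehicles per 15 min
rain_15 = aggregate_15min(rain, how="mean")    # mean rainfall reading per 15 min
```

After this step both series share the same bin start times, so they can be joined on the timestamp before any modelling.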

We collected the raw traffic flow data and weather data in Brisbane for two months (September and October 2018) and selected 12 days' data based on data completeness [35] and the coverage of different weather conditions and day types (shown in Table 1). Note that Saturday and Sunday are regarded as two distinct day types, because travel activities on these two days are often different from each other, as suggested in the literature [26]. Figure 3 depicts the daily patterns of three weather parameters (temperature, wind, and rainfall). The average values of temperature and wind speed are 17.9 °C and 6.5 km/h, respectively, on wet weekdays, while they are 18.6 °C and 6.7 km/h on dry days. On the wet Saturday, the daily rainfall is 2.17 mm, with an average temperature of 19.5 °C and an average wind speed of 5.0 km/h. On the dry Saturday, the average wind speed and temperature are 7.7 km/h and 23.5 °C, respectively. On the dry Sunday, the average wind speed is 5.6 km/h and the average temperature is 18.7 °C.

Methodology
To understand weather's impact on traffic flow qualitatively and quantitatively, more specifically, rainfall's impact on traffic flow, we use various visualisation techniques to directly illustrate any potential weather impact on traffic flow and use statistical methods to detect the relationship between weather and traffic flow. These techniques and methods are introduced below.

Traffic Pattern Visualisation
To illustrate the influence of weather variables on the spatial-temporal distribution of traffic flow, two visualisation techniques, i.e., space-time cube and time-series clustering, are used to visualise the traffic flow patterns at each TSC under different weather conditions, as elaborated below.

Space-Time Cube
Space-time cube [36] is defined as a three-dimensional array $(x, y, t)$, where x, y are the location coordinates and t is the trip departure time, as shown in Figure 4. In this paper, x and y represent the coordinates of each TSC, and t is the time resolution of traffic flow data, which is 15 min by default. The value in each space-time cube is the total traffic flow within 15 min at each TSC on each day. Once a space-time cube in different weather conditions on a specific day is generated, the spatial-temporal distribution of traffic flow is extracted and visualised based on the predefined space-time cube using time-series clustering, which is introduced below.
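As a rough illustration of this data structure, the sketch below fills a space-time cube for one day from per-minute counts. The record layout, the TSC identifiers, and the coordinates are hypothetical, and the nested-dictionary representation is only one convenient way to hold an (x, y, t) array in plain Python.

```python
def build_space_time_cube(records, tsc_coords, bins_per_day=96):
    """Return {tsc_id: {"x": x, "y": y, "flow": [one total per time bin]}}.

    records: iterable of (tsc_id, minute_of_day, vehicle_count).
    tsc_coords: {tsc_id: (x, y)} coordinates of each TSC.
    """
    bin_minutes = 24 * 60 // bins_per_day  # 15 minutes when bins_per_day is 96
    cube = {
        tsc: {"x": x, "y": y, "flow": [0] * bins_per_day}
        for tsc, (x, y) in tsc_coords.items()
    }
    for tsc, minute_of_day, count in records:
        # Each cell holds the total flow observed in its 15-minute slice.
        cube[tsc]["flow"][minute_of_day // bin_minutes] += count
    return cube


# Hypothetical TSC locations (longitude, latitude) and one-minute counts.
coords = {"TSC_A": (153.02, -27.47), "TSC_B": (153.01, -27.50)}
records = [("TSC_A", 450, 12), ("TSC_A", 455, 8), ("TSC_A", 465, 5), ("TSC_B", 450, 3)]
cube = build_space_time_cube(records, coords)
```

Each TSC then carries a 96-value daily profile, which is the kind of input a time-series clustering step can consume.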

Time-Series Clustering
Time-series clustering [38] uses K-medoids to classify the time-series clusters in which the medoid (one particular TSC in our study) is the centre of a cluster and has the minimum dissimilarity between itself and other TSCs within the same cluster [39]. Using this method, TSCs can be classified into different clusters.
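To make the role of the medoid concrete, the following sketch implements a simple alternating K-medoids procedure on short daily profiles. It is an illustration under stated assumptions and not the software used in the study: the Euclidean dissimilarity, the deterministic initialisation with the first k series, and the toy profiles are all choices made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def assign(dist, medoids):
    """Label each series with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(series, k, max_iter=100):
    """Cluster equal-length time series; returns (medoid_indices, labels)."""
    n = len(series)
    dist = [[euclidean(series[i], series[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # arbitrary start, chosen to keep the sketch deterministic
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with minimum total dissimilarity to its cluster.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical four-bin flow profiles for four TSCs: two low-flow, two high-flow.
profiles = [[1, 2, 2, 1], [1, 3, 2, 1], [10, 20, 18, 9], [11, 21, 19, 10]]
medoids, labels = k_medoids(profiles, k=2)
```

Because each cluster centre is an actual TSC profile rather than an average, the resulting clusters can be described by pointing to a representative controller.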

Statistical Models
To quantitatively understand weather's impact on traffic flow, statistical models are developed to identify the relationship between inclement weather and traffic flow.
Based on the results from the two visualisation techniques introduced above, we first use traffic flow as the dependent variable and weather conditions as the independent variables to directly explore weather's impact on traffic flow using linear regression analysis. Then, we use traffic flow time-series clusters as the dependent variable and weather conditions as the independent variables to further explore weather's impact on the change of traffic flow trend. Since traffic flow time-series clusters are categorical, linear regression analysis is not suitable. Thus, logistic regression analysis is used instead. Moreover, traffic flow time-series clusters are ranked, and weather's impact on their ranking is modelled using ordered logistic regression. These statistical models are introduced below.

Linear Regression (LR) Model
LR can model the relationship between a response variable and a set of independent variables, and has been used to model the relationship between weather and traffic flow in the literature (e.g., reference [27]). A general form of the linear regression model considered in our study is shown below:
$$y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \cdots + \beta_M x_{M,i} + \varepsilon_i$$
where $y_i$ is the traffic flow of the ith observation, $x_{m,i}$ is the mth independent variable (e.g., daily rainfall, day of the week or TSCs categories) for the ith observation, $\beta_m$ is the regression coefficient for the mth independent variable, $\beta_0$ is the intercept, and $\varepsilon_i$ is the error term.
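For intuition, the simplest case of such a model has a single wet/dry dummy as the only independent variable. The sketch below fits it by ordinary least squares in closed form; the data are invented for illustration and the function name is hypothetical.

```python
def fit_simple_ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: wet-day dummy (1 = wet, 0 = dry) and 15-minute flow counts.
wet = [0, 0, 0, 1, 1, 1]
flow = [100, 110, 105, 120, 125, 115]
b0, b1 = fit_simple_ols(wet, flow)
```

With a dummy regressor, the intercept equals the mean flow on dry observations and the slope equals the wet-minus-dry difference in means, which is how a rainfall coefficient in the fuller model can be read once other variables are held fixed.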

Multiple Logistic Regression (MLR) Model
MLR is a statistical model that can be used to examine how the independent variables affect the dependent variable when the dependent variable is nominal and has more than two possible outcomes [40]. For m possible outcomes (e.g., time-series cluster ranking in this study), when the outcome m − 1 is selected as the pivot or reference, a total of m − 1 independent binary logistic regression models are estimated, one for each remaining outcome k, as illustrated below:
$$\ln \frac{\Pr(Y_i = k)}{\Pr(Y_i = m - 1)} = \beta_{k,0} + \beta_{k,1} x_{k,1,i} + \cdots + \beta_{k,N} x_{k,N,i}, \quad k = 0, 1, \ldots, m - 2$$
where $\beta_{k,N}$ is the regression coefficient for the Nth independent variable when the outcome is k, $\beta_{k,0}$ is the intercept, and $x_{k,N,i}$ is the Nth independent variable (e.g., daily rainfall, day of the week or TSCs categories) of the ith observation when the outcome is k.
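The pivot construction implies that outcome probabilities can be recovered from the m − 1 fitted linear predictors. The sketch below shows that conversion for one observation; the coefficient values are hypothetical and the function is an illustration, not the estimation routine used in the study.

```python
import math


def multinomial_probs(x, coefs):
    """Outcome probabilities from pivot-coded multinomial logit coefficients.

    x: predictor values for one observation.
    coefs: one [intercept, b1, ..., bN] list per non-pivot outcome.
    Returns the probability of each non-pivot outcome, followed by the pivot.
    """
    scores = [c[0] + sum(b * xi for b, xi in zip(c[1:], x)) for c in coefs]
    exp_scores = [math.exp(s) for s in scores]
    denom = 1.0 + sum(exp_scores)  # the pivot contributes exp(0) = 1
    return [e / denom for e in exp_scores] + [1.0 / denom]


# Hypothetical coefficients for three cluster categories and one wet-day dummy.
coefs = [[0.5, -1.0], [-0.2, 0.8]]
probs_wet = multinomial_probs([1.0], coefs)
probs_flat = multinomial_probs([1.0], [[0.0, 0.0], [0.0, 0.0]])
```

The log-ratio of any non-pivot probability to the pivot probability equals that outcome's linear predictor, which is why each coefficient is interpreted relative to the reference category.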

Ordered Logistic Regression (OLR) Model
OLR is a statistical model that can detect the impact of independent variables on the dependent variable when the dependent variable is ordinal. In this study, traffic flow time-series clusters are further ranked into three categories: low, moderate, and high. OLR is adopted to model weather's impact on such ranking.
The equation of OLR can be defined as follows:
$$Y = m \quad \text{if} \quad \alpha_{m-1} < Y^* \le \alpha_m$$
where m is the ordinal response, α is the endpoint that sets the continuous scale for $Y^*$, $Y^*$ is the unobserved continuous variable underlying the observed dependent variable (e.g., time-series cluster ranking in this study), $Y^* = X\beta + \varepsilon$, X is the vector of independent variables, β is the vector of coefficients, and ε is the error term. For c possible outcomes, a total of c − 1 binary logistic regression models are estimated, and according to the proportional odds assumption, the coefficients of all these logistic regression models are the same [40]. In modelling ordinal dependent variables (for example, time-series cluster ranking in this study), the logit transformation is applied to the cumulative probabilities to maintain the category order, as shown in the equation below:
$$\operatorname{logit}\big(P(Y \le j)\big) = \ln \frac{P(Y \le j)}{1 - P(Y \le j)}, \quad j = 1, \ldots, c - 1$$
A typical model for the cumulative logits is:
$$\operatorname{logit}\big(P(Y \le j)\big) = \alpha_j - (\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)$$
where c is the total number of categories; $x_1, x_2, \ldots, x_n$ are n explanatory variables; and $\beta_1, \beta_2, \ldots, \beta_n$ are the corresponding coefficients.
More information on ordered logistic regression can be found in references [40,41].
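The cumulative-logit formulation can be turned into category probabilities by differencing the cumulative probabilities. The sketch below assumes the latent form $Y^* = X\beta + \varepsilon$ with a logistic error, so that $P(Y \le j)$ is the logistic function of the threshold minus the linear predictor; the coefficient and threshold values are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(x, betas, cutpoints):
    """Category probabilities under a proportional-odds model.

    Uses P(Y <= j) = logistic(alpha_j - x.beta); cutpoints must be increasing.
    Returns one probability per ordered category (len(cutpoints) + 1 values).
    """
    eta = sum(b * xi for b, xi in zip(betas, x))
    cumulative = [logistic(a - eta) for a in cutpoints] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


# Hypothetical model: three ranked clusters (low, moderate, high), one wet-day dummy.
betas = [0.6]
cutpoints = [-1.0, 1.0]
dry = ordered_logit_probs([0.0], betas, cutpoints)
wet = ordered_logit_probs([1.0], betas, cutpoints)
```

Under this sign convention a positive coefficient shifts probability towards the higher-ranked categories, and the same coefficient applies at every threshold, which is the proportional odds assumption in practice.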

Confusion Matrix
To comprehensively assess the performance of MLR and OLR, the confusion matrix is used. More specifically, the confusion matrix consists of True Positive, False Positive, False Negative, and True Negative, as defined below [42]:
True positive (TP): The actual and predicted outcomes both belong to a specific cluster category.
False positive (FP): The actual outcome does not belong to a specific cluster category, but the predicted outcome belongs to that cluster category.
False negative (FN): The actual outcome belongs to a specific cluster category, but the predicted outcome does not belong to that cluster category.
True negative (TN): Neither the actual nor the predicted outcome belongs to a specific cluster category.
Based on TP, FP, TN, and FN, some additional indicators can be calculated:
Sensitivity: The indicator that measures whether the model can correctly predict the outcome in a specific cluster category when the actual outcome belongs to that cluster category, which is defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
Specificity: The indicator that measures how well the prediction model correctly identifies outcomes that do not belong to a specific cluster category, which is defined as:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
When interpreting the results, it is not reliable to rely on sensitivity alone. Both a high sensitivity and a high specificity are necessary for a good prediction model.
Positive Predictive Value (PPV): The ratio of the number of true positive values to the total number of predicted positive values (TP + FP), which represents the portion of the positive values that have been predicted correctly, and is defined as:
$$\text{PPV} = \frac{TP}{TP + FP}$$
Negative Predictive Value (NPV): The ratio of the number of true negative values to the total number of predicted negative values (TN + FN), which represents the portion of the negative values that have been predicted correctly, and is defined as:
$$\text{NPV} = \frac{TN}{TN + FN}$$
Positive and negative predictive values would change when the numbers of actual observations in a specific cluster category change.
Accuracy: The prediction accuracy is defined as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
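For a model with more than two cluster categories, these indicators are typically computed one category at a time, treating that category as positive and all others as negative. The sketch below follows that one-vs-rest reading, which is an assumption about how the indicators are applied; the labels and function names are invented for illustration.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def one_vs_rest_metrics(actual, predicted, category):
    """Confusion-matrix indicators for one cluster category against all others."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == category and p == category for a, p in pairs)
    fp = sum(a != category and p == category for a, p in pairs)
    fn = sum(a == category and p != category for a, p in pairs)
    tn = sum(a != category and p != category for a, p in pairs)
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "PPV": safe_ratio(tp, tp + fp),
        "NPV": safe_ratio(tn, tn + fn),
        "accuracy": safe_ratio(tp + tn, len(pairs)),
    }


# Hypothetical actual and predicted cluster rankings for eight TSCs.
actual = ["low", "low", "moderate", "moderate", "high", "high", "high", "low"]
predicted = ["low", "moderate", "moderate", "moderate", "high", "low", "high", "low"]
high = one_vs_rest_metrics(actual, predicted, "high")
```

In this toy example the "high" category has perfect specificity but misses one of three actual cases, which illustrates why sensitivity and specificity need to be read together.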

Results
In this section, we apply the proposed methods to explore the spatial-temporal relation between traffic flow and weather conditions using the data collected in Brisbane. Two parts are presented in this section: one is the visualisation results of weather's impact on traffic flow, and the other is the modelling results for the relationship between weather and traffic flow.

Visualisation of Traffic Flow Pattern
We first process the total traffic flows of the whole study region. We aggregate traffic flow every 15 min at multiple TSCs for each day, then determine how the traffic flow trend changes on each day across different periods. Finally, by comparing plots of weather data with traffic flow trends, the relationship between traffic flow and weather conditions is visually represented.
Figure 5a depicts the total aggregated traffic flow over the target days. This figure shows that, generally, the traffic flow on weekdays peaks during the morning peak hours (around 7:30 a.m. to 8:30 a.m.) and the afternoon peak hours (around 4:30 p.m. to 5:45 p.m.), regardless of weather conditions. In addition, a similar trend can be found on Saturday and Sunday, although the peak hours on weekends occur near noon (around 11:00 a.m. to 1:00 p.m.). However, although the trend of traffic flow on each target date can be observed from this figure, no obvious difference in traffic flow under various weather conditions can be detected. Combined with Figure 5b, we can visually identify that the aggregated weekday traffic flow in wet weather is generally higher than that on dry weekdays, but no clear trend can be observed for Saturday and Sunday.
We first adopt time-series clustering to categorize all TSCs' traffic flow in terms of traffic volume while considering the three day types. Figure 6 shows the TSC traffic flow profiles on the different day types, which can be grouped into three clusters (traffic flow change patterns of the low traffic flow group, moderate traffic flow group, and high traffic flow group); wet and dry days exhibit similar traffic patterns. Specifically, the morning peak hours (7:30 a.m. to 8:30 a.m.) and evening peak hours (4:30 p.m. to 5:45 p.m.) on weekdays are in line with the total aggregated traffic flow trends in Figure 5, while only one peak period appears on both Saturday and Sunday. Moreover, no significant changes can be detected when visually comparing the traffic flow trends of the time-series clusters under wet and dry conditions.

We then apply time-series clustering to further explore the changing rate of traffic flow on the three day types (Figure 7). The changing rate is calculated as

$$\rho = \frac{a - b}{b} \times 100\%$$

where $\rho$ is the changing rate (relative to dry days), $a$ is the traffic flow on wet days, and $b$ is the traffic flow on dry days. Three clusters can be identified on each day type. In particular, higher changing rates appear in two periods on weekdays (12:00 a.m. to 5:00 a.m. and 8:00 p.m. to midnight), while they normally occur between 10 a.m. and 6 p.m. on Saturday and between 6 a.m. and 8 p.m. on Sunday. Based on the changing rate profile of each cluster, it is reasonable to classify cluster 1 as stable, cluster 2 as slightly fluctuant, and cluster 3 as fluctuant (see Figure 7).

The corresponding spatial distributions of the time-series traffic flow clusters (traffic flow change patterns) are shown in Figure 8. On weekdays and Sundays, there is no significant difference in the spatial distribution of the time-series clusters between dry and wet weather conditions. On Saturday, however, most TSCs in the CBD area are classified into cluster 1 (traffic flow change patterns of the low traffic flow group) in wet weather, whereas most TSCs in the CBD area are labelled as cluster 2 (traffic flow change patterns of the moderate traffic flow group) in dry weather.
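As a worked illustration of the changing rate defined earlier in this subsection (wet-day flow minus dry-day flow, divided by dry-day flow, times 100%), the sketch below computes it interval by interval. The function name and the toy flows are hypothetical, and returning `None` when the dry-day flow is zero is an assumption; the text does not say how such intervals were handled.

```python
def changing_rate(wet_flow, dry_flow):
    """Return rho = (a - b) / b * 100 for each interval.

    a is the wet-day flow and b is the dry-day flow. Intervals with
    b == 0 yield None (an assumption; the rate is undefined there).
    """
    rates = []
    for a, b in zip(wet_flow, dry_flow):
        rates.append(None if b == 0 else (a - b) * 100.0 / b)
    return rates


# Toy 15-minute flows (made-up values, for illustration only).
wet = [120, 90, 0]
dry = [100, 100, 0]
rates = changing_rate(wet, dry)
print(rates)
```

A positive value means more traffic on wet days than on dry days in that interval, and a negative value means less.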
The time-series clusters of changing rates at each TSC are displayed spatially in Figure 9a, which suggests that traffic flow fluctuates more markedly on weekdays, while it is relatively stable on Sunday. Using the Combinatorial Or tool [43] in ArcGIS, we further grouped the shifting patterns of traffic flow at each TSC into three classes: Increased (TSCs shift from the low traffic flow cluster on dry days to the high traffic flow cluster on wet days), Decreased (TSCs shift from the high traffic flow cluster on dry days to the low traffic flow cluster on wet days), and No Change (TSCs belong to the same cluster class on both dry and wet days), as shown in Figure 9b. More TSCs fall into the "Decreased" class on Saturday, while more TSCs belong to the "Increased" class on Sunday. In the modelling section, we use these TSC classes to determine the impact of the spatial character of TSCs on traffic flows.
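The three shifting classes can be illustrated with a small sketch. It does not reproduce the ArcGIS Combinatorial Or tool; it only mirrors the resulting labels. The cluster coding (1 = low, 2 = moderate, 3 = high), the treatment of any upward or downward move as "Increased" or "Decreased", and the TSC identifiers are all assumptions made for the example.

```python
def shift_class(dry_cluster, wet_cluster):
    """Label a TSC by how its cluster changes from dry to wet days.

    Clusters are coded 1 = low, 2 = moderate, 3 = high traffic flow group.
    Assumption: any move to a higher cluster counts as "Increased" and
    any move to a lower cluster counts as "Decreased".
    """
    if wet_cluster > dry_cluster:
        return "Increased"
    if wet_cluster < dry_cluster:
        return "Decreased"
    return "No Change"


# Hypothetical TSC identifiers with (dry, wet) cluster labels.
tscs = {"TSC_A": (1, 3), "TSC_B": (3, 1), "TSC_C": (2, 2)}
shifts = {tsc: shift_class(dry, wet) for tsc, (dry, wet) in tscs.items()}
print(shifts)
```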
In summary, compared with dry weather conditions, traffic flow increases under wet weather on weekdays and Sunday, while it declines on Saturday (see Figure 9b). In wet weather, vehicles normally run slowly because of the slippery road surface, which may reduce the capacity of the public transport system to carry passengers. In addition, the speed awareness monitors deployed across the city [44] may also change the speed limit to slow down driving speed. On weekdays and Sundays, people have to travel for certain compulsory activities (e.g., work schedules and religious activities on Sunday mornings) regardless of the weather; that is, the traffic demand is fixed in this situation. Passengers would therefore be more likely to take private or commercial vehicles to avoid delays and to reduce the discomfort caused by inclement weather. This can also be partially explained by the findings in [45] that, under inclement weather conditions, travel distance decreases, except for commuting trips, and the proportion of people choosing to walk or cycle also decreases. On Saturday, in contrast, people may choose to stay at home and some outdoor activities might be cancelled because of inclement weather; therefore, the demand for extra vehicles would fall.
Overall, the above results from using the visualisation techniques reveal both the spatial and temporal distribution of traffic flow at each TSC considering weather impact. Next, we investigate how daily rainfall impacts traffic flow at each TSC using statistical modelling methods.

Modelling Weather's Impact on Traffic Flow
To thoroughly investigate weather's impact on traffic flow, we develop three levels of statistical analysis in this section: first the comprehensive level, then the location-specific level, and finally, the aggregate level, as explained below in detail. By doing so, consistent results on rainfall's impact on traffic flow can be convincingly obtained.

Comprehensive Level
At this level, we develop statistical models to quantitatively investigate how traffic flow is affected by potentially important factors, such as day type, weather condition (daily rainfall), and the TSC classification. Specifically, two dependent variables are considered: the total traffic flow at each TSC on each day (TOTAL_FLOW) and the time-series cluster classification (Cluster_ID = 1, 2, 3). The independent variables are the day type (DAY_OF_WEEK; Weekday = 1, Saturday = 2, Sunday = 3), the daily rainfall (RAIN_ACC), and the TSC classification. According to the results in Section 5.1, we can categorize TSCs in two ways (Figure 9): (1) by the cluster of the traffic flow changing rate at each TSC location (Stable = 1, Slightly Fluctuant = 2, Fluctuant = 3), denoted as TSC_CLASS_A, and (2) by the changing direction of the clusters (Increased = 1, Decreased = 2, No Change = 3), denoted as TSC_CLASS_B. We apply LR to model the total traffic flow at each TSC and MLR/OLR to model the time-series cluster classification. To compare the models built with these two TSC classifications, we apply the Akaike Information Criterion (AIC), which reflects the relative quality of statistical models for a given set of data. The best models using LR, MLR, and OLR are summarized in Table 2. Note that we select 70% of the raw data as training data to calibrate the MLR/OLR models and the remaining 30% as testing data to validate them. Table 3 presents the summary statistics of the variables adopted in the LR model (Model_2) and in the MLR model (Model_3) and OLR model (Model_5); the same variables are included in the MLR and OLR models.
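The 70%/30% division into training and testing data can be sketched as below. This is only one plausible way to make such a split: the text does not state whether the observations were sampled at random, so the shuffling, the seed, and the function name `split_train_test` are assumptions.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows reproducibly and split them into two lists.

    Random shuffling and the fixed seed are assumptions for this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder observations standing in for TSC-day records.
observations = list(range(10))
train, test = split_train_test(observations)
print(len(train), len(test))
```

The models would then be calibrated on `train` only, with `test` held back for validation.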
We use variance inflation factors (VIF) to detect multicollinearity. The results (Table 4) show that the VIF values of all the independent variables are less than 5.0, which indicates that no multicollinearity has been detected [10] and that including these independent variables simultaneously would not affect the analytic results of the models.
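For reference, the standard textbook definitions of the two diagnostics used at this level are given below. The notation ($k$, $\hat{L}$, $R_j^2$) is introduced here for illustration and does not come from the original text.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

Here $k$ is the number of estimated parameters, $\hat{L}$ is the maximised likelihood of the model, and $R_j^2$ is the coefficient of determination obtained by regressing the $j$-th independent variable on the remaining ones. Among models fitted to the same data, a lower AIC indicates better relative quality, and a VIF below 5.0 corresponds to $R_j^2 < 0.8$.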
The estimated parameters of the LR model (Model_2) are listed in Table 5, which shows that rainfall (RAIN_ACC) has a significant impact on the total traffic flow. Holding other factors constant, a one-unit increase in rainfall (RAIN_ACC) would increase the total traffic flow by 861 vehicles. Day type also has a significant impact: the total traffic flow declines on Saturday and Sunday compared with weekdays. In addition, the total traffic flow is higher at TSCs whose shifting pattern belongs to the "Decreased" category (TSC_CLASS_B = 2) than at TSCs in the "Increased" category. The impact of TSC_CLASS_B = 3 (p-value > 0.05) on the total traffic flow is marginally significant.

The parameters of the MLR model (Model_3) are summarized in Table 6, which shows that all the variables used to calculate the odds of a time-series cluster being classified into traffic flow change patterns of the low traffic flow group rather than the moderate traffic flow group are statistically significant. Controlling for other factors, a one-unit increase in rainfall (RAIN_ACC) raises the odds of a time-series cluster being classified into the low traffic flow group rather than the moderate traffic flow group by 26% (i.e., exp(0.235) − 1 = 0.26). Meanwhile, relative to weekdays, these odds are 64% (i.e., exp(0.495) − 1 = 0.64) larger on Saturday and 25% (i.e., exp(0.224) − 1 = 0.25) larger on Sunday.
Finally, for time-series clusters at TSCs with a slightly fluctuant traffic flow changing rate (TSC_CLASS_A = 2) and at TSCs with a fluctuant traffic flow changing rate (TSC_CLASS_A = 3), the odds of being categorized into traffic flow change patterns of the low traffic flow group rather than the moderate traffic flow group are more than five times (i.e., exp(1.868) − 1 = 5.48) and more than nine times (i.e., exp(2.349) − 1 = 9.48) greater, respectively, relative to clusters at TSCs with a stable traffic flow changing rate (TSC_CLASS_A = 1). This implies that moderate traffic flow is often stable, with a small changing rate.

For the MLR model, the effect plot of RAIN_ACC and DAY_OF_WEEK (Figure 10a) shows that, on dry weekdays, the probability of a cluster being classified into traffic flow change patterns of the low traffic flow group is slightly higher than that of being classified into the moderate traffic flow group. As rainfall increases, the probability of a time-series cluster being categorised into the low traffic flow group rises on all three day types (weekdays, Saturday, and Sunday), while the probabilities of being categorised into the moderate and high traffic flow groups both drop. Overall, the probability of being categorised into the high traffic flow group is low on both dry and wet days. Similar patterns occur on dry and wet Saturdays and Sundays. Among the three day types, the probability of a time-series cluster being classified into the low traffic flow group is highest on Saturdays, on both wet and dry days.
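The percentage changes in odds quoted for the MLR model follow from exponentiating each coefficient. The short sketch below reproduces that arithmetic from the coefficients given in the text; the function name and the dictionary labels are introduced here for illustration.

```python
import math


def pct_change_in_odds(coef):
    """Convert a logit coefficient into a percentage change in the odds."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients as quoted in the text for the low vs. moderate comparison.
coefs = {
    "RAIN_ACC": 0.235,
    "Saturday": 0.495,
    "Sunday": 0.224,
    "TSC_CLASS_A = 2": 1.868,
    "TSC_CLASS_A = 3": 2.349,
}
changes = {name: pct_change_in_odds(c) for name, c in coefs.items()}
for name, change in changes.items():
    print(f"{name}: {change:.0f}% change in odds")
```

A coefficient of zero would correspond to no change in the odds, and larger positive coefficients correspond to multiplicatively larger odds.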
The effect plot of RAIN_ACC and TSC_CLASS_A is shown in Figure 10b.

Figure 11. Variable Effect Plot of the OLR model.

Next, we cross-validate the performance of Model_3 and Model_5 using the testing data and the confusion matrix. Table 8 summarizes the prediction indicators for the MLR and OLR models. The prediction accuracy for traffic flow change patterns of the low traffic flow group is the best among the three clusters for both models. Overall, both the MLR and OLR models can reasonably predict each cluster, with a prediction accuracy of over 50% for each cluster.
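To make the confusion-matrix evaluation concrete, the sketch below derives a per-cluster accuracy (the share of each true cluster that is predicted correctly) and an overall accuracy from a 3 × 3 matrix. The matrix values are made up, and which indicators Table 8 actually reports is not stated in the text, so treating per-cluster accuracy this way is an assumption.

```python
def per_class_accuracy(matrix):
    """matrix[i][j] = number of samples of true class i predicted as class j."""
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append(row[i] / total if total else None)
    return result


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal of the matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up confusion matrix for the low, moderate and high groups.
toy_matrix = [
    [80, 15, 5],
    [20, 60, 20],
    [10, 30, 60],
]
class_acc = per_class_accuracy(toy_matrix)
total_acc = overall_accuracy(toy_matrix)
print(class_acc, total_acc)
```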

Location-Specific Level
At this level, we test whether rainfall has a significant impact on traffic flow at fixed locations. The analysis above indicates that TSC location can have a notable impact on the relationship between weather conditions and traffic flow. To derive more insight, we develop a linear regression model for each TSC to reveal the spatial impact of weather on traffic flows. In total, 877 models are developed, and rainfall appears to be a significant factor (p-value < 0.05) for 266 TSCs. For these TSCs, the inverse distance weighted tool in ArcGIS [26] is used to visualise the impact of rainfall on traffic flow at each TSC, based on the coefficient of rainfall in the linear regression model (Figure 12). Overall, Figure 12 shows that these models consistently indicate that traffic flow at each TSC tends to increase as rainfall increases; moreover, the traffic flows of TSCs in inner districts such as Kangaroo Point, Coorparoo, Fortitude Valley, and Spring Hill are more likely to be affected by rainfall than those in the suburbs.
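The idea behind inverse distance weighted interpolation can be sketched generically: the value at an unsampled point is a weighted average of nearby sample values, with weights that shrink with distance. The code below is a plain illustration of that idea and not the ArcGIS tool; the power of 2, the planar coordinates, and the sample coefficients are assumptions, since the text does not report the tool settings.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    `samples` is a list of (xi, yi, value) tuples, e.g. TSC coordinates
    with their rainfall coefficients. power=2.0 is an assumed setting.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two hypothetical TSCs with made-up rainfall coefficients.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(1.0, 0.0, samples)
at_sample = idw(0.0, 0.0, samples)
print(midpoint, at_sample)
```

Evaluating such a function on a regular grid yields a continuous surface of the kind mapped in Figure 12.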

Aggregate Level
To further confirm the conclusions drawn from the location-specific (TSC-controlled) analysis, we also model the impact of rainfall on traffic flow at the aggregate level across all locations. The dependent variable is the total traffic flow in each 15 min interval on each day (SUM_FLOW), and the independent variables are the rainfall in each 15 min interval (RAIN_ACC_15min) and the day type on which the data were captured (DAY_OF_WEEK). The summary statistics of the variables are given in Table 9; based on a total of 1152 observations, we obtain the LR model (Table 10). In this model, all the variables (p < 0.05) are significantly associated with traffic flow, which is in line with the results above. In particular, rainfall has a positive impact on the total traffic flow, which is consistent with the conclusions drawn at the location-specific and comprehensive levels.
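One plausible way to write the aggregate-level model is given below. It is a sketch under stated assumptions rather than the authors' exact specification: the coefficient symbols $\beta_0, \dots, \beta_3$, the error term $\varepsilon_t$, the interval index $t$, and the indicator notation are introduced here, and the dummy coding follows the note that DAY_OF_WEEK = 1 is the reference category.

```latex
\mathrm{SUM\_FLOW}_t = \beta_0
  + \beta_1\,\mathrm{RAIN\_ACC\_15min}_t
  + \beta_2\,\mathbb{1}[\mathrm{DAY\_OF\_WEEK}_t = 2]
  + \beta_3\,\mathbb{1}[\mathrm{DAY\_OF\_WEEK}_t = 3]
  + \varepsilon_t
```

Under this form, a positive $\beta_1$ corresponds to the positive impact of rainfall on total traffic flow, while $\beta_2$ and $\beta_3$ measure the differences of Saturday and Sunday from weekdays.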

Note: For the independent variable, DAY_OF_WEEK = 1 is selected as the reference category.

Discussion and Conclusions
By utilizing the space-time cube, time-series clustering, and statistical modelling, we have qualitatively and quantitatively analysed rainfall's impact on traffic flow spatially and temporally, using the TSC datasets in Brisbane, Australia, as a case study.
The main contribution of this study is that the spatial-temporal impact of inclement weather conditions on traffic flow has been consistently detected using visual detection and modelling at different levels. The findings regarding the fundamental relations among traffic flow, traffic flow change patterns, weather conditions, day types, and the spatial distribution of traffic flow are summarized below.
From the qualitative approach using the space-time cube and time-series clustering, we have shown how the spatial-temporal patterns of traffic flow and the distribution of traffic flow change patterns differ under various weather conditions on different day types. To gain more insight into the relationship between these spatial-temporal patterns and rainfall, a series of statistical models has been developed.
From the quantitative approach using statistical models, rainfall's impact on traffic flow is consistently detected at three different levels. First, at the comprehensive level (i.e., all the important factors, such as day type, daily rainfall, and location, are considered simultaneously in a single model), rainfall, day type, and location are all significantly correlated with traffic flow and with traffic flow change patterns. In general, traffic flow is highest on weekdays and low on Sunday under both dry and wet weather conditions; traffic flow in wet weather normally increases compared with that in dry weather; and traffic flow is higher at locations where the change pattern shifts from the high traffic flow pattern on dry days to the low traffic flow pattern on wet days than at locations with the opposite shifting direction.
Meanwhile, regarding the traffic flow change patterns, the total traffic flows at the locations on dry days and on wet days are each classified into three groups (low, moderate, and high traffic flow groups). The traffic flow at each location on dry or wet days is thus assigned to one specific traffic flow group, from which the related traffic flow change pattern can be detected. The classification results show that the amount of traffic flow in each group on wet days is higher than in the corresponding group on dry days. Furthermore, rainfall has a notable impact on traffic flow change patterns, as summarized below.
On dry days, the odds of the low traffic flow group occurring at locations with a fluctuant changing rate of traffic flow are higher than at all other locations, whereas the dry-day moderate and high traffic flow groups are more likely to occur at locations with a stable changing rate. The same patterns hold in wet weather. The heavier the rainfall, the higher the odds of the low traffic flow group occurring at locations with a fluctuant changing rate. Rainfall also increases the odds of the low traffic flow group occurring on Saturdays, which is in line with the implication in [46] that more frequent precipitation events decrease the number of leisure trips on specific days. Additionally, as rainfall increases, although the probabilities of the moderate and high traffic flow groups occurring at locations with a stable changing rate decline, these groups are still more likely to appear at such locations than elsewhere. Overall, these results illustrate rainfall's spatial and temporal impact on the classified traffic flow group and the traffic flow change pattern at each location, and reveal how the traffic flow at each location changes over time on dry and wet days.
Second, the location-specific analysis shows that locations in the urban inner space, such as the CBD, are more likely to be affected by inclement weather and that rainfall's impact on traffic flow is larger in the inner urban region than in the outer urban area. Finally, the aggregate-level analysis detects the impact of rainfall on traffic flow across different locations; specifically, rainfall positively affects the total traffic flow.
To summarize, we have implemented various methodologies to comprehensively analyse the impact of rainfall on traffic flow, both qualitatively and quantitatively, at three levels (i.e., comprehensive, location-specific, and aggregate). The results consistently show that both traffic flow and traffic flow change patterns are significantly affected by inclement weather conditions (i.e., traffic flow would increase on wet days, especially on weekdays) and that rainfall can induce changes in traffic flow temporally (on weekdays, Saturday, and Sunday and at various periods of each day) and spatially (at different locations in the transportation network).
By recognising the spatial-temporal impacts of rainfall on traffic flow, the locations and time periods that are significantly influenced by rainfall can be identified, which planners and researchers can use to model the impact of rainfall on transport infrastructure more accurately. Furthermore, since our analysis shows an increase in traffic flow in wet weather, this finding can have important implications for traffic congestion and road safety. In wet weather, drivers usually drive more cautiously and more slowly to avoid collisions, because friction between the vehicle's tires and the road surface is reduced; combined with the higher traffic flow, this behaviour means that traffic congestion is more likely to occur.

Some limitations of this paper will be addressed in our future work. First, this paper only considers two months' traffic data in Brisbane; in the future, more data will be used to cover more diverse weather conditions. Moreover, public transit should be considered in future studies. Finally, we have applied MLR and OLR to statistically model the spatial-temporal relation between weather conditions and traffic flow; in the future, data mining and deep learning methods could be considered.