Prediction of Vacant Parking Spaces in Multiple Parking Lots: A DWT-ConvGRU-BRC Model

Abstract: For cities, the problem of "difficult parking and chaotic parking" increases carbon emissions and reduces quality of life. Accurately and efficiently predicting the availability of vacant parking spaces (VPSs) can help motorists reduce the time spent looking for a parking space and reduce greenhouse gas pollution. This paper proposes a deep learning model called DWT-ConvGRU-BRC to predict the future availability of VPSs in multiple parking lots. The model first uses a discrete wavelet transform (DWT) to denoise the historical parking data and then extracts the temporal correlation of the parking lots themselves and the spatial correlation between different parking lots using a convolutional gated recurrent unit network (ConvGRU), while using a BN-ReLU-Conv (1 × 1) module to further improve the propagation and reuse of features in the prediction process. In addition, the model uses availability, temperature, humidity, wind speed, weekdays, and weekends as inputs to improve the accuracy of the forecasts. The model performance is evaluated through a case study of 11 parking lots in Santa Monica. The DWT-ConvGRU-BRC model outperforms the LSTM and GRU baseline methods, with an average testing MAPE of 2.12% when predicting multiple parking lot occupancies over the subsequent 60 min.


Introduction
With economic and population growth, motor vehicle ownership is growing rapidly, which has exacerbated the imbalance between the supply and demand of vacant parking spaces (VPSs) in cities. Drivers typically spend 3.5-14 min looking for a VPS, and cruising to find a VPS accounts for 8-74% of traffic [1]. Excessive time spent by drivers looking for VPSs increases time costs, fuel consumption, and emissions and leads to traffic congestion [2]. Parking difficulties and disorderly parking problems are often affected by accessibility, parking prices, and the number of VPSs.
To address this problem, some parking management and guidance systems have been developed to provide real-time VPS information [3-6]. These systems typically collect real-time parking availability data using cameras and sensors [7,8]. In addition, several crowd-sensing-based schemes monitor the availability of street parking using mobile communication devices and in-vehicle sensors [9,10]. However, these parking guidance systems cannot guarantee that a VPS reported in real time is still free on arrival: when a driver reaches the designated parking space, it may already be occupied. Due to the high cost of sensor equipment and of maintaining real-time parking information, Y et al. [11] and Baidu Maps proposed a model named Du-parking to estimate the real-time parking availability of the whole city. Therefore, to enable car owners to find parking spaces purposefully, it is necessary to develop a parking guidance information system with predictive algorithms that can help drivers plan driving routes to find VPSs and reduce driving costs [12], and that can assist traffic planning and management in reducing energy consumption and traffic congestion [2].
Appl. Sci. 2023, 13, 3791
In this paper, we propose a DWT-ConvGRU-BRC model. This model consists of a discrete wavelet transform (DWT) [13], convolutional GRU networks (ConvGRUs) [14], a two-layer linear network, and a composite function of three consecutive operations, i.e., batch normalization (BN), rectified linear activation (ReLU), and a 1 × 1 convolution (Conv), denoted BRC. First, we use the DWT to denoise the VPS data. Noise reduction before forecasting can eliminate the volatility of the VPS data themselves. Then, we use a deep learning-based prediction model that leverages ConvGRUs and a two-layer linear network to incorporate the spatial-temporal features of multiple data sources acquired in networks. Finally, the propagation and reuse of features in the prediction process are further improved using the BRC composite function.
This paper contributes to the literature in the following ways:
• We propose a deep learning-based parking space prediction model from the perspective of multiple parking lots. The model considers the processing of noisy parking data as well as the spatial correlation of multiple parking lots and the temporal correlation of the parking lots themselves, and uses a variety of factors, including parking lot occupancy, temperature, humidity, wind speed, weekdays, and holidays, to predict the number of available VPSs.

• Our proposed DWT-ConvGRU-BRC model can simultaneously predict the number of available parking spaces in multiple parking lots. Specifically, a ConvGRU is used to capture the spatial-temporal features of multiple parking lots, a two-layer linear network is used to extract external influences, and BRC is used to further improve the propagation and reuse of features in the prediction process.

• The performance of the method is evaluated with a case study in the Santa Monica area. According to the results, the model outperforms other baseline methods, including LSTM, GRU, ConvGRU, and dConvLSTM-DCN models. Moreover, the results demonstrate the improvement in prediction accuracy from the DWT and the effectiveness of incorporating weekday, vacation, and weather features into parking lot occupancy predictions.
The rest of this paper is organized as follows: Section 2 reviews the literature. Section 3 describes the DWT-ConvGRU-BRC prediction model in detail. Section 4 presents the results and analysis of the comparison experiments. Finally, we provide our conclusions and discuss possible future work in Section 5.

Literature Review
Access to VPS data has become easier with breakthroughs in sensor technology. However, the VPS data obtained in practical applications are often subject to different degrees of noise pollution. How to effectively process the data collected by sensors and improve the accuracy of algorithms is a thorny problem that many existing prediction methods still face. To solve this problem, wavelet analysis has been applied in some recent studies and has proven to be effective. For example, Li et al. [15] used the wavelet function for multiscale wavelet decomposition and reconstruction of VPS data using the hidden layer function of a wavelet neural network to improve prediction accuracy. Ji et al. [16] proposed a multistep prediction study of impacted parking spaces based on a wavelet transform (WT) in combination with a multistep prediction strategy, using threshold noise reduction to further improve the prediction accuracy. Therefore, effective noise removal helps to improve the efficiency and accuracy of prediction.
Predicting the occupancy rate of multiple parking lots is one of the necessary links to solve the "difficult parking" problem. In recent years, VPS prediction methods have fallen into two categories: those based on statistical prediction models and those based on machine learning (ML) and deep learning (DL). For statistical prediction models, Caliskan et al. [17] combined continuous Markov and queuing theory models to predict the occupancy status of parking lots in the destination area. On this basis, Xiao et al. [18] proposed a continuous-time Markov M/M/C/C model for predicting available parking spaces. Caicedo et al. [19] proposed a real-time availability forecasting algorithm that uses historical information to predict the availability of each parking lot. In addition, Rajabioun et al. [20] developed a vector spatiotemporal autoregressive model that can be used to predict the availability of parking spaces at a driver's estimated arrival time at both on-street and off-street parking locations. Peng et al. [21] modelled the discrete occupancy rate of a parking lot as a nonstationary Poisson process and proposed a cost-effective method for searching for parking spaces. Abdeen et al. [22] proposed a smart parking algorithm that varied the weights of five factors (availability, gate wait time, parking cost, traffic congestion, and driving distance to the parking lot) to achieve balanced traffic allocation and the best use of parking lots. In fact, these statistical prediction models are highly dependent on assumptions about the arrival and departure process and therefore have difficulty adapting to the dramatic fluctuations in parking traffic flow.
For ML/DL prediction models, researchers have applied models such as regression trees, support vector machines (SVMs), support vector regression (SVR), neural networks, K-nearest neighbour (KNN), and random forest models to predict parking availability [23-27]. Hu et al. [28] combined SVR and the fruit fly optimization algorithm (FOA) to predict the number of vacant parking spaces. Fan et al. [29] optimized a multi-step long short-term memory recurrent neural network (LSTM-NN) model with a grid search method to predict the number of vacant parking spaces. Moreover, many scholars have combined nonlinear system theory and optimization algorithms with neural networks to improve prediction accuracy. For example, Vlahogianni et al. [30] used a genetic algorithm-optimized multilayer perceptron (MLP) to predict the occupancy rate of a regional parking lot over the subsequent 30 min. Camero et al. [31] used a genetic algorithm (GA) combined with a recurrent neural network (RNN) to predict parking occupancy in Birmingham. Zeng et al. [32] combined a wavelet transform (WT) with bi-directional LSTM (Bi-LSTM) to further improve the prediction accuracy using threshold noise reduction.
In addition, some scholars have considered the influence of external factors, such as weather and holidays, on VPS forecasting. Fokker et al. [33] explored the influence of external factors such as weather on parking occupancy and found that external factors improved the predictive performance by 8%. In Zhang's [34] work, a PewLSTM was proposed for predicting parking behaviour by combining the effects of weather and parking periodicity. Zeng et al. [35] proposed a stacked gated recurrent unit (GRU)-LSTM model that combined the efficiency of a GRU and the accuracy of LSTM and incorporated various factors as inputs, such as weather, to predict the availability of parking spaces. ML/DL methods can automatically learn from past samples to better describe complex nonlinear problems. However, the ML/DL methods described above only consider the temporal correlation of VPS data and fail to consider the spatial correlation of VPSs in multiple parking lots.
Therefore, this paper proposes a model called DWT-ConvGRU-BRC to predict the number of VPSs in multiple parking lots. Our model combines the advantages of the wavelet transform. While capturing the spatial-temporal correlation of multiple parking lot data, it also takes external factors such as weather as input to improve the accuracy of the model. Previous research related to our methodology includes DWT-Bi-LSTM [32] and dConvLSTM-DCN [36].

Data Description
To evaluate the performance of the proposed prediction model, we conducted a case study in Santa Monica, CA, USA (longitude range: [−118.499378, −118.49361], latitude range: [34.019575, 34.010806]) [37], which has 11 parking lots scattered over the road network, as illustrated in Figure 1. The data were collected from 6 April 2021 to 13 May 2021. The number of VPSs was collected every 5 min, resulting in 10,944 pieces of historical data per parking lot.
We use a 100 m × 100 m grid to divide the target area into H × W grids (Figure 1). Each parking lot in the region falls in a different grid, and a grid without a parking lot is considered to have no VPSs. Then, the number of VPSs in the area at time t is denoted as:
$$X_t = \left[ v_t^{(h,w)} \right] \in \mathbb{R}^{H \times W} \quad (1)$$
where each element of the matrix, $v_t^{(h,w)}$ with h ∈ [0, H] and w ∈ [0, W], is the number of VPSs in the grid (h, w). This area is divided into a total of 60 grids, with H being 10 and W being 6.
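As a rough illustration of the gridding step, the sketch below maps parking lots to 100 m cells and fills an H × W matrix in which cells without a lot stay at zero. The function names, the equirectangular distance approximation, the choice of a south-west origin, and the sample coordinates are all assumptions made for illustration; they are not taken from the study.

```python
import math

# Grid geometry from the text: 100 m cells, H = 10 rows, W = 6 columns.
CELL_M, H, W = 100.0, 10, 6

def to_cell(lat, lon, lat0, lon0):
    """Map (lat, lon) to a (row, col) cell measured from an assumed
    south-west corner (lat0, lon0), using an equirectangular approximation."""
    m_per_deg_lat = 111_320.0  # approximate metres per degree of latitude
    m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(lat0))
    row = int((lat - lat0) * m_per_deg_lat // CELL_M)
    col = int((lon - lon0) * m_per_deg_lon // CELL_M)
    return row, col

def build_grid(lots, lat0, lon0):
    """lots: iterable of (lat, lon, vacant_spaces) for one time step.
    Returns an H x W matrix; cells without a parking lot remain 0."""
    grid = [[0] * W for _ in range(H)]
    for lat, lon, vps in lots:
        r, c = to_cell(lat, lon, lat0, lon0)
        if 0 <= r < H and 0 <= c < W:  # ignore lots outside the study area
            grid[r][c] += vps
    return grid

# Hypothetical lots (made-up coordinates and counts, for illustration only).
sample = [(34.0005, -118.4995, 120), (34.0015, -118.4975, 45)]
x_t = build_grid(sample, lat0=34.0, lon0=-118.5)
```

Stacking one such matrix per 5 min interval would give the sequence of spatial snapshots that a convolutional recurrent model consumes.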
The 11 parking lots we selected in the grid area are the St1-St9 parking lots, the Lot1 parking lot, and the Library parking lot. These parking lots are mainly distributed in recreational, commercial, and residential areas. It is worth noting that there are similarities and differences in the evolution of the number of spaces in these parking lots. Intuitively, the more similar the parking lot types and the shorter the distance between lots, the stronger their spatiotemporal correlation should be. We take the Lot1, St5, and St7 parking lots as examples to examine the characteristics of different parking lots from the perspective of spatiotemporal correlation, considering that the St5 and St7 parking lots represent commercial areas and are close to each other, and the Lot1 parking lot represents entertainment areas.
Figures 2 and 3 show the spatiotemporal characteristics of these three parking lots. The x-axis represents the time interval. The y-axis represents the change in VPSs, where a positive number represents the outflow of vehicles; the larger the number is, the greater the number of VPSs. Figure 2 shows that the inflow on weekends is significantly higher than that on weekdays for almost the entire day in the Lot1 parking lot. The Lot1 parking lot represents an entertainment area, which tends to be crowded on weekends. In contrast, between 7 a.m. and 9 a.m., the inflow of the St5 and St7 parking lots is higher on weekdays than on weekends, which may be influenced by the parking of mall workers.
In addition, we explore the impact of weather factors on the availability of VPSs. Parking occupancy differs between hazardous weather and normal weather. It can be imagined that when encountering hazardous weather such as heavy rain, heavy snow, and smog, people may reduce travel in private cars, so the number of available parking spaces will increase. The results for the three representative parking lots are shown in Figure 3. For the entertainment area represented by Lot1 and the commercial areas represented by St5 and St7, a significant decrease in parking occupancy was observed for all hours of the day under hazardous weather conditions. To assess the impact of hazardous weather conditions on parking demand, we define weather as hazardous if one or more of the following conditions are met: (1) fog or snow, (2) wind speed greater than 39 km/h, or (3) precipitation intensity greater than 0.15 inches per hour. All other conditions are considered normal weather conditions. Similar to the research of Yang et al. [38] and Zhao et al. [39], we conduct ablation experiments in Section 4 to explore the influence of external factors such as weather on parking prediction.
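The hazardous-weather rule stated in this section can be written as a small predicate. The following is a minimal sketch; the function and parameter names are hypothetical, and only the three thresholds come from the text.

```python
def is_hazardous(fog, snow, wind_kmh, precip_in_per_h):
    """Return True if at least one of the three conditions holds:
    (1) fog or snow, (2) wind speed above 39 km/h,
    (3) precipitation intensity above 0.15 inches per hour."""
    return bool(fog or snow or wind_kmh > 39.0 or precip_in_per_h > 0.15)

# Example labels for three hypothetical weather records.
labels = [
    is_hazardous(False, False, 12.0, 0.0),   # calm and dry
    is_hazardous(True, False, 5.0, 0.0),     # foggy
    is_hazardous(False, False, 20.0, 0.30),  # heavy rain
]
```

A binary label of this kind is one simple way to split the occupancy records into the hazardous and normal groups that are compared in Figure 3.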

Prediction Model
The DWT-ConvGRU-BRC model proposed in this study consists of four components, namely, the DWT component, three ConvGRU components, the meta-info feature extraction component, and the BRC component (Figure 4). The first component is the DWT module, which performs noise reduction on VPS data by means of the db3 wavelet basis function. The second component is the ConvGRU module. A CNN can capture spatial correlation well but not temporal correlation. A GRU and LSTM can both model temporal correlation well, but a GRU maintains the prediction accuracy while reducing the running time compared to LSTM. Therefore, integrating a CNN and a GRU into a three-layer ConvGRU network makes it possible to capture both temporal and spatial correlations. The third component is a two-layer linear module that incorporates external factors such as temperature, wind speed, humidity, weekdays, and vacations into the model to enhance the accuracy of long-term forecasts. Finally, feature fusion is performed using the BRC layer, and predictions are obtained via the sigmoid function.

Discrete Wavelet Transform (DWT) Denoising
Time series data obtained in practical applications are often contaminated by various forms and degrees of noise. The discrete wavelet transform is very appropriate for noise filtering, which makes it a good choice for time series data processing [40]. When time series data are decomposed by a DWT, the original signal is separated into approximation coefficients and detail coefficients at different resolution levels. The information of the original signal is retained in the wavelet coefficients, and a perfect reconstruction of the original data can be performed from these coefficients. However, some of the detail coefficients that represent the detailed motion in the data can be identified as noise. These coefficients can then be set to zero prior to the DWT reconstruction process to filter out the noise from the original time series, so that the time series is reconstructed from every component except the noise. In other words, a DWT is a discretization of the scales and translations of the fundamental wavelet. A DWT can be defined (here in the common dyadic form) as:
$$W_f(m, n) = 2^{-m/2} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(2^{-m} t - n\right) dt \quad (2)$$
where f(t) is the signal, $\psi^{*}$ is the complex conjugate of $\psi$, the wavelet function $\psi$ satisfies $\int_{-\infty}^{+\infty} \psi(t)\, dt = 0$, and m and n are integers.
Appropriate wavelet basis functions are very important for extracting the features of parking data. Kaplun et al. [41] selected the appropriate wavelet basis function based on entropy estimation in the matching pursuit algorithm. Bhavsar et al. [42] chose the appropriate wavelet basis function by calculating the magnitude of the mutual information. In this paper, we measure the dependence between two variables by calculating the normalized mutual information (NMI), which maps the mutual information into [0, 1] and makes it easy to choose a suitable wavelet basis function. The NMI is defined as:
$$\mathrm{NMI}(X, Y) = \frac{2\left[ H(X) - H(X \mid Y) \right]}{H(X) + H(Y)} \quad (3)$$
where H(X) and H(Y) represent the entropy of variables X and Y, respectively, and H(X|Y) is the conditional entropy of X given Y.
In this study, the basis functions of the compared wavelets are Daubechies (db3), symlet (sym3), and coiflet (coif3). It can be seen from Figure 5 that db3 has the highest NMI relative to the other wavelet functions. After experimental comparison, the db3 wavelet basis is selected for wavelet decomposition of the experimental time series, and the number of decomposition levels is 3, which removes the noise while preserving the fluctuation characteristics of the time series data as much as possible.
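To make the NMI criterion concrete, the sketch below computes a normalized mutual information for two discrete sequences. It assumes the common normalization 2·I(X;Y)/(H(X)+H(Y)), with I(X;Y) = H(X) − H(X|Y) = H(X) + H(Y) − H(X,Y); the exact normalization used in the study may differ, and real-valued VPS series would first need to be binned into discrete levels. All function names are illustrative.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def nmi(x, y):
    """Normalized mutual information in [0, 1] under the assumed
    normalization 2 * I(X;Y) / (H(X) + H(Y))."""
    hx, hy = entropy(x), entropy(y)
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mi = hx + hy - hxy              # I(X;Y) = H(X) - H(X|Y)
    denom = hx + hy
    return 2.0 * mi / denom if denom > 0 else 0.0

identical = nmi([0, 1, 0, 1], [0, 1, 0, 1])    # fully dependent sequences
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # no shared information
```

Under this reading, a candidate wavelet basis would be scored by the NMI between two binned series (for instance the raw and the denoised series), and the basis with the highest score would be kept.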
We take the St7 parking lot as an example to show the specific process of the DWT component. The original VPS data and its decomposition components are shown in Figure 6, where red marks the high-frequency sequences obtained from the three-level decomposition and blue marks the low-frequency sequence. The denoising was performed using the threshold method, and then the denoised time series was reconstructed. Figure 7 shows the comparison between the original and denoised data. We can see that the overall regularity of the denoised data has increased and the trend is smoother, which is more suitable for model construction. As in Equation (1), the denoised data are denoted $X_t$.
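The decompose, threshold, and reconstruct pipeline can be sketched in a few lines. For brevity, the example below substitutes the Haar wavelet for the db3 basis used in the study, applies soft thresholding to the detail coefficients at each of three levels, and uses an arbitrary threshold value; these choices and all function names are assumptions for illustration only.

```python
import math

S = math.sqrt(2.0)

def haar_step(x):
    """One Haar analysis level: pairwise scaled sums (approximation)
    and differences (detail)."""
    approx = [(x[i] + x[i + 1]) / S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / S for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert one Haar level, interleaving the reconstructed samples."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / S, (a - d) / S))
    return out

def soft(c, thr):
    """Soft thresholding: shrink a coefficient towards zero by thr."""
    return math.copysign(max(abs(c) - thr, 0.0), c)

def denoise(x, levels=3, thr=1.0):
    """Decompose over `levels` levels, threshold details, reconstruct."""
    assert len(x) % (2 ** levels) == 0, "length must be divisible by 2**levels"
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append([soft(c, thr) for c in d])
    for d in reversed(details):  # rebuild from the coarsest level upwards
        approx = haar_inverse(approx, d)
    return approx

series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # made-up VPS counts
unchanged = denoise(series, thr=0.0)   # no shrinkage: perfect reconstruction
flattened = denoise(series, thr=1e9)   # all details removed: only the mean survives
```

The two extreme thresholds illustrate the trade-off described above: a small threshold preserves fluctuations, while a large one smooths the series towards its overall level.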

Convolutional Gated Recurrent Unit (ConvGRU)
A key difference between a ConvGRU and a GRU is that the former uses a convolution operator rather than a fully connected operator. Therefore, a ConvGRU can better capture spatial-temporal correlations. Figure 8 shows the internal structure of the ConvGRU cell, where CAT and UCAT denote the concatenation and splitting operations, respectively. The detailed information flow of a ConvGRU is shown in the following equations:
$$z_t = \sigma\left(W_{xz} * x_t + W_{hz} * h_{t-1}\right) \quad (4)$$
$$r_t = \sigma\left(W_{xr} * x_t + W_{hr} * h_{t-1}\right) \quad (5)$$
$$\hat{h}_t = \tanh\left(W_{xh} * x_t + r_t \odot \left(W_{hh} * h_{t-1}\right)\right) \quad (6)$$
$$h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \hat{h}_t \quad (7)$$
where $z_t$ denotes the update gate, $r_t$ denotes the reset gate, and $\hat{h}_t$ denotes the candidate hidden state. $x_t$ denotes the input of the current step, and $h_t$ and $h_{t-1}$ denote the hidden states of the current and previous steps, respectively. $W_{\cdot z}$, $W_{\cdot r}$, and $W_{\cdot h}$ are the convolutional kernels for each gate (the dot stands for the input x or the hidden state h). In addition, σ and tanh are the sigmoid and hyperbolic tangent activation functions, respectively, * is the convolution operation, and ⊙ denotes the Hadamard product.
Specifically, given $X_t$ (the denoised data) and $h_{t-1}$ as inputs at time step t, the unit first obtains the outputs of the update gate and the reset gate using Equations (4) and (5), respectively. Then, using Equation (6), the temporal hidden state $\hat{h}_t$ is calculated, which considers both the input $X_t$ and the hidden state $h_{t-1}$ produced by the previous ConvGRU operator. The final hidden state $h_t$ of the unit is produced by a linear combination of the temporal hidden state $\hat{h}_t$ and the previous hidden state $h_{t-1}$ using Equation (7). We denote the output of this part as $O_c$ (Equation (8)), in which m denotes the number of data points at 5 min intervals and ⊕ is the concatenation operation.
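The gate arithmetic of a single ConvGRU update can be illustrated without any deep learning library. The sketch below assumes the widely used formulation z = σ(W_xz * x + W_hz * h), r = σ(W_xr * x + W_hr * h), ĥ = tanh(W_xh * x + r ⊙ (W_hh * h)), h_new = (1 − z) ⊙ h + z ⊙ ĥ, with a single channel, zero padding, and no bias terms. The kernel dictionary keys and helper names are hypothetical, and a practical implementation would rely on a tensor library instead of nested lists.

```python
import math

def conv2d_same(img, k):
    """Zero-padded 'same' 2D cross-correlation of a grid with an odd square kernel."""
    H, W, p = len(img), len(img[0]), len(k) // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for di in range(-p, p + 1):
                for dj in range(-p, p + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < H and 0 <= jj < W:
                        s += img[ii][jj] * k[di + p][dj + p]
            out[i][j] = s
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def zip_map(f, *grids):
    """Apply f element-wise across equally sized grids."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*grids)]

def convgru_step(x, h_prev, K):
    """One ConvGRU update; K maps 'xz', 'hz', 'xr', 'hr', 'xh', 'hh' to kernels."""
    z = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, K["xz"]), conv2d_same(h_prev, K["hz"]))  # update gate
    r = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, K["xr"]), conv2d_same(h_prev, K["hr"]))  # reset gate
    h_cand = zip_map(lambda a, b, g: math.tanh(a + g * b),
                     conv2d_same(x, K["xh"]), conv2d_same(h_prev, K["hh"]), r)
    # Blend the previous and candidate states with the update gate.
    return zip_map(lambda zz, hp, hc: (1.0 - zz) * hp + zz * hc, z, h_prev, h_cand)

# Sanity check with all-zero kernels: both gates equal 0.5 and the candidate
# state is 0, so the new hidden state is half of the previous one.
zero = [[0.0] * 3 for _ in range(3)]
K = {name: zero for name in ("xz", "hz", "xr", "hr", "xh", "hh")}
x = [[1.0, 2.0], [3.0, 4.0]]
h0 = [[0.2, -0.4], [0.6, 0.8]]
h1 = convgru_step(x, h0, K)
```

Because every gate is computed by convolution, each cell of the hidden state depends on its spatial neighbours as well as on its own history, which is what lets the unit couple nearby parking lots.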

External Factor Extraction and Feature Learning
Parking spaces have obvious periodic characteristics. We analysed the impact of external factors, such as weekdays, non-weekdays, hazardous weather, and normal weather, on the availability of parking spaces. External factors can affect parking events and are an important part of the model. A two-layer linear network was designed to consider the impact of external factors on VPSs. We denote by n the inputs to this component, namely the data collected from [43] on temperature, wind speed, humidity, etc., together with the weekday and non-weekday indicators. Substituting n into Equation (9), we obtain $O_l \in \mathbb{R}^{HW \times 1}$:
$$O_l = \sigma\left(w_{n,2}\, \sigma\left(w_{n,1}\, n + b_{n,1}\right) + b_{n,2}\right) \quad (9)$$
where σ is the activation function and $w_{n,i}$ and $b_{n,i}$, i = 1, 2, are the weights and biases of the i-th linear function. For feature learning, we convert the resulting output $O_l \in \mathbb{R}^{HW \times 1}$ to $O_l \in \mathbb{R}^{H \times W}$ via the Reshape function.
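A small sketch of the external-factor branch follows: two linear layers, each followed by an activation, produce a vector of length H·W that is reshaped into an H × W map. The activation (ReLU here), the layer sizes, and the weights are illustrative assumptions, since the text does not fix them.

```python
def linear(v, weights, bias):
    """Dense layer: one output per weight row, plus a bias."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]

def external_features(n, w1, b1, w2, b2, H, W, act=lambda v: max(v, 0.0)):
    """Two linear layers with an activation, then reshape to H x W."""
    hidden = [act(v) for v in linear(n, w1, b1)]
    flat = [act(v) for v in linear(hidden, w2, b2)]
    assert len(flat) == H * W, "second layer must output H * W values"
    return [flat[r * W:(r + 1) * W] for r in range(H)]

# Toy example: two external inputs mapped to a 2 x 2 feature map.
n = [1.0, 2.0]
w1, b1 = [[1, 0], [0, 1]], [0, 0]
w2, b2 = [[1, 0], [0, 1], [1, 1], [1, -1]], [0, 0, 0, 0]
o_l = external_features(n, w1, b1, w2, b2, H=2, W=2)
```

Reshaping to the grid layout gives the external features the same spatial shape as the recurrent output, so the two can be concatenated channel-wise.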
The outputs of the three ConvGRUs and the external factor extraction component are concatenated, denoted as $O_i = O_c \oplus O_l$, and fed into the BRC layer. The BRC layer is a composite function of three consecutive operations, i.e., batch normalization (BN), rectified linear activation (ReLU), and a 1 × 1 convolution (Conv). We use the BRC layer to implement feature reuse and propagation. Finally, the prediction is obtained by applying the sigmoid function.
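The BN-ReLU-Conv (1 × 1) composite followed by a sigmoid can be sketched on plain nested lists. The version below assumes a batch of one, normalizes each channel over its spatial positions without learned scale or shift, and uses a 1 × 1 convolution that mixes C input channels into a single output map; the weights and names are hypothetical.

```python
import math

def batch_norm(ch, eps=1e-5):
    """Normalize one channel to zero mean and unit variance over its cells."""
    vals = [v for row in ch for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in ch]

def brc(channels, weights, bias=0.0):
    """BN -> ReLU -> 1x1 conv (C channels to 1), followed by a sigmoid."""
    act = [[[max(v, 0.0) for v in row] for row in batch_norm(ch)] for ch in channels]
    H, W = len(act[0]), len(act[0][0])
    # A 1x1 convolution is a per-cell weighted sum across channels.
    mixed = [[bias + sum(w * a[i][j] for w, a in zip(weights, act))
              for j in range(W)] for i in range(H)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in mixed]

# Two concatenated 2 x 2 feature maps, e.g. a recurrent output and an
# external-factor map (values are made up).
fused = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
pred = brc(fused, weights=[1.0, 1.0])
```

Since the inputs are min-max normalized to [0, 1], a sigmoid output is a natural match for the prediction range.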

Experimental Setup and Evaluation Indicators
In our experiments, we select 60% of the data as the training set, 20% as the validation set, and the rest as the test set. We normalize the denoised data using Equation (10) and then feed it to the model as moving windows of length 10 that advance one step at a time. For training, the optimization algorithm is Adam [44], the learning rate is 0.01, the loss function is the MAE, the number of epochs is 32, the batch size is 32, and the number of ConvGRU layers (k) is 3. The numbers of VPSs after 5, 15, 30, 45, and 60 min are predicted accordingly. To avoid contingency, each prediction task was independently repeated 30 times, and the mean values were taken as the results. The DWT-ConvGRU-BRC model is implemented using PyTorch version 1.11.0, and the experimental equipment includes a 12th Gen Intel(R) Core(TM) i5-12600KF processor and an NVIDIA GeForce RTX 3060 Ti GPU with 16 GB memory.
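The preprocessing described here, min-max normalization followed by single-step moving windows of length 10, can be sketched as follows. The function names and the `horizon` parameter (the number of 5 min steps ahead, e.g. 12 for 60 min) are illustrative assumptions rather than the authors' code.

```python
def min_max(series):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in series]

def windows(series, length=10, horizon=1):
    """Build (inputs, target) pairs: `length` consecutive values and the
    value `horizon` steps after the window, moving one step at a time."""
    pairs = []
    for start in range(len(series) - length - horizon + 1):
        end = start + length
        pairs.append((series[start:end], series[end + horizon - 1]))
    return pairs

raw = list(range(15))              # stand-in for a denoised VPS series
scaled = min_max(raw)
one_step = windows(raw, length=10, horizon=1)    # 5 min ahead
three_step = windows(raw, length=10, horizon=3)  # 15 min ahead
```

With 5 min sampling, horizons of 1, 3, 6, 9, and 12 steps would correspond to the 5, 15, 30, 45, and 60 min prediction tasks.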
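$$x^{*} = \frac{x - \min}{\max - \min} \quad (10)$$
where max and min represent the maximum and minimum values of the sample data, respectively, and max − min represents the range. We use the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) to measure the accuracy of the predicted values. All three evaluation metrics have a range of [0, +∞) and are equal to 0 when the predicted value exactly matches the true value, i.e., a perfect model; the larger the error is, the larger the value. The definitions are shown in Equations (11)-(13).
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \quad (11)$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (12)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \quad (13)$$
where $y_i$ denotes the actual VPSs, $\hat{y}_i$ denotes the predicted VPSs, and n denotes the number of time steps.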
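For reference, the three evaluation metrics can be computed directly from their standard definitions. The sketch below uses made-up actual and predicted values; note that the MAPE is undefined when an actual value is zero, which this minimal version does not guard against.

```python
import math

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

actual = [100.0, 200.0, 400.0]     # hypothetical VPS counts
predicted = [110.0, 190.0, 400.0]  # hypothetical forecasts
scores = (mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

The RMSE penalizes large individual errors more strongly than the MAE, while the MAPE expresses the error relative to the size of each parking lot.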

Results and Analysis
Fifty experiments were carried out, and the mean was selected as the result to improve the statistical significance of the difference in precision.Table 1 shows the detailed RMSE, MAE, and MAPE data for our DWT-ConvGRU-BRC model.Figure 9 shows the effect of the DWT-ConvGRU-BRC model in the Lot1, St5, and St7 parking lots for 5, 15, 30, 45, and 60 min VPS predictions, where the x-axis represents the time interval, and the y-axis represents the number of VPSs.Our model is robust in predicting the availability of VPSs in multiple parking lots.From Section 3, we conclude that parking occupancy is affected by external factors such as weather.This is consistent with the conclusions drawn in [33][34][35]38,39].Therefore, we performed ablation experiments in Table 1 to compare the effect of the presence or absence of external factors on VPS prediction.We find that the inclusion of external factors such as temperature, wind speed, weekdays, and weekends was beneficial in improving the accuracy of the forecasts.For the RMSE, MAE, and MAPE, the forecasting models with external factors outperformed those without external factors in 40, 35, and 39 of the 55 forecasting tasks, respectively.This shows that considering external factors can improve the accuracy of VPS prediction, so it can be said that external factors have a significant impact on parking prediction.Figure 10

As some models can only predict for a single parking lot, we compared the effectiveness of these models in predicting the number of VPSs at 5, 15, 30, 45, and 60 min, using the St7 parking lot as an example. The LSTM and GRU models can effectively extract temporal information from nonlinear time series data, but they fail to consider the spatial correlation between parking lots within a region. The ConvGRU model outperforms the ConvLSTM model in terms of running speed while capturing spatial-temporal correlations. After wavelet noise reduction, the forecasts improved significantly, and external factors, such as weather, improved the accuracy of the long-term forecasts. As illustrated in Table 2, the proposed DWT-ConvGRU-BRC model is significantly superior to the benchmark methods.

To illustrate the ability of the proposed model to predict the actual number of VPSs, Table 3 presents a detailed comparison between the actual and predicted numbers of VPSs at the St7 parking lot (from 10:00 to 11:00, 6 May 2021) output by the proposed DWT-ConvGRU-BRC model. We also calculate the MAE, MAPE, and RMSE values for this time period. The output values of the DWT-ConvGRU-BRC model are very close to the real values.

In addition, model performance evaluation should consider both prediction accuracy and time consumption. Table 4 compares the running times per round of the LSTM, GRU, ConvGRU, ConvLSTM, dConvLSTM-DN, DWT-ConvGRU, and DWT-ConvGRU-BRC models. Because the DWT-ConvGRU-BRC model includes wavelet noise reduction and external factors, it runs slightly slower than the models that omit them. Compared to the dConvLSTM-DN model proposed in [36], however, our model shows a significant improvement in running speed. In conclusion, our proposed model improves the effectiveness of predictions and also runs faster than the dConvLSTM-DN model.
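The wavelet noise reduction mentioned in the comparison above can be illustrated with a small sketch. The wavelet family, decomposition level, and thresholding rule used by the authors are not stated in this section, so the code below assumes a single-level Haar transform with soft thresholding of the detail coefficients. All function names and sample values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar DWT of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the sequence from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Denoise by shrinking the detail (high-frequency) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("this sketch expects an even-length signal")
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, threshold))


# Hypothetical VPS readings with small fluctuations between neighboring samples.
readings = [50, 52, 47, 49]
print(denoise(readings, threshold=0.5))
```

With a threshold of zero the signal is reconstructed unchanged, and with a threshold larger than every detail coefficient each pair of neighboring samples collapses to its mean, which shows how the threshold trades smoothness against fidelity.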

Conclusions
This paper proposes a deep learning model for occupancy prediction of multiple parking lots. The model incorporates DWT, ConvGRU, and BRC modules and has the flexibility to take multiple spatial-temporal structured data sources as inputs. The performance of the model is evaluated using a case study of 11 public parking lots in Santa Monica, California, USA, in which VPS data, weather data, and weekday and weekend data are used. The experimental results show that our model achieves high accuracy, with MAPEs of less than 2% for short-term predictions and less than 4% for long-term predictions. The DWT-ConvGRU-BRC model significantly outperforms the baseline LSTM and GRU methods. In general, we found that noise reduction of VPS data using a DWT can improve prediction accuracy and that combining weather information and weekday and weekend information can improve the performance of long-term predictions of parking occupancy.
Prediction of available parking spaces is an integral part of parking guidance information systems. Available parking space predictions can improve the effectiveness of parking guidance system information, which can help drivers plan driving routes and find vacant parking spaces. Furthermore, with a reliable parking prediction algorithm, dynamic parking pricing can be applied to control the parking demand of each parking lot, thereby assisting traffic planning and management and reducing energy consumption and traffic congestion. In future work, we will concentrate on further improving the adaptability of the model by considering other external influences, such as POI information, traffic incident data, and traffic flow data. Future research will also consider how the running time of the model can be optimized while ensuring prediction accuracy.

Figure 1. Distribution of parking lots in the region.

Figure 5. Wavelet selection based on normalized mutual information.

$x_t$ and $h_{t-1}$ are the input and output vectors of the current time point, respectively, and $W_{z\cdot}$, $W_{r\cdot}$, and $W_{h\cdot}$ are the convolutional kernels for each gate. In addition, $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions, respectively, $*$ is the convolution operation, and $\odot$ denotes the Hadamard product.
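The roles of the gates and operators named above can be made concrete with a short sketch of one ConvGRU update. The gate equations themselves do not appear in this section, so the code assumes the commonly used ConvGRU formulation (update gate, reset gate, candidate state, and convex combination) without bias terms. It also treats the parking lots as a one-dimensional spatial axis with length-3 kernels. These choices, and all names, are illustrative assumptions rather than the authors' implementation.

```python
import math


def conv1d_same(x, kernel):
    """Slide an odd-length kernel over x with zero padding (output length = input length).

    Like most deep learning libraries, this computes a cross-correlation.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x_t, h_prev, W):
    """One ConvGRU update. W maps 'zx','zh','rx','rh','hx','hh' to 1-D kernels."""
    # Update gate z and reset gate r, each from the input and the previous state.
    z = [sigmoid(a + b) for a, b in
         zip(conv1d_same(x_t, W["zx"]), conv1d_same(h_prev, W["zh"]))]
    r = [sigmoid(a + b) for a, b in
         zip(conv1d_same(x_t, W["rx"]), conv1d_same(h_prev, W["rh"]))]
    # Hadamard product of the reset gate with the previous state.
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate state.
    h_cand = [math.tanh(a + b) for a, b in
              zip(conv1d_same(x_t, W["hx"]), conv1d_same(gated, W["hh"]))]
    # Blend the previous state and the candidate using the update gate.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


# Hypothetical normalized availability for three neighboring lots.
x_t = [0.2, 0.5, 0.9]
h_prev = [0.4, -0.2, 0.6]
kernels = {name: [0.1, 0.3, 0.1] for name in ("zx", "zh", "rx", "rh", "hx", "hh")}
print(conv_gru_step(x_t, h_prev, kernels))
```

Because each kernel spans neighboring positions, the state of one lot depends on the inputs and states of adjacent lots, which is how a convolutional recurrent cell captures spatial as well as temporal correlation.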

Figure 8. Internal structure of the ConvGRU cell.

It can be seen from Figure 10, which shows the predictions of our proposed DWT-ConvGRU-BRC model for each parking lot, that the model predicts the number of VPSs in multiple parking lots in a timely manner, is relatively robust to changes in the spatial-temporal correlation of VPSs, and achieves accurate predictions.
$H(X \mid Y)$ is the conditional entropy of $X$ given $Y$.

Table 1. Performance evaluation in terms of the MAE, RMSE, and MAPE.

Table 2. Comparison of operating results.

Table 3. Comparisons between the real and predicted numbers of VPSs at the St7 parking lot.

Table 4. GPU runtimes of different prediction methods.