Dataset: Mobility Patterns of a Coastal Area Using Traffic Classification Radars

Abstract: Monitoring road traffic is extremely important given the possibilities it opens up for studying the behavior of road users and road design and planning problems, and because it can be used to predict future traffic. Especially on highways that connect beaches and larger urban areas, traffic is characterized by peaks that are highly dependent on weather conditions and rest periods. This paper describes a dataset of mobility patterns of a coastal area in the Aveiro region, Portugal, fully covered with traffic classification radars, over a two-year period. The sensing infrastructure was deployed in the scope of the PASMO project, an open living lab for co-operative intelligent transportation systems. The data gathered include the speed of the detected objects, their position, and their type (heavy vehicle, light vehicle, two-wheeler, and pedestrian). The dataset includes 74,305 records, corresponding to the aggregation of road information at 10 min intervals. A brief analysis of the dataset shows the highly dynamic nature of traffic during the two-year period. In addition, the existence of meteorological records from nearby stations, and the recording of daily data on COVID-19 infections, make it possible to cross-reference information and study the influence of weather conditions and infections on traffic behavior.


Introduction
Co-operative Intelligent Transportation Systems (C-ITS) technologies have attracted considerable R&D effort during the last decades and are becoming more pervasive in the context of Smart Roads, not only due to the deployment of connected roadside infrastructures, but especially for their potential to provide information that can help drivers and other road users. C-ITS are equipped with traffic sensors and communications platforms able to extract useful information from road vehicles and transmit it to nearby vehicles and/or to traffic management centers.
The connected and sensor-equipped roadside infrastructures can assist real-time applications, such as route planning, traffic analysis, and statistics collection applications for highway management or other third parties' purposes. As an example, a highway traffic radar can support autonomous vehicles' maneuvers, but it can also count the number of vehicles entering the highway and monitor their behavior, and therefore, it can be used to assist traffic control prediction and congestion control decisions.
Data 2022, 7, 97
Traffic congestion is one of the main challenges faced today by drivers, motorway operators, and city managers, because it makes daily travel more complex, with negative impacts on the environment, time, and monetary costs for the users. Such a problem can be mitigated by providing to the driver traffic flow predictions, such as the probability of traffic congestion, helping them to avoid undesirable events by rerouting their travels, choosing another means of transport, or changing their trip times. On the other hand, traffic flow predictions can be beneficial for road and city operators in the implementation of traffic planning and management strategies.
The platform used to gather monitoring data was developed under the scope of the PASMO project (An Open Platform for the development and experimentation of Mobility Solutions) [1], providing connectivity to the project sensors and the usage of telecommunications services. This project was developed by the Institute of Telecommunications at the University of Aveiro, with the purpose of providing solutions for intelligent mobility problems and an open platform for researchers to develop their own ideas. The project platform [2] includes a set of services related to road mobility that use a set of physical sensors (e.g., traffic classification radar, parking, and meteorological sensors) and communication networks to monitor traffic behavior. The devices composing the core of the platform were installed in the municipalities of Ílhavo and Aveiro, in Portugal, at strategic locations to capture the most valuable information about the environment around the beaches of Barra and Costa Nova, as shown by the map in Figure 1, drawn at an approximate scale of 1/10,000. The main local city is 10 km away toward the right of the map. The two covered places are both summer resorts, where people go to the beach, and places of residence that act as urban dormitories for people who work in nearby cities. Vacationers and residents have different travel patterns, and their impact on the overall traffic differs from day to day and varies with the time of year.
Although a vast amount of work on data analysis and traffic forecasting has been published, namely, [4][5][6][7][8][9][10][11][12][13], the underlying datasets are not in the public domain. Publicly available dataset repositories such as OpenDataMonitor, Kaggle, and MDPI Data allow one to find several datasets related to traffic accidents, but none representing highway traffic behavior, especially in Portugal.
A search over the IEEE DataPort repository returned two references [14,15], published by Cruz et al., related to WiFi data [14] and V2V communications [15] collected by driving vehicles in the city of Porto. To the best of our knowledge, there is no dataset that describes the traffic flow in a shoreline area and makes it possible to correlate beach travel habits with weather information. This makes the present dataset unique.
From the set of devices, we selected three traffic classification radars [16][17][18] covering all entries of the beaches of Barra and Costa Nova, whose data were collected from 2019 to 2021. This paper details the corresponding dataset, aggregating traffic data with a time granularity of 10 min, detailing the number of detected vehicles, including 2-wheelers, and their maximum, minimum, and median speed, in both directions. The goal is to enable further and better work by researchers. We also believe that the acquisition and processing methods can be helpful for others responsible for acquiring, cleaning, and processing IoT data from smart cities.
The paper continues in Section 2 with a description of the dataset, and Section 3 details the data collection and preparation methodologies. Section 4 concludes the paper, and some additional information is provided in the appendices.

Data Description
The original telemetry dataset contains more than 170 million records (170,158,409) covering the years 2019, 2020, and 2021, and is composed of parking sensor and radar data. Section 3 presents the steps used to transform the original data, including the granularity and speed calculations related to the radar stations; the parking sensor data were not included in the present dataset.
Each record stores the minimal, maximal, and mean speed of the objects approaching and moving away from each radar station. Furthermore, the traffic flow (TF) values for the two regions (Barra and Costa Nova) are given by the TF_Barra and TF_Costa attributes. The units of Speed and Traffic Flow are meters per second (m/s) and number of objects per ten minutes (#obj/10 min), respectively.
TF_Barra and TF_Costa can be positive or negative. A positive TF value indicates a net increase in traffic for that region, and a negative value indicates a net reduction. Figure 2 presents the dataset; the x-axis represents time and the y-axis the traffic flow values (Barra and Costa Nova).
The detailed statistics of the dataset are presented in Table 1. Mean, Std (standard deviation), Min (minimum), 25% (first quartile), 50% (second quartile), 75% (third quartile), and Max (maximum) refer to the Traffic Flow values for Barra and Costa Nova, and complete the description of the final dataset.
Figure 4 shows the cumulative traffic count over the course of an entire day which, as explained in Section 3, is calculated as the cumulative sum of the difference between vehicles entering and exiting, shifted so that the minimum is zero. The values, by themselves, do not provide much information, since it is not possible to know exactly how many cars were in each place. Even so, Figure 4 allows one to follow the evolution of the count over the day and, for instance, to notice that the lowest number of vehicles in Barra was in the morning, while the highest was in the afternoon. Furthermore, the count for Barra goes from 476 to 0 and, since these values were recorded in the morning, we can infer that the data represent people leaving home to go to work. In Costa Nova, the variation over the same period is smaller; the count falls from 101 to 6.
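The cumulative count described for Figure 4 can be illustrated with a minimal sketch. The function name `cumulative_count` and the sample values are hypothetical and are not taken from the dataset; the input is assumed to be a sequence of per-interval differences between vehicles entering and exiting a region.

```python
from itertools import accumulate


def cumulative_count(net_flow):
    """Running sum of per-interval net flow, shifted so that its minimum is zero."""
    running = list(accumulate(net_flow))
    if not running:
        return []
    lowest = min(running)
    # Subtracting the minimum moves the curve so that its lowest point is zero.
    return [value - lowest for value in running]


# Illustrative values only: two intervals of net departures, then net arrivals.
print(cumulative_count([-5, -3, 4, 6, -2]))
```

Because the true number of vehicles present at the start of the day is unknown, the shifted curve only conveys how the count evolves, not its absolute level.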

Data Preparation Methods
The radar used in PASMO was a UMRR-0C Type 42, produced by smartmicro, that operates in the 24 GHz band for multilane, multiobject traffic tracking and is capable of measuring several parameters (such as range, angle, and radial speed) of moving targets. It has a bandwidth of 250 MHz and a maximum transmitted power of 20 dBm, and it uses a multiple Frequency Modulated Continuous Wave (FMCW) technique to acquire the relative speed and range of each target. The Type 42 integrated antenna array allows for long-range and wide horizontal coverage. It also integrates tracking algorithms that can track up to 126 moving targets simultaneously, regardless of object speed, distance to the sensor, or azimuth angle. Table 2 summarizes the main characteristics of the radar used, and Correia [19] further details the data sensorization and gathering process.
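As a rough, theoretical indication of what this bandwidth implies, the textbook range resolution of an FMCW radar sweeping a bandwidth B is c/(2B). Assuming, purely for illustration, that the full 250 MHz were swept (the manufacturer's specifications remain the reference for the actual sensor):

$$\Delta R = \frac{c}{2B} \approx \frac{3 \times 10^{8}\ \mathrm{m/s}}{2 \times 250 \times 10^{6}\ \mathrm{Hz}} = 0.6\ \mathrm{m}$$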

The original telemetry dataset contains more than 170 million records (170,158,409) over 3 years, and is composed of parking sensor and radar data. The dataset is a sample from the entire PASMO platform data, obtained according to a specific method. Figure 5 presents the complete method designed to prepare the final dataset, starting with the data selection applied to the parking and radar data. For the goal of this dataset, only radar data were selected, resulting in 155,432,185 records. Table 3 presents the attributes of the original data. Each record is produced at a sampling interval of 100 milliseconds and contains the identification of the moving object (id), the co-ordinates of the radar, the timestamp, and the x- and y-axis speed components. The dataset was aggregated considering a ten-minute granularity.

Table 3. Attributes of the original data.
Attribute    Content
id           object id
timestamp    record timestamp
radar_lat    latitude-radar co-ordinate
radar_lon    longitude-radar co-ordinate
xSpeed       x-axis speed component
ySpeed       y-axis speed component

Before adjusting the granularity, other derived radar data were produced: the year, month, day, hour, weekday, and minute attributes were calculated from the timestamp. Furthermore, the xSpeed and ySpeed attributes are combined into the Speed measure. Negative values of Speed represent an object approaching the radar, and positive values represent an object moving away from it; the in_out logical attribute stores this direction. These steps are represented in Figure 5.
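The derivation step can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the timestamp is assumed to be in Unix epoch seconds and interpreted in UTC, Speed is assumed to be the Euclidean norm of the two components, and the sign of xSpeed is assumed to indicate the direction. The function name `derive_attributes` is hypothetical.

```python
import math
from datetime import datetime, timezone


def derive_attributes(record):
    """Add calendar fields, a signed Speed, and the in_out flag to one raw record."""
    # Assumption: timestamp is Unix epoch seconds, interpreted in UTC.
    ts = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    # Assumption: Speed is the magnitude of the (xSpeed, ySpeed) vector.
    magnitude = math.hypot(record["xSpeed"], record["ySpeed"])
    # Assumption: a negative xSpeed means the object is approaching the radar.
    approaching = record["xSpeed"] < 0
    return {
        **record,
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "weekday": ts.weekday(),  # Monday is 0
        "minute": ts.minute,
        # Negative Speed for approaching objects, positive for departing ones.
        "Speed": -magnitude if approaching else magnitude,
        "in_out": 1 if approaching else 0,
    }
```

The encoding in_out = 1 for approaching objects follows the convention used later in the text for obj_count_A.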
Using the identification, speed, and direction of the moving object, it is possible to compute the number of vehicles that have passed, as well as their speed (maximal, mean, and minimal) at the radar level by year, month, day, hour, and minute (ten-minute intervals). Table 4 presents the resulting format of the processed radar data; each record represents measures aggregated over a ten-minute interval within the hour. Figure 6 depicts the location of the radars. The first one is before entering the bridge, the second one is in the interconnection segment between Barra and Costa Nova, and the third one is at the urban limit to the south of Costa Nova.
Figure 6. Radar relative velocities [3] representation.
This work presents a dataset that stores the traffic flow in Barra and Costa Nova. Therefore, two new measures were computed to represent the traffic flow in the two regions: TF_Barra and TF_Costa.
In the definitions of TF_Barra and TF_Costa, QAR_i is the quantity of objects approaching radar i, and QDR_i is the quantity of objects moving away from radar i. Both are computed from the aggregated records, with i as the identification of the radar, j as the interval minute (0, 10, ..., 50), obj_count_A as the value of the count_obj attribute where in_out = 1, and obj_count_D as the value of the count_obj attribute where in_out = 0. Therefore, TF_Barra and TF_Costa can take positive or negative values. A positive TF value represents an increase in the traffic flow for that region, and a negative value represents a reduction. Additionally, new speed measures were computed for the approaching and departing movements at each radar.

$$\mathrm{Speed\_med\_AR}_{i,j} = \frac{1}{n} \sum_{k=1}^{n} \mathrm{Speed\_med}_{i,j,k} \tag{5}$$

with k indexing the records and n as the quantity of records representing the Speed_med attribute with in_out = 1.

$$\mathrm{Speed\_med\_DR}_{i,j} = \frac{1}{n} \sum_{k=1}^{n} \mathrm{Speed\_med}_{i,j,k} \tag{6}$$

with i as the identification of the radar, j as the interval minute, and n as the quantity of records representing the Speed_med attribute with in_out = 0.

$$\mathrm{Speed\_max\_AR}_{i,j} = \max \mathrm{Speed\_max}_{i,j} \tag{7}$$

$$\mathrm{Speed\_max\_DR}_{i,j} = \max \mathrm{Speed\_max}_{i,j} \tag{8}$$

$$\mathrm{Speed\_min\_AR}_{i,j} = \min \mathrm{Speed\_min}_{i,j} \tag{9}$$

$$\mathrm{Speed\_min\_DR}_{i,j} = \min \mathrm{Speed\_min}_{i,j} \tag{10}$$

In Equations (5)-(10), i represents the identification of the radar and j the interval minute. Equations (7) and (9) consider the records with in_out = 1, while Equations (8) and (10) consider those with in_out = 0. Table 5 presents the final processed radar data.
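The ten-minute aggregation and the per-direction speed measures referred to in Equations (5)-(10) can be sketched as below. This is a hedged illustration rather than the authors' pipeline. The `radar` key is a hypothetical radar identifier (the original records carry radar_lat and radar_lon instead), vehicles are assumed to be counted as distinct object ids per interval, speeds are assumed to be aggregated as magnitudes, and the function names are invented for the example.

```python
from collections import defaultdict


def aggregate_ten_minutes(records):
    """Aggregate derived radar records into ten-minute rows per radar and direction."""
    groups = defaultdict(list)
    for r in records:
        slot = (r["minute"] // 10) * 10  # interval minute: 0, 10, ..., 50
        key = (r["radar"], r["year"], r["month"], r["day"],
               r["hour"], slot, r["in_out"])
        groups[key].append(r)

    rows = []
    for key in sorted(groups):
        members = groups[key]
        radar, year, month, day, hour, slot, in_out = key
        # Assumption: speed statistics use magnitudes; direction is kept in in_out.
        speeds = [abs(m["Speed"]) for m in members]
        rows.append({
            "radar": radar, "year": year, "month": month, "day": day,
            "hour": hour, "minute": slot, "in_out": in_out,
            # Assumption: one vehicle per distinct object id within the interval.
            "count_obj": len({m["id"] for m in members}),
            "Speed_max": max(speeds),
            "Speed_med": sum(speeds) / len(speeds),
            "Speed_min": min(speeds),
        })
    return rows


def direction_measures(rows, in_out, suffix):
    """Combine the rows of one direction; suffix is 'AR' (in_out=1) or 'DR' (in_out=0)."""
    selected = [r for r in rows if r["in_out"] == in_out]
    if not selected:
        return {}
    n = len(selected)
    return {
        f"Speed_med_{suffix}": sum(r["Speed_med"] for r in selected) / n,
        f"Speed_max_{suffix}": max(r["Speed_max"] for r in selected),
        f"Speed_min_{suffix}": min(r["Speed_min"] for r in selected),
    }
```

Calling `direction_measures(rows, 1, "AR")` on the rows of a single radar and interval yields the approaching-side measures, and `direction_measures(rows, 0, "DR")` the departing-side ones. Because the radar samples every 100 milliseconds, the same object id appears in many raw records, which is why the sketch counts distinct ids rather than records.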

Conclusions
The present dataset was obtained through the aggregation of telemetry data from the PASMO project, and it was cleaned and aggregated in order to obtain a summary of the passages at 10 min intervals. The aggregation process reduced the 155,432,185 selected radar records to 74,305 records that allow us to characterize the traffic in the beach area in the years 2019, 2020, and 2021. In addition to allowing the analysis of traffic distribution and the forecast of future traffic, the present dataset can be correlated with national/regional holidays (Table A2 in Appendix B) and a set of events related to the COVID-19 pandemic [20] and the circulation restrictions that occurred in 2020 and 2021 (Table A3 in Appendix C). Furthermore, it can be correlated with meteorological data through the use of an adequate dataset, since in the case of bathing areas, the meteorological factor has a strong impact on road travel. This dataset can be used for a wide spectrum of situations related to smart cities and vehicular traffic. We highlight the identification of behaviors, trends, and diurnal mobility patterns, and the training of modelling and predictive algorithms.
In the near future, we plan to clean and aggregate the data produced by the remaining radars installed within the scope of the PASMO project and try to create a dataset that will allow us to characterize the pattern of entrances and exits in the city of Aveiro.

Acknowledgments:
The authors would like to acknowledge all the work carried out within the scope of the PASMO project (https://www.pasmo.pt (accessed on 23 June 2022)), which enabled the deployment of the roadside infrastructure setup responsible for producing the available dataset. This past project, which ran between 15/05/2017 and 14/05/2020, was supported by the European Regional Development Fund (FEDER), through the Regional Operational Programme of Centre (CENTRO 2020) of the Portugal 2020 framework [Project PASMO with Nr. 000008 (CENTRO-01-0246-FEDER-000008)].

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix B
A number of dates can have a special impact on traffic: national or regional holidays, which allow people to go to the beaches, and festive dates in both villages (Barra and Costa Nova), whose festivities lead people to travel to these places. Table A2 lists the festive dates and the holidays from 2019 to 2021.

Appendix C
There are two factors with an impact on traffic accessing the beaches: the evolution in the number of COVID-19 infections [4], and a set of dates on which, also due to the epidemic, a curfew was decreed and circulation between municipalities was prohibited. Table A3 summarizes the set of dates related to driving bans.

Table A3. Dates related to driving bans.
Date                                  Event
18 May 2020                           Opening of restaurants and cafes and beginning of face-to-face classes in the 11th and 12th grade. Day care centers were also permitted to start opening on that date.
30 July 2020                          Opening of bars and nightclubs.
15 September 2020                     Back to school and face-to-face work.
29 October 2020 to 2 November 2020    Ban on movement between municipalities on the holiday of the deceased.
October 2020                          Mandatory curfew between 23:00 and 05:00 in the 121 most-affected municipalities. On weekends, the curfew in these municipalities starts at 13:00.