Estimating Freeway Level-of-Service Using Crowdsourced Data

Abstract: In traffic operations, the aim of transportation agencies and researchers is typically to reduce congestion and improve safety. To attain these goals, agencies need continuous and accurate information about the traffic situation. Level-of-Service (LOS) is a beneficial index of traffic operations used to monitor freeways. The Highway Capacity Manual (HCM) provides analytical methods to assess LOS based on traffic density and highway characteristics. Generally, obtaining reliable density data on every road in large networks using traditional fixed location sensors and cameras is expensive and otherwise unrealistic. Traditional intelligent transportation system facilities are typically limited to major urban areas in different states. Crowdsourced data are an emerging, low-cost solution that can potentially improve safety and operations. This study incorporates crowdsourced data provided by Waze to propose an algorithm for LOS assessment on an hourly basis. The proposed algorithm exploits various features from big data (crowdsourced Waze user alerts and speed/travel time variation) to perform LOS classification using machine learning models. Three categories of model inputs are introduced: basic statistical measures of speed; travel time reliability measures; and the number of hourly Waze alerts. Data collected from fixed location sensors were used to calculate ground truth LOS. The results reveal that using Waze crowdsourced alerts can improve the LOS estimation accuracy by about 10% (accuracy = 0.93, Kappa = 0.83). The proposed method was also tested and confirmed by using data from after coronavirus disease 2019 (COVID-19) with severe traffic breakdown due to a stay-at-home policy. The proposed method is extendible for freeways in other locations. The results of this research provide transportation agencies with a LOS method based on crowdsourced data on different freeway segments, regardless of the availability of traditional fixed location sensors.


Introduction
Intelligent transportation systems (ITS) are essential for assessing the state of traffic. ITS traffic measurements can be used in different applications, such as traffic operations, road work planning, assessing traffic queues, and congestion management. The United States Highway Capacity Manual (HCM) defines six Levels-of-Service (LOS) for estimating the traffic performance and state. The HCM provides analytical methods for assessing LOS from traffic density and highway characteristics [1]. The traffic density, speed, and flow are key components of LOS assessment [1][2][3]. Departments of Transportation (DOTs) and transportation agencies usually want real-time or historical hourly traffic status and LOS data for different freeway segments.
Traditionally, traffic data (speed, travel time, flow, and density) are collected by a variety of fixed location sensors, such as loop detectors, remote traffic microwave sensors (RTMS), magnetic sensors, laser sensors, video images, and License Plate Recognition (LPR) systems [4,5]. Fixed location data collection methods are typically expensive and have a limited network coverage. In recent years, data-driven ITS has led to multisource, high-performance, and powerful solutions in transportation systems [6]. Data collection

Traffic Status and LOS Assessment Methods
Most transportation agencies and DOTs focus on the density or volume-to-capacity (V/C) ratio to assess LOS. The HCM uses the density to define LOS for freeways and multilane highways. It also defines LOS for intersections using a metric called control delay [1,30]. Traditionally, studies use one or a combination [31][32][33] of parameters, such as the speed [34], flow [32,35], and density [36,37], to explain the traffic status and LOS. Previous studies have used different data sources, such as sensor data [38], probe vehicles [32,39], camera images and videos [40], CVs [2,41], and simulation [2,35,37,41]. In terms of the methodology, statistical modeling [37], Neural Networks [38,39], Kalman Filters [39], Image Processing [42], and Machine Learning [32,40] have been widely used.

Travel Time Reliability
Previous studies have captured useful information from speed and travel time variability and reliability to determine the traffic status and performance. These studies used statistical measures, such as the average, standard deviation, percentiles, and range [43][44][45]. Additionally, the relation between the speed deviation, travel time variability, planning time index (PTI), and buffer time index (BTI) with the V/C ratio has been explored in prior literature [43,46].

Alternative LOS Methods
The Strategic Highway Research Program 2 (SHRP2) Reliability Project L08 discussed supplemental methods for LOS measurement. This project used a density-based definition of LOS to form the distribution of LOS and presented a distribution instead of a single value for LOS [47]. This study proposed an innovative approach for LOS based on travel time reliability perspectives. The travel speed range, the most restrictive condition, and the travel time value were introduced in this project [47]. Travel time reliability and variability are measures of service quality [48].
Pulugurtha and Imran (2020) and Kodupuganti and Pulugurtha (2019) modeled the LOS of freeways and urban links using travel time variability indicators such as the planning time index (PTI), buffer time index (BTI), average travel time, and 95th percentile. They suggested a threshold travel time reliability to assess LOS [49,50]. Singh et al. (2019) used Wi-Fi probe data to develop LOS thresholds based on travel time reliability and variability indices [30]. In a different approach, Altinatasi et al. (2016) used the average speed of Floating Car Data (FCD) to quantify LOS [34]. Moreover, Khan, Dey, and Chowdhury (2017) used simulation and artificial intelligence to assess LOS based on different CV penetration rates [2]. Table 1 presents the most relevant studies that have proposed alternative methods for LOS assessment.

Waze Data
Crowdsourcing enables researchers to use big data collected from road users, probe vehicles, bicycles, and pedestrians [21,51]. This research uses Waze crowdsourced alerts and speed/travel time data as the primary crowdsourced data source. The company Waze analyzes the app users' location to provide speed and travel time data. Waze also provides different event reports (congestion, incidents, severe weather, and road construction). Prior studies have explored and verified Waze incident report and travel time data to assess their reliability and coverage [20,29,52,53,54]. A recent study by Li et al. showed that Waze incident alerts are spatially correlated with police crash reports (PCR) and that Waze provides a broader coverage than PCR [25]. Several previous studies have also used Waze alerts data for applications such as accident clustering [55], safety hotspot detection [25], incident detection [22], and improving dynamic traffic lights [56]. A more recent study also verified the quality of Waze speed data on surface streets [21]. Table 2 summarizes prior studies using Waze data.

Gap in the Literature
As previously discussed, HCM density-based LOS has received the most attention in prior literature. Moreover, some studies used travel time and speed variability for determining LOS. However, crowdsourced data have not been used to determine LOS. This study addresses this gap by integrating crowdsourced data (Waze incident reports and speed data) into LOS assessment. This study's results can help agencies quantify LOS for different segments without installing new fixed location equipment.

Data
This part describes the primary datasets used in this study. This study used Waze speed/travel time and Waze alert data for the LOS analysis. In the following, the Waze speed/travel time, Waze crowdsourced alerts, and fixed location data will be presented. Then, the study area will be introduced.

Waze Speed and Travel Time Data
The travel time and traffic speed for specific roadway segments are data sources that Waze shares with partners through the Waze for Cities (WFC) program. Waze obtains app users' kinematic information in specified segments to calculate and report the speed and travel time. If no user is passing in that time interval or segment, it reports historical speed and travel time data. Waze implements a tool called "traffic view" that allows transportation agencies and DOTs to specify a list of road segments. Authorized users can add links based on their priority or needs to the watch-list. Subsequently, real-time travel time/speed data for the predefined road segments are available at a one-minute level. Authorized users can use these data in real-time or archive them in a JavaScript Object Notation (JSON) format for further analysis. The archived JSON file for each time interval includes travel time, segment length, and geospatial information for all predefined segments in that time interval.
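As a minimal sketch of how one archived record might be turned into segment speeds, the snippet below divides the segment length by the travel time. The field names (`segments`, `name`, `travel_time_s`, `length_m`) and their units are hypothetical placeholders chosen for illustration; the actual feed schema is not described in this paper and may differ.

```python
import json

METERS_PER_MILE = 1609.344

def segment_speeds_mph(archive_text):
    """Return {segment name: speed in mph} for one archived time interval.

    Assumes a hypothetical layout: a top-level "segments" list whose items
    carry "name", "travel_time_s" (seconds), and "length_m" (meters).
    """
    record = json.loads(archive_text)
    speeds = {}
    for seg in record["segments"]:
        if seg["travel_time_s"] <= 0:
            continue  # skip records without a usable travel time
        hours = seg["travel_time_s"] / 3600
        miles = seg["length_m"] / METERS_PER_MILE
        speeds[seg["name"]] = miles / hours
    return speeds

# Illustrative record: a 1.5-mile segment traversed in 90 seconds.
sample = json.dumps({
    "segments": [
        {"name": "I-40 WB MM 385", "travel_time_s": 90, "length_m": 2414.016}
    ]
})
speeds = segment_speeds_mph(sample)
```

Applying such a function to each one-minute archive would yield the per-minute speed series from which hourly measures can later be aggregated.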

Waze Crowdsourced Alert Data
User reports, referred to as alerts, are another valuable crowdsourced data source that Waze provides to partners. Waze alerts can be used in different analyses, such as incident detection, hot spot clustering, and end of queue prediction. Waze users can report predefined incident types in the Waze App while traveling. These alerts include accidents (major or minor), traffic jams (heavy, moderate, standstill, or light), hazards (severe weather, stopped cars, or road potholes), construction, and closed roads. Users can also verify existing reports on the road. Waze shares all users' reports through the WFC program. Waze alert data include the incident unique ID, time, spatial coordinates, direction, reliability, and confidence level of the reported alert. Waze partners can use real-time alerts or archive them in an Extensible Markup Language (XML) format for their analysis. This study had access to Waze alerts for the state of Tennessee.

Fixed Location Data
The Tennessee Department of Transportation (TDOT) uses Radio Data System (RDS) sensor data, which provide traffic information such as the traffic count, speed, and occupancy in 30-s time intervals. RDS stations are located on freeways close to four major cities in Tennessee: Nashville, Memphis, Knoxville, and Chattanooga. This study uses the traffic volume (flow) from RDS data and traffic speed from Waze to calculate the density and LOS. The estimated LOS is used as ground truth data. This will be elaborated on in the methodology section.

Study Time and Area
To quantify hourly data and assess the LOS for freeways, a study area was designated in Knoxville, Tennessee. A segment on the Interstate 40 (I-40) highway at westbound mile marker 385 was selected (Figure 1). The study segment length is about 1.5 miles (~2.4 km). The speed limit for this segment is 65 mph (~105 km/h). This location was selected for two reasons: (1) the variability of traffic and LOS during the hours of the day, and (2) the availability of roadway sensor data (flow) to calculate different ground truth LOS.
One month of data, from 1 October 2019 to 31 October 2019 (744 h), was selected to train the methodology. The world faced a significant challenge in 2020 from the coronavirus disease 2019 (COVID-19) pandemic [57]. Stay-at-home is known to be an effective policy [28] for preventing the spread of COVID-19 in the US, and it led to a major breakdown in mobility in March and April 2020. Traffic had recovered by about 90% in August 2020. Therefore, two months of data, consisting of 16 March 2020 to 15 April 2020 and 1 August 2020 to 31 August 2020 (overall 1488 h), were collected to test the final method. These two months were selected to test the method in both a normal and an abnormal situation.

Methodology
The methodology used in this paper combines raw crowdsourced speed and user report data to obtain the hourly LOS-based traffic status. This methodology uses the speed variation, travel time reliability, and user alerts in the selected segment to define measures of traffic conditions. Here, the Waze speed/travel time and crowdsourced alerts are used as the primary data source in the study, which will be elaborated on in the following sections. Unlike some previous studies, this method does not solely depend on the average speed [34] or density.
This section provides more details about the proposed algorithm of this study. As shown in the framework of the study (Figure 2), the different steps of the proposed method are as follows:

• Step 1: Data collection, which includes archiving Waze data and traditional fixed location sensor data, as well as preprocessing and normalization;
• Step 2: Extracting model inputs, which include statistical measures, travel time performance measures, and crowdsourced Waze alerts;
• Step 3: Calculating ground truth LOS using fixed location sensors, and labeling observations of Waze input data with the corresponding ground truth data;
• Step 4: LOS assessment, by applying different machine learning methods. This part includes feature selection, cross validation, and selecting the preferred method.

Step 1: Data Collection
Waze continuously generates a massive amount of data. The first step in such a study is to archive Waze speed/travel time and Waze alert data. A Python script was implemented to capture crowdsourced alert, speed, and travel time data at 1-min time intervals. Working with real-world raw data can present challenges, such as missing values or noise. In the next step, data were preprocessed by cleaning and removing possible errors; missing values and outliers were removed or imputed. Next, RDS traffic volume data were collected to calculate the hourly traffic flow and LOS ground truth, which will be elaborated on in Step 3.

Step 2: Model Inputs
As explained, previous studies have explored the variation of speed/travel time to capture LOS. This study combined different speed and travel time variation indexes with crowdsourced data to assess LOS. Multiple indices were calculated as the inputs of the classification model. This paper divides these indicators into three categories, as follows. Each index will be elaborated on in the following paragraphs.

• Basic statistical measures, including the average speed, standard deviation, range, coefficient of variation, standard error, percentiles (25th, 50th, and 90th), and interquartile range;
• Travel time performance measures, including the travel time index (TTI), planning time index (PTI), and buffer time index (BTI);
• Crowdsourced data, including the number of users' accident, jam, and hazard reports in the Waze alerts data.

Basic Statistical Measures
Pertaining to speed variation, different statistical measures were considered. As discussed, speed variation has been considered in prior studies [2,34,47,48]. All these measures were captured and measured during each period (in this study, hourly). Table 3 provides the different statistical measures used in this study.
In Table 3, $v_i$ is the speed and $n$ is the number of observations in each time interval. For the $k$th percentile, the observations are ranked from smallest to largest and the value at index $\frac{k}{100}(n + 1)$ is taken. The interquartile range is $\mathrm{IQR} = Q_3 - Q_1$, where $Q_3$ is the 75th percentile and $Q_1$ is the 25th percentile of $v_i$.
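The basic statistical measures listed above can be sketched in a few lines of Python. This is an illustrative sketch rather than the study's implementation: the use of the sample standard deviation, the linear interpolation between neighbouring ranks, and the example speeds are all assumptions.

```python
import math
import statistics

def percentile(values, k):
    """k-th percentile using the rank index k/100 * (n + 1) on sorted data.

    Linear interpolation between neighbouring ranks is an assumption.
    """
    ordered = sorted(values)
    n = len(ordered)
    pos = k / 100 * (n + 1)        # 1-based rank position
    pos = min(max(pos, 1), n)      # clamp to the valid rank range
    lower = math.floor(pos)
    if lower >= n:
        return ordered[-1]
    frac = pos - lower
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

def speed_statistics(speeds):
    """Basic statistical measures of the speeds observed in one period."""
    n = len(speeds)
    mean = statistics.fmean(speeds)
    sd = statistics.stdev(speeds)  # sample standard deviation (assumption)
    q1, q3 = percentile(speeds, 25), percentile(speeds, 75)
    return {
        "mean": mean,
        "sd": sd,
        "range": max(speeds) - min(speeds),
        "cv": sd / mean,           # coefficient of variation
        "se": sd / math.sqrt(n),   # standard error of the mean
        "p25": q1,
        "p50": percentile(speeds, 50),
        "p90": percentile(speeds, 90),
        "iqr": q3 - q1,
    }

# Made-up speeds (mph) for one period.
stats = speed_statistics([60, 62, 58, 65, 40, 63, 61])
```

In the study these measures are computed per hour, so the function would be applied to the one-minute speeds falling in each hourly window.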

Travel Time Performance Measures
To analyze the travel time variability for each time period, the following well-known travel time performance measures were also calculated. It should be noted that all the travel time reliability indexes were derived based on a one-hour aggregation level. The travel time performance measures are as follows. Table 3 also presents the different travel time performance equations (Equations (8) to (10)).

•
The
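For orientation, the three travel time performance measures can be sketched under commonly used definitions: TTI as mean travel time over free-flow travel time, PTI as 95th percentile travel time over free-flow travel time, and BTI as the gap between the 95th percentile and the mean, relative to the mean. These definitions, the nearest-rank percentile, and the free-flow time derived from the speed limit are assumptions and may differ from the equations in Table 3.

```python
import math
import statistics

def percentile_95(values):
    """Nearest-rank 95th percentile (a simple choice, assumed here)."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

def reliability_indices(travel_times_s, free_flow_time_s):
    """TTI, PTI, and BTI for one aggregation period (e.g., one hour)."""
    mean_tt = statistics.fmean(travel_times_s)
    p95 = percentile_95(travel_times_s)
    return {
        "TTI": mean_tt / free_flow_time_s,
        "PTI": p95 / free_flow_time_s,
        "BTI": (p95 - mean_tt) / mean_tt,
    }

# Assumed free-flow time: 1.5 miles at the 65 mph speed limit, in seconds.
free_flow = 1.5 / 65 * 3600
# Made-up travel times (seconds) within one hour.
times = [85, 90, 88, 120, 150, 95, 87, 86, 110, 100]
indices = reliability_indices(times, free_flow)
```

Under these definitions, TTI and PTI close to 1 indicate near free-flow conditions, while larger values of all three indices indicate slower or less reliable travel.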

Crowdsourced Data
This study incorporated crowdsourced data along with the speed and travel time variability. Here, the number of Waze user reports (alerts) in each period (one hour) for the study area was calculated. This number was then used as an input for the final model of LOS assessment.
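Counting alerts per hour amounts to truncating each alert timestamp to the start of its hour and tallying. The sketch below assumes the alert times are already available as ISO 8601 strings; the timestamps shown are made up for illustration.

```python
from collections import Counter
from datetime import datetime

def hourly_alert_counts(alert_times):
    """Count alerts per clock hour from ISO 8601 timestamp strings."""
    counts = Counter()
    for stamp in alert_times:
        moment = datetime.fromisoformat(stamp)
        hour_start = moment.replace(minute=0, second=0, microsecond=0)
        counts[hour_start] += 1
    return counts

# Made-up alert timestamps for the study segment.
alerts = [
    "2019-10-01T07:05:00",
    "2019-10-01T07:40:12",
    "2019-10-01T08:15:30",
]
counts = hourly_alert_counts(alerts)
```

Because `Counter` returns 0 for missing keys, hours without any alert can be looked up directly without special handling.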

Step 3: Ground Truth LOS
LOS is a widely used performance measure of the quality of service for a road segment. The HCM identified six LOS categories for freeways and highways based on density and road characteristics. HCM employs the traffic density as the primary measure of LOS for freeway segments [2,34,41,58,59]. Table 4 presents the density pertaining to each LOS [1]. In this study, traffic flow (from RDS sensors) and speed (from Waze) were used to calculate the hourly traffic density. The calculated density was used to obtain hourly LOS based on Table 4. The calculated LOS was used as the ground truth. The hourly input data were also labeled with ground truth values, which were used in the LOS model presented in the next section.
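The ground truth labeling can be illustrated with the fundamental relation density = flow / speed followed by a threshold lookup. The density bounds below are the commonly cited HCM values for basic freeway segments and stand in for Table 4; the lane count and example values are hypothetical, and the conversion of vehicles to passenger-car equivalents is omitted for simplicity.

```python
import bisect

# Upper density bounds (pc/mi/ln) for LOS A to E; anything above is LOS F.
# Commonly cited HCM values for basic freeway segments (assumed here).
LOS_BOUNDS = [11, 18, 26, 35, 45]
LOS_LABELS = ["A", "B", "C", "D", "E", "F"]

def density(flow_veh_per_hour, speed_mph, lanes):
    """Per-lane density from hourly flow and mean speed."""
    return flow_veh_per_hour / (speed_mph * lanes)

def los_from_density(d):
    """Map a density value to its LOS letter."""
    return LOS_LABELS[bisect.bisect_left(LOS_BOUNDS, d)]

# Hypothetical hour: 3900 veh/h at 65 mph on 3 lanes.
d = density(3900, 65, 3)
label = los_from_density(d)
```

`bisect_left` places a density equal to a bound in the lower category, so a density of exactly 11 is still LOS A.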

Step 4: Machine Learning Methods
To accomplish the study objectives and estimate hourly LOS using crowdsourced data, machine learning classification methods were used. In this study, a variety of machine learning algorithms were tested. Among seven methods (Random Forest, Support Vector Machines, K-nearest Neighbor, Decision Tree, Boosted Tree, Naïve Bayes, and Multinomial Logistic Regression), the three methods with the highest accuracy were selected and are reported in this paper. These are as follows:
• Random Forest (RF): RF is an ensemble classification method that combines several random decision trees. In this method, all trees are built independently. Then, it classifies the data based on the majority of votes of all trees;
• Support Vector Machines (SVM): SVMs are well-known margin-based classification methods. For each class, the SVM algorithm finds the optimal support vector that provides the maximum distance to other classes. By calculating the optimal support vectors, the algorithm can identify the boundaries and classify the data;
• K-Nearest Neighbor (KNN): KNNs are non-parametric methods that are widely used for classification. All training data are considered in an n-dimensional feature space (n = number of input features) in this method. For each observation, the algorithm looks for the k (a predefined constant) nearest neighbors based on the Euclidean distance. Then, it assigns the category based on the most frequent label of the neighbors.
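Of the three methods, KNN is simple enough to sketch directly. The snippet below is a bare-bones illustration of the neighbor vote described above, not the tuned classifier used in the study; the two normalized features and the training points are made up.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort training points by Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up training data: (normalized average speed, normalized alert count).
train_x = [(0.95, 0.0), (0.90, 0.1), (0.92, 0.05),
           (0.30, 0.8), (0.25, 0.9), (0.35, 0.7)]
train_y = ["A", "A", "A", "F", "F", "F"]

free_flow_label = knn_predict(train_x, train_y, (0.88, 0.10))
congested_label = knn_predict(train_x, train_y, (0.30, 0.85))
```

Because the vote relies on Euclidean distance, features on very different scales should be normalized first, otherwise the largest-valued feature dominates the neighbor search.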
Since this study implemented different machine learning methods, they had to be compared to find the preferred model. The classification accuracy and Cohen's kappa coefficient were used to choose the preferred model and features. The accuracy captures the ratio of correctly classified predictions (LOS in this study) in comparison to ground truth data. Kappa is another classification performance measure that calculates how close the classified instances are to the labeled ground truth. Kappa discounts the correct predictions occurring by chance, which makes it useful when the data are unbalanced due to the number of observations in each category. It should be noted that the higher the accuracy and Kappa value, the better the performance of the method. The accuracy and Kappa can be calculated using the following equations:
$$\text{Accuracy} = \frac{\text{number of correctly classified observations}}{\text{total number of observations}} \qquad (12)$$
$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$$
where Pr(a) is the ratio of correct classification or accuracy (Equation (12)), and Pr(e) represents the probability of success due to chance.
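As a worked illustration, both measures can be computed from a confusion matrix. Here Pr(e) is estimated in the standard way for Cohen's kappa, from the products of the row and column totals; the two-class matrix is made up for the example.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of observations whose true class is i
    and whose predicted class is j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    pr_a = sum(confusion[i][i] for i in range(size)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement: sum over classes of P(true = c) * P(predicted = c).
    pr_e = sum(row_totals[c] * col_totals[c] for c in range(size)) / total ** 2
    kappa = (pr_a - pr_e) / (1 - pr_e)
    return pr_a, kappa

# Made-up two-class example with 100 observations.
acc, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

In this example the accuracy is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7; this shows how kappa discounts the agreement expected from the class frequencies alone.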

Results
This section first provides the descriptive statistics for all input variables. Then, the machine learning model results are presented. The R programming language (version 4.0.0) was used for all analyses and visualization presented in this section. Missing values in the datasets represented less than 1% of the total population and were therefore removed from the dataset. Additionally, outlier values in the speed dataset represented less than 1% of the total; these were replaced with the median speed value. The number of Waze alerts ranges from 0 to 101 per hour. This suggests that some of the measures require normalization to remove bias in the models. Therefore, some of the speed measures (average, maximum, minimum, and percentiles) were normalized to improve the dataset quality and prevent an imbalance bias of the dataset.
Figure 3 presents a boxplot of the number of hourly crowdsourced alerts for each time of day. It shows that, typically, there are more alerts during the daytime and peak hours than during night hours. Furthermore, Figure 4 shows boxplots of the number of alerts, average speed, TTI, BTI, and PTI in each LOS category. This figure indicates that from LOS A to F, the range of the number of Waze alerts, TTI, BTI, and PTI for each LOS increases, whereas the average speed decreases. Moreover, the range of measures in each LOS category is different. The results suggest that these features can be beneficial in describing the traffic status and LOS.

Model Training and Hyperparameter Tuning
To classify LOS, this study employed a variety of machine learning techniques. Among the tested methods, the three with the highest accuracy (SVM, RF, and KNN) are reported in this paper. As previously mentioned, one month of data (October 2019) was selected for the training and validation datasets. The stratified k-fold cross-validation technique was used for all three techniques to remedy the overfitting problem, reduce the impact of unbalanced label frequencies, and maximize the use of data for both training and testing. In this cross-validation technique, the dataset was randomly divided into k folds with approximately the same number of instances. One fold was used as the validation set, and the remaining k-1 folds were used for training. Each fold was used once as the validation dataset. Then, the final accuracy and Kappa value were calculated as the average of the k validation results. The k-fold cross-validation technique enabled us to tune hyperparameters and increase the classification accuracy. For this purpose, a grid of hyperparameter values was evaluated for each method to select the best model. To account for overfitting, this study limited the range of hyperparameters for each machine learning algorithm.
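The stratified k-fold procedure described in this section can be sketched as follows. This is a simplified illustration, not the routine used in the study: the round-robin assignment within each class is one simple way to stratify, and `evaluate` is a hypothetical callback that trains on one index set and returns a score on the other.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, spreading each class evenly across folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled members of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def cross_validate(labels, k, evaluate):
    """Average the score of `evaluate(train_idx, validation_idx)` over k folds."""
    folds = stratified_folds(labels, k)
    scores = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(evaluate(training, validation))
    return sum(scores) / k

# Made-up, unbalanced labels: 20 hours of LOS A and 10 hours of LOS F.
labels = ["A"] * 20 + ["F"] * 10
folds = stratified_folds(labels, 5)
```

Hyperparameter tuning then amounts to calling `cross_validate` once per candidate setting in the grid and keeping the setting with the best average score.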

Model Selection
Here, three different models were estimated to elaborate on the impact of adding crowdsourced data in terms of the LOS assessment accuracy. By comparing these models using different machine learning methods, the preferred model could be selected. The models differ in their inputs: Model II adds statistical measures to the inputs of Model I, and Model III adds crowdsourced data, so that it uses all inputs. Next, 10-fold cross validation was performed for the three models. Figure 5 displays the result of cross validation for the different machine learning techniques for each model. It can be inferred that, for most methods, adding statistical measures (Model II) improves the accuracy and Kappa in comparison to Model I. Moreover, adding crowdsourced data to the input measures (Model III) increased the performance of all methods. Based on this step, it can be concluded that crowdsourced data improved the LOS classification performance. All three types of input measures were used in the final model.
In order to assess the sensitivity to the number of folds in cross validation, three values of k (3, 5, and 10) were selected. Table 6 compares the selected classification methods with 3, 5, and 10 cross validation folds for Model III (using all inputs).
Table 6. Summary of classification methods with 3-, 5-, and 10-fold cross validation (train dataset).
The results of this study show that machine learning techniques are capable of determining LOS. The RF accuracy with 3-, 5-, and 10-fold cross validation was 0.91, 0.93, and 0.92, respectively. Additionally, all Kappa values for RF were above 0.8, which is acceptable for a classification with six different categories. Among the selected machine learning techniques, RF performed best using Model III inputs. Here, LOS calculated from RDS data was used as the ground truth. The best RF model (with the highest accuracy and Kappa values) was selected to be evaluated with the test dataset. The hyperparameters for the best model (selected RF) were 250 trees, a maximum of 2 features, and a maximum tree depth of 3.

Test Result
As mentioned earlier, two months of data in 2020 were collected to test the methodology. The first month (16 March to 15 April) exhibited a major breakdown in traffic and mobility due to the stay-at-home policy during the COVID-19 outbreak. The second month (August) was selected since traffic had largely recovered (about 90%) toward normal levels. The selected RF model was evaluated by using the test data. Table 7 shows that the test result is close to that of the training dataset. The proposed method was also applied to and tested on other segments of I-40 in the Knoxville area, and the results showed a similar accuracy. This shows that the proposed method is extendible to other locations.

Sensitivity Analysis
The sensitivity of the preferred model (Model III using RF) to different hours of the day was investigated. In the RF model, the corresponding accuracy of each hour of the day was calculated. The accuracy values displayed a range from 0.92 to 0.94 during 24 h of the day. The result highlights that the LOS estimation model is not dependent on the time of day. This method can be used for both peak hours and non-peak hours to estimate the traffic state and LOS. Additionally, the classification accuracy for each ground truth LOS was calculated based on the confusion matrix. Similar to the hourly analysis, each LOS category's accuracy did not deviate from the total accuracy (0.93). This suggests that the proposed method results are not biased due to the frequency of each LOS category.
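Both breakdowns in this section follow the same pattern: group the observations by a key (hour of day or ground truth LOS) and compute the share of correct predictions within each group. A minimal sketch, with made-up labels for illustration:

```python
from collections import defaultdict

def accuracy_by_group(groups, truth, predicted):
    """Share of correct predictions within each group key."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, actual, guess in zip(groups, truth, predicted):
        total[group] += 1
        correct[group] += int(actual == guess)
    return {group: correct[group] / total[group] for group in total}

# Made-up example: grouping by the ground truth LOS itself.
truth = ["A", "A", "B", "B", "B"]
predicted = ["A", "B", "B", "B", "C"]
per_class = accuracy_by_group(truth, truth, predicted)

# Grouping by hour of day works the same way with a different key.
hours = [7, 7, 8, 8, 8]
per_hour = accuracy_by_group(hours, truth, predicted)
```

Comparing the per-group values against the overall accuracy is what indicates whether the model favors particular hours or LOS categories.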

Variable Importance
This study suggests that crowdsourced data can improve the LOS classification accuracy. Accordingly, variable importance analysis was performed based on the preferred RF model (Figure 6). The mean decrease in the Gini index was computed from the RF model. A higher value of this index indicates a higher variable importance. In decreasing order of importance, the average speed, number of crowdsourced alerts, TTI, BTI, minimum speed, PTI, and standard deviation (SD) of speed are the most important variables in determining the LOS. This result is consistent with previous studies in the literature that employ speed and travel time reliability measures when determining LOS thresholds. However, the number of crowdsourced alerts seems to impact the LOS prediction accuracy significantly more than TTI, BTI, and PTI.

Limitations and Future Work
This study proposed a new methodology for estimating LOS using crowdsourced data and machine learning algorithms. However, there were some limitations to this study. This study did not consider the variability and sensitivity of the methodology regarding weather conditions. Additionally, the proposed method used crowdsourced data, travel time variability, and speed statistics measures to estimate LOS. The travel time reliability and speed statistics measures captured the temporal variability of speed and travel time; however, the spatial variation in speed was not considered. In future research, spatial variation can be used as an input variable for LOS assessment. The speed deviation from upstream and downstream segments can also be addressed in LOS estimation. To this end, more complex methods such as deep learning can be deployed. Using deep neural networks such as convolutional neural network (CNN) and recurrent neural network (RNN) could enable future research to simultaneously capture spatial and temporal variation. Furthermore, in this study, Waze speed and alert data were used as the primary data source. In the future, other crowdsourcing methods (e.g., social media) can be examined when estimating LOS. Finally, this study used Waze alert counts regardless of the event type (jam, accident, and hazards). The impact of each type of event on traffic conditions and LOS should be evaluated in the future.

Conclusions
Crowdsourced data availability is increasing rapidly, and machine learning offers the opportunity to analyze it. This study proposed a new methodology to incorporate crowdsourced data in LOS assessment. The method was applied to a 1.5-mile (~2.4 km) segment of freeway on I-40 in Knoxville, Tennessee. Crowdsourced data from Waze were collected, and three categories of input measures (basic statistical measures, travel time reliability, and Waze crowdsourced alerts) were calculated. Machine learning techniques were performed to classify LOS on an hourly basis. Additionally, data collected from fixed location RDS sensors were used to calculate the traffic density and estimate the LOS ground truth using HCM density thresholds.
The results of this study highlight that crowdsourced data and machine learning techniques can be used to estimate LOS. The results revealed that using crowdsourced alerts as an input can significantly improve the model accuracy (about 10%). Moreover, the RF method showed the highest performance among other classification methods in training datasets (accuracy = 0.93 and Kappa = 0.83). Evaluating and testing the trained method also confirmed the classification accuracy. In this method, the LOS estimation accuracy value was relatively consistent among different times of day and LOS categories. Sensitivity analysis confirmed that the accuracy of this methodology does not deviate in traffic peak-hours or non-peak hours. The results also suggest that the average speed, number of alerts, TTI, and BTI are the most important variables in determining LOS.
This method helps to explore the traffic status of freeways without relying on fixed location sensors, times of day, or days of the week. The proposed method has the potential to be applied to different freeway segments to assess LOS. This method does not need fixed location sensors, potentially resulting in lower implementation and maintenance costs. Transportation agencies and DOTs can utilize this method for traffic operation purposes. This method can also analyze freeway traffic in locations outside of urban areas with no fixed location sensors. It benefits from crowdsourced data and can be applied for different time periods, such as hourly, daily, and traffic peak hours.

Data Availability Statement:
Restrictions apply to the availability of these data. Waze crowdsourced data was obtained from Waze with the permission of TDOT. RDS data was obtained from TDOT.