A Comprehensive and Effective Framework for Traffic Congestion Problem Based on the Integration of IoT and Data Analytics

Abstract: Traffic congestion is still a challenge faced by most countries of the world. However, it can be solved most effectively by integrating modern technologies such as Internet of Things (IoT), fog computing, cloud computing, data analytics, and so on, into a framework that exploits the strengths of these technologies to address specific problems faced in traffic management. Unfortunately, no such framework that addresses the reliability, flexibility, and efficiency issues of smart-traffic management exists. Therefore, this paper proposes a comprehensive framework to achieve a reliable, flexible, and efficient solution for the problem of traffic congestion. The proposed framework has four layers. The first layer, namely, the sensing layer, uses multiple data sources to ensure a reliable and accurate measurement of the traffic status of the streets, and forwards these data to the second layer. The second layer, namely, the fog layer, consumes these data to make efficient decisions and also forwards them to the third layer. The third layer, the cloud layer, permanently stores these data for analytics and knowledge discoveries. Finally, the fourth layer, the services layer, provides assistant services for traffic management. We also discuss the functional model of the framework and the technologies that can be used at each level of the model. We propose a smart-traffic light algorithm at level 1 for the efficient management of congestion at intersections, tweet-classification and image-processing algorithms at level 2 for reliable and accurate decision-making, and support services at level 4 of the functional model. We also evaluated the proposed smart-traffic light algorithm for its efficiency, and the tweet classification and image-processing algorithms for their accuracy.


Introduction
Thousands of people lose their lives annually due to road accidents, which also cause many disabilities and injuries, and contribute to other catastrophes through environmental pollution due to traffic congestion [1]. Economic and health aspects are most significantly affected by traffic congestion due to delays in the execution of daily tasks, causing a huge waste of time, effort, and focus, as well as other additional costs [2]. Therefore, traffic congestion has become a fundamental problem and a major challenge for most countries of the world. Advances in digital technology in recent years have an effective role to play in solving this problem. As an example, the Internet of Things (IoT) has transformed everything around us into smart things [3], with a unique identity and the ability to sense and collect information from the surrounding environment, and share this information using radio-frequency identification (RFID) and wireless sensor networks (WSNs) [4]. RFID identifies an object (for example, a vehicle) using a unique identifier

1.
A review of the literature related to traffic congestion.

2.
A comprehensive framework for a reliable, flexible, and efficient solution for traffic congestion problems.

3.
An algorithm for traffic lights for the efficient management of congestion at intersections and the evaluation of its efficiency.

4.
An algorithm for the classification of tweets and the detection of congestion in a particular area and the evaluation of its accuracy.

5.
An accuracy evaluation of an image-processing algorithm for congestion detection from images captured by drones.
The rest of this paper is organized as follows. Section 2 discusses the work related to smart traffic management, Section 3 presents the proposed framework, Section 4 describes a functional model of the framework, Section 5 discusses the experiments conducted and results obtained, and finally, Section 6 presents the conclusion of the study.

Related Work
This section discusses the approaches, techniques, technologies, and tools that have been suggested to overcome issues in traffic management. Recent studies have highlighted many issues related to smart cities and have emphasized the need for smart-traffic management [7][8][9]. To achieve this, many proposals have been made. In one such study, the use of data mining for classifying roads into six categories according to the traffic density (free, low, mid, high, very high, and extreme) was proposed [10]. Depending on the speed of vehicles and the average waiting time, this study predicted the expected congestion on a road and notified the end-user to select the appropriate route. Another study was able to predict traffic congestion by relating travel times with the vehicular load on the route [11]. Similarly, a study proposed a predictive road traffic management system based on the vehicular ad hoc network (VANET) architecture [12]. An algorithm to estimate the speed of a car from a video feed based on the image scale factor was also proposed [13]. More methods for addressing this issue using image processing can be found in [14]. Furthermore, techniques based on image processing and deep learning for remote sensing images can be found in [15]. An overview of image processing algorithms and traffic management issues is presented in [16]. Applications of artificial intelligence (AI) in transportation systems have also been discussed in [17]. Ref. [18] studied the speed, acceleration, and direction of traffic flow using deep learning to manage the traffic flow.
In contrast, many studies have been conducted to investigate the role of sensors in traffic management. One study suggested deploying WSN and RFID on roads to create dynamic traffic lights [19]. However, the accuracy of these devices is affected by factors such as lighting, the random movement of vehicles, etc. Similar work, presented in [20], argued that with the help of a real-time intelligent central control unit, medical emergencies can be handled by pausing traffic in any lane to give priority to ambulances. This can be achieved by relying on RFID in cars and traffic lights. More such work on the application of sensors and RFID in smart traffic management can be found in [21]. A proposal to use scheduling algorithms to select the optimal path for an autonomous vehicle (AV) and suggestions to prepare special paths for this type of vehicle is discussed in [22]. Furthermore, an overview of AV is presented in [23]. However, the initial cost of new infrastructure is a major hurdle. The authors of one study [24] used micro-controllers and infrared (IR) sensors to detect traffic density and control the traffic signal to reduce traffic congestion and avoid unnecessary waiting times. A signal control project based on radio sensors with an Arduino low-power micro-controller has also been developed [25]. This controller works to capture and process information about traffic flow on roads in order to dynamically control the waiting time on the red light of a traffic signal. In a similar study, a method was presented of estimating the time of vehicles arriving at traffic lights based on their speed to inform drivers about street traffic status using computers in their cars [26].
Studies have also investigated the use of sensors for detecting accidents and traffic violations. One of the traditional ways to detect traffic violations is the use of surveillance cameras and radars. However, their accuracy is affected by weather conditions such as fog, heavy rain, and so on. Many researchers have used modern technologies, such as wireless devices and drones, to overcome these limitations of existing methods [27][28][29][30]. Generally, vehicles involved in an accident block the route, which leads to congestion. A European project employed a set of radio sensors to monitor the blocked and moving vehicles [31]. A study used Global Positioning System (GPS) technology to discover accidents by monitoring the speeds of vehicles and to share this information with the accident control center, along with the location of the accident [16]. Another study made use of accelerometers for vehicles and GPS sensors in order to monitor the security and safety of commuters, and share the accident information with their relatives [32]. Ref. [17] employed smart devices in accident detection by implementing a special sensor for accident detection. Similarly, work described in [15] used the concept of dynamic time for accident detection through sensitive sensors in vehicles. A built-in circuit based on a micro-controller and sensors that detects an accident and then sends a signal about it has also been developed [33]. Another study [34] presented an accident control system that sends a text message after recognizing the occurrence of an accident through sensors that capture vibrations. The text message is sent to emergency services and contains the GPS location of the accident. Similar work can be found in [35]. Furthermore, other technologies, such as IoT [36], have been investigated for this purpose.
Monitoring and managing traffic using IoT has been proposed in the literature [37]. These studies generally rely on driver data to support decision-making in relation to traffic lights. It has been argued that as IoT connects physical things to the internet to build smart systems such as intelligent transportation systems (ITSs), communication among vehicles will create a new age of services in the land, sea, and air [38]. In fact, in the Internet of Vehicles (IoV), vehicles will act as sensing points, which bring more services, safety, and efficiency for transportation systems [39]. Moreover, the IoV can be used to provide sensors for environmental conditions, which can meet the needs of smart cities. It has also been suggested that integrating ITS with IoT can achieve the goal of smart transportation [40]. The authors of that study also suggested the use of fog computing. In another such study [41], an increase in location-based service (LBS) applications, especially in smart cities with the IoT, has been observed. Based on this observation, the authors proposed a new framework for merging many IoT services in the transportation sector to monitor public transport. At the same time, researchers have also highlighted the need for a new level of security and privacy to protect the data of users in such a merger of technologies [42]. Preserving privacy is a vital issue in smart systems, which has to be considered because it is the main threat faced by the IoT and its applications [43,44]. Many studies have classified attackers that pose threats to the privacy of users' data and have discussed their skills [45]. Nevertheless, surveys on methods and techniques for preserving privacy have also been discussed [46,47]. A complete overview for the IoT and its structure, layers, phases, applications, and future trends can be found in [48,49].
Using big data to address the congestion issue has also been proposed [50]. Ref. [51] discussed the challenges and drivers of using machine learning algorithms with big data in the transportation field, using Hadoop and Map-Reduce to manage these data and to perform data analytics. The result can be useful in many applications such as parking, vehicle sharing, automatic vehicle positioning, etc. Using fog computing to address these challenges has also been suggested [52]. Ref. [53] discusses the challenges of using fog computing in ITSs.
On the other hand, smart transportation has extended beyond traffic management and is addressing vehicular transportation in general. Studies have investigated harvesting energy using roads, through solar energy and vibrational energy generated by vehicles. This energy is utilized for lighting both roads and traffic lights, and can be used to supplement the electricity grid [54]. Many countries with a large number of electric vehicles have undertaken projects to automatically charge electric vehicles using such technologies [55]. Another study makes use of music generated by roads to warn commuters about the safety of their ride [56]. Many countries, such as Taiwan and Korea, are using this in a bid to avoid accidents related to unsafe driving. Furthermore, vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) technologies have been used to exchange critical information such as accident information between vehicles, and between vehicles and devices installed on the streets [57][58][59]. In this case, roads are equipped with electronic emergency services and vehicles are connected to VANETs [57][58][59]. Reducing the electricity consumption of street lights by controlling street luminance based on traffic flow has also been studied [54]. Studies have also developed techniques that enable measurements of the weight of static or mobile vehicles [54][60][61][62].
A summary of the literature reviewed in this study is presented in Table 1. It can be concluded that there is no single framework for addressing the reliability, flexibility, and efficiency issues of smart-traffic management. Therefore, a framework that acquires the traffic status of streets from multiple sources for reliability, supports the integration of different technologies for flexibility, and provides algorithms for efficiency is needed for smart-traffic management. This work intends to develop such a framework. We propose a comprehensive framework for smart-traffic management. The framework contains four layers. The first layer, namely, the sensing layer, uses multiple data sources to determine the reliability and accuracy of the traffic status of the streets, and forwards these data to the second layer. The second layer, the fog layer, consumes these data to make efficient decisions and also forwards them to the third layer. The third layer, the cloud layer, permanently stores these data for analytics and knowledge discovery. Finally, the fourth layer, the services layer, provides assistant services for traffic management. We also discuss the functional model of the framework and the technologies that can be used at each level of the model. We propose a smart-traffic light algorithm at level 1 for the efficient management of congestion at intersections, tweet-classification and image-processing algorithms at level 2 for reliable and accurate decision-making, and support services at level 4 of the functional model.
We also evaluated the proposed smart-traffic light algorithm for its efficiency, and tweet classification and image-processing algorithms for their accuracy.

| Literature | Central Idea of the Work(s) |
| --- | --- |
| [7,8] | Highlight many issues related to smart cities in which the need for smart-transportation has been emphasized |
| [10] | Use of data mining for classifying roads into six categories according to density |
| [13] | Estimates the speed of a car from a video based on the image scale factor |
| [14] | Methods for addressing traffic congestion using image processing |
| [15] | Techniques based on image processing and deep learning for remote sensing images |
| [16] | Overview of image processing algorithms and traffic issues |
| [17] | Applications of AI in transport systems |
| [18] | Deep learning to manage traffic flow |
| [19] | WSN and RFID on roads to create smart/dynamic traffic light |
| [20] | Real-time intelligent central control unit can deal with traffic emergencies |
| [21] | Application of sensors and RFID in smart traffic management |
| [22] | Uses scheduling algorithms to select the optimal path for an autonomous vehicle (AV) |
| [23] | An overview of AV is presented |
| [24] | Use micro-controllers and IR sensors to detect traffic density and control the traffic signal |
| [25] | A signal control project based on radio sensors with an Arduino low-power micro-controller |
| [26] | Estimating the time of vehicles arriving at traffic lights |
| [27][28][29][30] | Use modern technologies such as wireless devices and drones to overcome limitations of surveillance cameras and radars |
| [31] | Employed a set of radio sensors to monitor blocked and moving vehicles |
| [32] | Use an accelerometer and GPS sensors to monitor the security and safety of commuters, and share the accident information with their relatives |
| [33] | Use a micro-controller and sensors for detecting accidents |
| [34][35][36] | Designed an accident control system that sends a text message after recognizing the accident |
| [37] | Monitoring and managing traffic using IoT |
| [38] | Argues that IoT connects physical things to the internet in order to build smart systems such as ITSs |
| [39] | Proposes using vehicles as sensing points to bring more services, safety, and efficiency for transport system |
| [40] | Integrating an intelligent transportation system (ITS) with IoT can achieve the goal of smart transportation |
| [41] | Found an increase in LBS applications, especially in smart cities with the IoT |
| [42] | Highlighted that a new level of security and privacy is required in the merging of IoT and smart transportation to protect users' data |
| [43,45] | Highlighted that preserving privacy is a vital issue in the smart systems |
| [46,47] | Surveys on methods and techniques for preserving privacy in smart transportation |
| [48,49] | A complete overview of IoT and its structure, layers, phases, applications, and future trends |
| [50] | Using big data to address the issue of traffic congestion |
| [51] | Discusses the challenges and drivers of using machine learning algorithms with big data in the transportation field |
| [52] | Uses fog computing to address smart transportation challenges |
| [53] | Discusses the challenges of using fog computing in ITS and highlights its contribution |
| [54] | Harvesting energy generated on roads by vehicles for powering street and traffic lights |
| [55] | Automatically charging electric vehicles using energy generated by vehicles on roads |
| [56] | Makes use of musical songs generated by roads to warn commuters about the safety of their vehicle |
| [57][58][59] | Exchanges critical information such as accident information between vehicles, and between vehicles and devices installed in the streets |
| [60][61][62] | Develop techniques that enable the measurement of the weights of static or mobile vehicles |

Proposed Framework
The proposed framework is based on four different layers that are integrated with each other to solve the problem. At each layer, several different solutions are offered in order to provide a high degree of reliability, efficiency, and flexibility. Figure 1 shows the general structure of the proposed framework.

Sensing Layer
This layer collects data for the higher layers through various IoT infrastructures, such as wireless sensor networks, radio-frequency identifiers, and so on. These infrastructures can be set up on streets and at intersections, and can be complemented by devices embedded in vehicles and by the smartphones of drivers and passengers. As an example, social networking applications, which are nowadays embedded in the dashboards of smart cars and are used in many developed countries, can serve as a source of data.

Fog Layer
This layer distributes its nodes densely across different places on main roads within cities and at intersections. These nodes can be fixed, such as a traffic light or a roadside unit. They may also be mobile, such as a drone or a special vehicle that provides computing services. Each fog node is responsible for a small cell (or area): it collects all the data generated in that area, processes them, and filters and reduces them before sending them to the cloud. This is performed in order to reduce the amount of data to be transmitted, to eliminate useless data, and to mitigate the inherent latency of the networks. At the same time, fog nodes provide fast computing services and process the received data in real time. Consequently, they are able to respond in real time to any situation that requires an immediate response.
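
The filter-and-reduce role of a fog node described above can be illustrated with a minimal Python sketch. It is not part of the proposed framework's implementation: the function name `summarize_cell`, the `(street, source, count)` reading format, and the `max_plausible` bound are assumptions made only for illustration.

```python
from collections import defaultdict
from statistics import median


def summarize_cell(readings, max_plausible=500):
    """Reduce raw (street, source, vehicle_count) readings to one record per street.

    Hypothetical helper: the reading format and the plausibility bound are
    illustrative assumptions, not taken from the paper.
    """
    per_street = defaultdict(list)
    for street, source, count in readings:
        # Filter: drop readings that are missing or physically implausible.
        if count is None or count < 0 or count > max_plausible:
            continue
        per_street[street].append(count)
    # Reduce: forward only the median count and the number of readings used.
    return {
        street: {"vehicles": median(counts), "samples": len(counts)}
        for street, counts in per_street.items()
    }


readings = [
    ("A", "camera", 12), ("A", "wsn", 14), ("A", "rfid", 13),
    ("A", "camera", -1),            # faulty reading, filtered out
    ("B", "wsn", None), ("B", "rfid", 4),
]
summary = summarize_cell(readings)
```

Only the compact `summary` would then be sent to the cloud layer, while the raw readings stay within the cell.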

Cloud Layer
This layer collects and permanently stores all the information provided by the previous layer to generate a huge dataset based on continuous monitoring and tracking by the layers beneath it. Thus, these data can be relied on to train special models to discover the knowledge and behaviour of traffic, and the nature of each street or area across different times of the day, week, month, and year. This layer also has an important role to play in managing the integration between heterogeneous applications, services, and devices. Furthermore, this layer is responsible for maintaining the privacy and security of users' data.

Services/Applications Layer
This layer plays an important role in solving the traffic congestion problem. It can provide services such as pre-booking of car parking and support for mass transportation, especially for students and employees, as well as applications for medical emergencies and applications that help users share information about congestion in specific areas. Finally, various LBS applications, such as searching for points of interest or the shortest or fastest path, tracking, and other applications, can be realized in this layer. This layer enables service providers to add their services and their applications. The data generated by these applications and services will also serve as a source of new data in the cloud.

Functional Model of the Framework
The functional model of the framework is presented in Figure 2. For each layer of the framework, there is a corresponding level in the functional model, highlighted in yellow. To illustrate how these levels collaborate to solve traffic management issues, the following scenario is considered. If there is a traffic jam in a specific street, then the first level (smart traffic light) will give extra time to this street to ease the congestion. The second level will notice that the street is congested and will send alerts to vehicles to stay away from this street for the time being and reroute their journey, which will further contribute to the speedy resolution of the problem. In this case, the third level is responsible for storing the data, and data analytics will reveal whether this scenario occurs repeatedly. This will raise the alarm to solve this chronic problem by, for example, diverting the traffic at a specific time, or building a tunnel or bridge, and so on. Finally, the fourth level, through various applications, such as the parking service, will help to alleviate the problem by limiting traffic into the congested area, by providing parking places, and so on.
This framework is comprehensive because it is based on the principle of integrating many solutions together to create an effective solution. For example, if there is congestion on one side of a traffic signal, the smart signal in the first level will increase the time of the green signal on the congested side to allow more vehicles to cross. However, at the same time, if the congestion is large and the waiting queue is long, the second level will notice that there is congestion at the street level and not only at the intersection. It will send alerts to drivers to avoid this street and suggest that they take alternate routes to their destination. This will contribute significantly to resolving the congestion. The third level will store all the information according to the place and time, and after collecting huge volumes of these data, it is possible to discover the behavior of traffic on streets that are often crowded. Hence, this information will support decision-makers in taking action at the city level in re-planning the movement and flow of traffic by building bridges or tunnels. Finally, at the fourth level, assistant applications contribute towards reducing the causes of congestion, such as looking for parking spaces in the city center (which otherwise causes very slow traffic movement) or using private vehicles instead of mass transit for school and work trips (which otherwise causes crowding at peak times). In addition, as proposed in the framework, the diversity of data sources at each level, for the purpose of reliability, is also achieved in the functional model. In the first level, we rely on cameras, WSNs, and RFIDs to know the number of vehicles in the street. In the second level, we rely on drones, tweet analysis, and a smartphone application that enables commuters to know the traffic status of the streets.

Level 1-Traffic Congestion at Intersections
Congestion at intersections is the seed of a traffic jam, especially at peak times. The solution is to rely on intelligent, dynamic traffic signals that adapt to the behaviour of traffic on each side of the intersection, instead of static traffic lights that give a fixed time slot to traffic on each street. An alternative solution is building bridges or tunnels, and this is always the best solution if available, but it is very expensive and may not be possible in all situations.
To create a smart traffic light, we need to know the number of vehicles in each street. We can acquire this information from three different types of sources. The first option is to rely on control cameras (either a camera in each street at the intersection or a moving camera that takes successive images). This is considered the easiest option in terms of its application, cost, and maintenance, and it does not affect traffic during installation or maintenance. However, the accuracy of the images may be influenced by luminance factors such as weather and lighting. Nevertheless, the threshold for tolerating the noise can be customized as we are interested in estimating congestion based on the size and number of vehicles. The second option is to rely on sensors on the roadside and in the center of each road, separated by different distances, to measure the number of vehicles passing through. This ensures accuracy but may disturb traffic flow during installation and maintenance. The third option is to assign a unique radio ID to each vehicle, so that each vehicle connects to the nearest receiving unit (which may be a traffic light). The location of each vehicle and the number of vehicles on each route can therefore be ascertained. Although this is effective, it may not be available in many countries for privacy reasons.
It is also possible to use more than one of these sources together to increase the level of availability and reliability. All the data reach the fog node responsible for the intersection (which may be a traffic signal light), which processes them (for example, the images captured by cameras) and calculates the congestion level, the number of vehicles on each route, and so on. Accordingly, the timing of the signals can be controlled so that a street with a huge number of waiting vehicles is given additional time at the expense of a side where there are no or few waiting vehicles. Starvation of any street is avoided by limiting the waiting time for any route to a specific amount. Moreover, all data pertaining to the congestion in each route and at each intersection are sent to the cloud for later use in the future planning of the city network and the flow of vehicles within it.
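
One simple way to realize this timing rule is sketched below in Python. It is only an illustration under stated assumptions and is not the STL algorithm evaluated in this paper: the function name `allocate_green`, the proportional split, and the `cycle`, `min_green`, and `max_green` values are all hypothetical. Every approach receives at least `min_green` seconds in each cycle, which bounds the waiting time of any street and so avoids starvation.

```python
def allocate_green(waiting, cycle=120, min_green=10, max_green=60):
    """Split one signal cycle (in seconds) among approaches by queue length.

    Hypothetical sketch: proportional sharing and all default values are
    illustrative assumptions, not parameters from the paper.
    """
    n = len(waiting)
    total = sum(waiting.values())
    # Time left after every approach gets its guaranteed minimum.
    spare = cycle - n * min_green
    greens = {}
    for street, cars in waiting.items():
        # Share the spare time in proportion to the number of waiting
        # vehicles (evenly if no vehicle is waiting anywhere).
        share = spare * cars / total if total else spare / n
        # Cap the green time so that one street cannot hold the signal.
        greens[street] = min(max_green, min_green + share)
    return greens


greens = allocate_green({"north": 30, "south": 10, "east": 0, "west": 0})
```

In this example the congested north approach is capped at `max_green`, while the empty east and west approaches still receive `min_green` seconds. Because of the cap, the allocated times may add up to less than `cycle`; a real controller would need to decide how to use the remainder.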
In our implementation of the framework, we used cameras at traffic lights, WSNs, and RFID to collect the data about traffic flow from the streets, i.e., the number of cars. These data are aggregated and fed to our proposed algorithm for a smart traffic light (STL), discussed below in Listing 1.

Level 2-Traffic Congestion on Main Streets
Congestion in main streets is usually due to obstacles such as excavations, traffic barriers, traffic accidents, rain and floods, and so on. Therefore, it is more important to discover the cause of congestion instead of just noting its occurrence. The solution is to then distribute the vehicular load of the congested street by informing the commuters about the condition of the street so that they can reroute to reach their destinations. Traffic lights can also be used to slow down the flow towards the congested street. The integration between different approaches listed below provides a higher level of accuracy and reliability for the information center to make such decisions.

1.
Using information from service providers, such as Google Maps, OpenStreetMap, and so on, to determine the traffic status of streets and the degree of congestion, without specifying the cause or type of congestion. In our implementation of the framework, we used Google Maps.

2.
Using crowd sourcing to exchange data between vehicles through a specific application in the event of any congestion or abnormal traffic on the street. This approach is more accurate as the incoming data are specific to the event, but it requires a high level of awareness from commuters in their use of the service. In future, this service can be automated within smart vehicles. In our implementation of the framework, we built an application for this service that works on smartphones, and contributes to the dissemination of this information to allow others to confirm the news.

3.
Drones distributed at central vital points can be used to capture and share images to the information center to be processed automatically. This can provide information on the size of the congestion and help in distributing the vehicular load to other streets. In our implementation of the framework, we used an image processing algorithm to calculate the severity of the congestion that has been previously published [63].

4.
Social media is one of the fastest ways to spread news today. In our implementation of the framework, we used Twitter. We proposed two algorithms based on text exploration and natural language processing to detect traffic congestion. The objective of the first algorithm is to classify tweets related to the vehicular traffic in a particular street or area, whereas the second algorithm detects and reports the events related to congestion.
Listing 2 shows the algorithm used to classify the tweets, which is described below.

2.
Filter the collected tweets and extract only those that are related to the specific area.

3.
Remove repeated tweets to reduce the total number of tweets.

4.
Pre-process each tweet in the selected group.
(a) Clean the tweets of special characters, punctuation, letters from other languages, numbers, etc., and replace them with empty spaces.
(b) Tokenize the tweet into a list of tokens, splitting at each blank space.
(c) Remove the lexemes that do not affect the result of the classification (such as articles, pronouns, etc.) using the NLTK library [64].
(d) Normalize the text. This is needed when there are many similar characters with the same meaning.
(e) Stemming: return each word to its original root by removing some characters (such as the prefix or suffix) using Porter's algorithm [64].
Note that the steps listed above reduce the number of distinct words or terms by more than 60% compared to the original text.

5.
Extract features and classify the tweets.
(a) One approach is based on machine learning: the features of each tweet are extracted by calculating the TF-IDF weight of each term. TF is the relative frequency of a term t within a tweet, and IDF is the logarithm of the total number of tweets (documents) divided by the number of tweets containing the term t. The algorithm uses the product TF × IDF, where TF is calculated as shown in Equation (1)

$$\mathrm{TF}(t) = \frac{\text{number of times } t \text{ appears}}{\text{total number of terms}} \tag{1}$$

and IDF is calculated as shown in Equation (2)

$$\mathrm{IDF}(t) = \ln\left(\frac{\text{total number of docs}}{\text{number of docs with term } t \text{ in them}}\right) \tag{2}$$

(b) The second approach is based on the DMOZ dictionary [65]. This dictionary is used to count the words related to vehicular traffic, and if the count is larger than a threshold value, the tweet is classified in that category.
(c) The third approach is to find the part of speech (POS) of each term in the tweet. We can then measure the similarity with the root of a term for vehicular traffic and its branches in a WORDNET Tree in NLTK [64]. If the result is more than the minimum, the tweet is classified in that category.
(d) Finally, an ontology of important words can be built based on previous traffic data so that the most important words that symbolize vehicular traffic are identified.

Similarly, the algorithm shown in Listing 3 is used to identify tweets pertaining to vehicular traffic disruption, which is accomplished as follows:

1.
Create a list of terms that symbolize a congestion or disruption of traffic flow in any street or area, such as an accident, police checkpoint, traffic stop, congestion, fire, heavy rain, repair work, and so on.

2.
Check whether the number of tweets containing one of the terms listed above exceeds a minimum threshold, which indicates that such an event is taking place in that area or street. Commuters in other areas driving towards the event area are then notified to avoid it, which limits the size of the problem, prevents it from getting worse, and speeds up its resolution.
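The preprocessing, TF-IDF weighting, and threshold check described above can be illustrated with a minimal, standard-library sketch. The stop-word set, the plural-stripping rule (a crude stand-in for Porter's algorithm), the event vocabulary, the per-term counting, and the `MIN_TWEETS` threshold are all illustrative assumptions rather than the values or listings used in this work; only the weighting follows Equations (1) and (2).

```python
import math
import re
from collections import Counter

# Illustrative assumptions: tiny stop-word list, event vocabulary, and threshold.
STOP_WORDS = {"a", "an", "the", "is", "on", "at", "in", "of", "to", "and", "i", "it"}
EVENT_TERMS = {"accident", "congestion", "checkpoint", "fire", "rain", "repair"}
MIN_TWEETS = 1  # hypothetical minimum; an event needs more tweets than this


def preprocess(tweet):
    """Clean, tokenize, drop stop words, and crudely stem one tweet."""
    # Lowercasing is a simple normalization; everything but a-z becomes a blank.
    text = re.sub(r"[^a-z\s]", " ", tweet.lower())
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    # Naive plural stripping, NOT Porter's algorithm.
    return [t[:-1] if len(t) > 3 and t.endswith("s") and not t.endswith("ss") else t
            for t in tokens]


def tf_idf(docs):
    """Return one {term: TF * IDF} dict per tokenized tweet (Equations (1) and (2))."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def detect_events(docs):
    """Return the event terms that occur in more than MIN_TWEETS tweets."""
    mentions = Counter(term for doc in docs for term in set(doc) & EVENT_TERMS)
    return {term for term, n in mentions.items() if n > MIN_TWEETS}


tweets = [
    "Huge accident on King Road!!! Avoid it",
    "Accidents again at King Road, traffic stopped",
    "Lovely sunny day in the park",
]
docs = [preprocess(t) for t in tweets]
weights = tf_idf(docs)
events = detect_events(docs)
print(docs[1])
print(events)
```

Counting mentions per term, rather than across all terms, is a design choice made here so that the detected term can also hint at the cause of the disruption.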

Level 3-Analysis of Historical Data
This level performs the analysis of historical data pertaining to traffic flow to discover knowledge for city planning. Many technologies and applications have been employed in the first and second levels of the model. This implies that huge volumes of data are consequently collected in the cloud. These data provide a wealth of information that must be worked on to obtain the valuable information that will have a significant impact on improving the level of services and support in traffic management. The main objectives of this level are:

1.
Apply machine learning algorithms to classify and aggregate data on congestion based on different criteria such as days, regions, and congestion level, as well as the cause of congestion. This enables the decision maker to develop correct strategies to avoid or resolve these issues.

2.
Apply machine learning algorithms to classify and predict the percentage of congestion in a specific area at a specific time in the future and work in advance to avoid it.
It is worth mentioning here that this level will be implemented in future work when we have collected enough data for analysis over time. Nevertheless, this level has the following steps:

First Step-Data Collection
The first step is to collect data coming from the applications and devices used in the first and second levels of the model, which mostly measure the level of congestion at a specific time and place. We also rely on the accident database, which is mainly registered with government agencies.
The data received from WSNs are initially collected within the cache in the fog node and then sent periodically to the cloud to be stored in the database permanently in the format shown in Figure 3. The congestion can be represented by any one of the following values-normal, moderate, heavy, stop. The data received from smart traffic lights are stored in a similar format. However, the data from mobile applications are stored in the format shown in Figure 3. The reason field can have any one of the following values-students leaving, employees returning, barrier, weather conditions, excavations and maintenance, accident. The data received from the LBS, such as Google Maps, are shown in Figure 3. The data received as a result of tweet analysis are stored in the format shown in Figure 3. Finally, we merge all the data into one database to reduce the volume of data by removing duplicates based on the location, time and date, and so on. Moreover, the database is also used for storing the data pertaining to accidents.
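As a rough illustration of the final merging step, the sketch below combines records from several sources and drops duplicates that share the same location, date, and time. The `Record` fields and source names are hypothetical; the actual storage formats are those of Figure 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical fields; the real formats are defined in Figure 3.
    source: str      # e.g. "wsn", "traffic_light", "mobile_app", "lbs", "tweets"
    location: str
    date: str        # ISO date such as "2021-03-14"
    time: str        # "HH:MM"
    congestion: str  # one of: normal, moderate, heavy, stop


def merge_sources(*batches):
    """Merge batches, keeping the first record seen for each (location, date, time)."""
    merged = {}
    for batch in batches:
        for rec in batch:
            merged.setdefault((rec.location, rec.date, rec.time), rec)
    return list(merged.values())


wsn = [Record("wsn", "King Road", "2021-03-14", "08:00", "heavy")]
tweets = [
    Record("tweets", "King Road", "2021-03-14", "08:00", "heavy"),   # duplicate key
    Record("tweets", "Main Street", "2021-03-14", "08:00", "moderate"),
]
merged = merge_sources(wsn, tweets)
print(len(merged))
```

Keeping the first record per key is only one possible policy; when sources disagree on the congestion level, a majority vote or a source-priority rule may be more appropriate.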

Second Step-Data Processing and Cleaning
This step aims to process and clean the data for standardization and normalization. First, we process null values by either deleting the entire record or replacing the null with the mean or median of the values of other records. Second, outliers are treated in the same way as null values: they are deleted or replaced with the mean or median value of other records in order to limit their negative impact on learning and on the accuracy of the results. Third, we convert the nominal or categorical data into numbers to enable their processing. As an example, the level of congestion is expressed in numerical values ranging from 1 to 4, with 4 being the highest level of congestion. Subsequently, scaling is performed so that the values of each column lie between 0 and 1. Furthermore, when categorical columns are encoded using the One-Hot Encoder algorithm [64], we delete the redundant dummy column. For example, after encoding the column for the congestion level, one of the four resulting columns can be omitted because its value can be deduced from the others. In other words, if the values of moderate, heavy, and stop are all equal to zero, then normal must be one, without the need for its column to exist in the database. Next, we process the data for standardization. For this purpose, we can use one of two methods: normalize or standardize, as shown in Equations (3) and (4).
Furthermore, we merge or delete similar data to prevent duplication. Finally, we ensure that the data are balanced so that the number of records in each class of the output variable (e.g., the congestion level) is as close as possible to the number of records in the other classes. Consequently, some records in small classes may be duplicated or some records in larger classes may be omitted.
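For reference, the two rescaling options are commonly written as follows, where x is a raw value of a column, x_min and x_max are the column minimum and maximum, and μ and σ are the column mean and standard deviation. These are the textbook forms of min-max normalization and z-score standardization; treating them as the forms intended by Equations (3) and (4) is an assumption.

```latex
x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}},
\qquad
x_{\text{std}} = \frac{x - \mu}{\sigma}
```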
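The cleaning operations of this step can be sketched with the standard library alone. The sample values and function names below are hypothetical, and median imputation is chosen arbitrarily over the mean; the level names are those listed in the first step.

```python
from statistics import mean, median, pstdev

LEVELS = ["normal", "moderate", "heavy", "stop"]  # congestion values named in the text


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot_drop_first(level):
    """Encode a level as three 0/1 columns; 'normal' is implied by the all-zero row."""
    return [1 if level == name else 0 for name in LEVELS[1:]]


def min_max(values):
    """Scale a column to [0, 1]; assumes the column is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center a column on 0 with unit spread; assumes a non-zero standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


speeds = impute_median([20, None, 40, 60])   # hypothetical sensor readings
scaled = min_max(speeds)
standardized = z_score(speeds)
print(speeds, scaled)
print(one_hot_drop_first("normal"), one_hot_drop_first("heavy"))
```

Dropping the first dummy column is what makes "normal" recoverable from the three remaining columns, which is the redundancy argument made above.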

Third Step-Feature Selection
This step aims to determine the most important features of the collected and processed data. Several standard methods can be used to determine the most influential features, such as keeping all features, backward elimination, forward selection, bidirectional elimination (both backward and forward), comparing all possible subsets, or consulting an expert in the field.
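Forward selection, one of the methods listed above, can be sketched generically as a greedy loop around a scoring function. The feature names and the toy scoring function below are purely hypothetical; in practice the score would typically be a validation metric of a model trained on the candidate subset.

```python
def forward_selection(features, score, min_gain=1e-9):
    """Greedily add the feature that most improves score(subset) until no gain remains."""
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda pair: pair[0])
        if top_score - best <= min_gain:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy score: hypothetical usefulness of each feature minus a cost per feature used.
USEFULNESS = {"hour": 0.30, "weekday": 0.20, "weather": 0.10, "sensor_id": 0.01}


def toy_score(subset):
    return sum(USEFULNESS[f] for f in subset) - 0.05 * len(subset)


selected, best = forward_selection(USEFULNESS, toy_score)
print(selected, round(best, 2))
```

Backward elimination follows the same pattern in reverse: start from all features and repeatedly remove the one whose removal improves the score the most.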

Fourth Step-Algorithm Selection
This step selects appropriate algorithms for classification and clustering. For classification and prediction, we use linear regression, logistic regression, decision tree, random forest, boosting, naïve Bayes, support vector machine, K-nearest neighbor, neural network, and deep learning methods. For clustering and dimensionality reduction, we use K-means, principal component analysis, and singular value decomposition methods [66].

Fifth Step-Evaluation
In this step, we use various metrics for comparison and evaluation such as the confusion matrix, F-score, accuracy, performance, precision, recall, and so on [66]. After testing the results on new data, we evaluate whether the best-performing model suffers from overfitting, i.e., whether the model was over-trained and therefore provided good results on the training data that may not hold when the model is applied to the real dataset. Conversely, if the model yields weak accuracy and did not train well, we may face a scenario of under-fitting. In either case, we must reconsider the training data, increase its size, or reconsider the selected features, and so on.
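The metrics named above follow directly from the four confusion-matrix counts, as the sketch below shows for a binary problem. The `fit_diagnosis` helper and its thresholds (`max_gap`, `min_acc`) are illustrative assumptions, not criteria taken from this work, and the F-score is assumed to be the balanced F1 variant.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from the confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def fit_diagnosis(train_acc, test_acc, max_gap=0.10, min_acc=0.70):
    """Heuristic check; both thresholds are hypothetical."""
    if train_acc < min_acc:
        return "underfitting"   # the model did not even learn the training data
    if train_acc - test_acc > max_gap:
        return "overfitting"    # good on training data, much worse on new data
    return "ok"


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
result = metrics(*counts)
print(counts, result)
```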

Sixth Step-Visualization
The results are displayed graphically and in the form of different plots, depending on the requirements and services rendered.

Level 4-Support Services
We developed a set of useful applications that exploit the generated knowledge to reduce congestion. These applications include:

1.
A parking reservation application that provides a pre-booking car-parking service and allows inquiries about the status of a parking space.

2.
A driver awareness application that helps commuters to enhance their awareness about the traffic rules and the status of the streets.

3.
An LBS-based application for searching points of interest.

4.
An application that supports public transportation for schools, organizations, and government institutions to solve the problem of congestion.

5.
An application that exploits smart signals to enable medical emergency vehicles such as ambulances to arrive at their destination without any delays due to traffic congestion.

6.
An application that exploits sensors to measure pollution and noise levels in cities to provide smarter services.

Figure 4 shows the technologies that were used at each level of the functional model. We used a wireless network to link these devices. At the traffic light, the cameras send the images to fog nodes via a Wi-Fi network. The sensors and radio identifiers use two types of protocols (Bluetooth and Zigbee) to send their data to the side-unit or default gateway, which forwards them to a fog node via Wi-Fi. The fog node communicates with the cloud to send its data periodically using an internet connection.

Second Level-Main Streets
The level of congestion on roads, detected through several different methods, is used to alert commuters so that they can choose other routes. These methods include:
• Drones (model number 4drc F11 Pro Com Câmera 4k Ptz) that capture and send images to the nearest station to be analyzed for the detection of any traffic congestion, shown as object 4 in Figure 4. The algorithm for processing these images has been published in our previous work [65].
• The proposed algorithm for analyzing tweets, shown as object 5 in Figure 4.
• APIs of service providers such as Google Maps or OpenStreetMap that return information about the traffic state of a specific location, shown as object 6 in Figure 4.
• A mobile application developed to enable users to collaborate by sending and confirming warnings about the traffic state on streets, shown as object 7 in Figure 4.

Third Level-Cloud Level
The data from the first and second levels are collected for a period of time and then analyzed using Microsoft Azure to discover the behavior of traffic on roads at the city level, in order to support the decision-maker in taking special actions for city planning.

Fourth Level-Applications and Support Services
These applications and services have an indirect role in solving the congestion problem. The services provided include:
• A Smart Parking App to reserve parking in advance to avoid the search for stopping places, which otherwise results in congestion.
• The use of public transportation for school students or employees, instead of private cars, to avoid congestion at peak times. In addition, modifying the attendance times of workplaces and schools can contribute to reducing congestion.

Experiment and Results
In this section, we discuss the evaluation of the proposed framework. To do so, we evaluate each level of the functional model of the framework.

Evaluation of Level 1
In this level, we proposed an algorithm to improve the efficiency of traffic lights at intersections. To evaluate our proposed algorithm for a smart traffic light (STL), we compare our work with ITMS [67]. ITMS also aims to improve the efficiency of a traffic light at an intersection by using differential time-slots for congested streets. Since STL uses a similar approach, in our experiments we compared both algorithms (i.e., ITMS and STL) with a traditional traffic light (i.e., a fixed time-slot for each street). The main difference between STL and ITMS is that STL periodically (every 10 s) checks the congestion rate on each street of the intersection, whereas ITMS checks only once and then opens the street with the maximum number of vehicles until all of them have passed. However, in both algorithms there is a maximum waiting limit for other signals that should not be exceeded.
We used MATLAB simulations to evaluate the three algorithms (STL, ITMS, and traditional) in terms of the average waiting time and the number of vehicles serviced for each street. We used the same simulation parameters as in the ITMS experiments. These were as follows:
• The total evaluation time was 1 h (3600 s).
• The number of streets at the intersection was 4.
• The inter-arrival time for vehicles on the first street ranged between 1 and 20 s per vehicle. This implies that the traffic congestion on the road decreases as this value increases.
• The inter-arrival time for vehicles on the second, third, and fourth streets was fixed at 30 s per vehicle. This implies that these streets were not congested.
• The time to get ready for an orange light was 2 s.
• The maximum waiting time for any signal was 200 s.
• The maximum opening time (green light) was 60 s.
• The minimum opening time on the green light was 5 s (in case all cars had passed).
• The street without vehicles did not open at all.

Figure 5 shows the results of our evaluation in terms of the number of vehicles serviced by STL, ITMS, and the traditional traffic light. When the inter-arrival time was small, congestion was more likely to happen; conversely, when the inter-arrival time was very large, congestion was less likely to happen. In Figure 5, it is evident that when the inter-arrival time was less than 10 s, the number of vehicles serviced by the proposed algorithm and ITMS was higher than that of the traditional method. In fact, for an inter-arrival time of 1 s, the traditional method serviced less than half as many vehicles as the proposed algorithm and ITMS. It is also evident that as the inter-arrival time increased, the gap between STL/ITMS and the traditional method in terms of the number of vehicles serviced became smaller. In fact, when the inter-arrival time reached around 20 s, all algorithms serviced the same number of vehicles. This implies that both the proposed algorithm and ITMS serviced a larger number of vehicles than the traditional method in a high-congestion environment. Comparing ITMS and STL, however, the number of vehicles serviced was the same for all the inter-arrival times. This is expected in a scenario where only one street is always congested. If there is no congestion, the performance of both systems tends to be close to that of the traditional one, as shown in Figure 5.

Average Waiting Time
Figure 6 shows the results of our evaluation of the average waiting time when one street is congested. It is evident from the figure that the average waiting time for the vehicles changed according to changes in the inter-arrival time of vehicles. STL and ITMS yielded similar results for all the inter-arrival times. Compared to the traditional traffic light, these algorithms yielded better results when the inter-arrival time was greater than 2 s and less than 8 s. This implies that when the street is highly congested, STL and ITMS significantly reduce the average waiting times of vehicles.
However, when two streets were congested, our experimental results demonstrate that STL performed better than ITMS when the inter-arrival time on both streets was less than 7 s, as shown in Figure 7. This implies that when two streets are highly congested, STL outperforms not only the traditional traffic light but also ITMS.
The results shown in Figures 5-7 clearly indicate that STL performed better than the traditional method in terms of the number of vehicles serviced and the average waiting times for vehicles when the streets were highly congested. These results also indicate that STL performed very closely to ITMS in terms of the number of vehicles serviced and the average waiting times of vehicles when a single street was highly congested. However, STL outperformed ITMS in reducing the average waiting times for vehicles when two streets were highly congested. This is because ITMS keeps the street open until all the cars have passed, which worsens the congestion on the other street. In contrast, our algorithm distributes the time equally between the two streets. In other words, our algorithm achieves a fairer distribution of green time between the two streets, which is the reason for its better performance in reducing the average waiting time.
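The behavioral difference between the two policies discussed in this section can be made concrete with a small decision-rule sketch. This is a simplified reading of the textual description, not the published STL or ITMS listings: the function names, the longest-queue rule, and the tie-breaking are assumptions, the constants are taken from the simulation setup, and the 2 s orange phase is not modeled.

```python
MAX_WAIT = 200   # s, maximum waiting time for any signal
MAX_GREEN = 60   # s, maximum opening time
MIN_GREEN = 5    # s, minimum opening time


def busiest(queues, current, exclude_current=False):
    """Street with the longest queue (ties favor the open one); None if all are empty."""
    candidates = [i for i, q in enumerate(queues)
                  if q > 0 and not (exclude_current and i == current)]
    if not candidates:
        return None
    return max(candidates, key=lambda i: (queues[i], i == current))


def starved(queues, waited, current):
    """Another street with vehicles whose wait has reached MAX_WAIT, if any."""
    over = [i for i, w in enumerate(waited)
            if i != current and queues[i] > 0 and w >= MAX_WAIT]
    return max(over, key=lambda i: waited[i]) if over else None


def choose_itms(queues, waited, current):
    """ITMS-like rule: keep the open street green until its queue is drained."""
    forced = starved(queues, waited, current)
    if forced is not None:
        return forced
    if queues[current] > 0:
        return current
    nxt = busiest(queues, current)
    return current if nxt is None else nxt


def choose_stl(queues, waited, current, green_elapsed):
    """STL-like rule, meant to be called at every periodic check (every 10 s)."""
    forced = starved(queues, waited, current)
    if forced is not None:
        return forced
    if green_elapsed < MIN_GREEN:
        return current
    nxt = busiest(queues, current, exclude_current=green_elapsed >= MAX_GREEN)
    return current if nxt is None else nxt


queues = [8, 12, 0, 0]   # vehicles queued on each of the four streets
waited = [0, 40, 0, 0]   # seconds each street has been waiting on red
itms_choice = choose_itms(queues, waited, current=0)
stl_choice = choose_stl(queues, waited, current=0, green_elapsed=20)
print(itms_choice, stl_choice)
```

With two congested streets, the ITMS-like rule stays on street 0 until it empties, whereas the STL-like rule switches to the longer queue at the next check, which is consistent with the explanation given above for the lower average waiting time.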

Evaluation of Level-2
In this level, we proposed two natural language processing (NLP) algorithms to classify tweets based on their relevance to traffic congestion and to detect the presence of traffic congestion from the identified tweets, with the possibility of determining the cause. We evaluated the accuracy of the proposed tweet classification and detection algorithms. We also evaluated the accuracy of the image processing algorithm proposed in our previous work [65].

Accuracy of Tweet Classification
We implemented the proposed algorithm using the Python language and evaluated it on datasets collected from Twitter. Six datasets built using TF-IDF as a criterion for their importance were used to experiment with 12 queries (12 stories). Based on the number of important words belonging to the vehicular traffic in each tweet, TP, TN, FN, and FP were calculated. Finally, accuracy, precision, recall, and F-scores were calculated. The results shown in Figure 8 indicate that an average accuracy of nearly 97% was achieved by the proposed algorithm. This high accuracy is the result of a simple matching process with words belonging to the vehicular traffic.
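For reference, the four reported metrics are conventionally derived from the TP, TN, FP, and FN counts as follows. The F-score is written here in its balanced (F1) form, which is an assumption, since the weighting used in the experiments is not stated.

```latex
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
```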

Accuracy of Images Processed
The algorithm for processing images captured by drones was proposed in our previous work [65]. We processed the images received from the drones using this algorithm to detect the approximate number of vehicles in any given image. Figure 9 shows the accuracy of the algorithm in detecting the level of congestion from the images captured using drones. The results indicate that the number of vehicles detected by the algorithm was close to the actual number of vehicles in all the experiment runs, yielding an average accuracy of 95%.
Figure 9. Evaluation of the accuracy of the processing of images captured using drones for congestion detection.

Evaluation of Level-3
This level requires data to be collected over a period of time to train and use machine learning algorithms for the knowledge discovery process to facilitate city planning in relation to traffic flow. This study is part of a research project, and we are still in the process of collecting data and designing algorithms for this level. Therefore, we were not yet able to evaluate this level in our experiments.

Validation of Level-4
In this level, we proposed a smartphone application to enable users to share information about the street status and receive alerts about traffic congestion on a street that is part of their route. Screenshots of the application are shown in Figure 10. We validated the working of the application in terms of its supported functionalities.

Conclusions
In this paper, we presented a comprehensive framework that acquires the traffic status of streets from multiple sources to ensure reliability, supports the integration of different technologies for flexibility, and provides algorithms for efficiency for the purpose of smart traffic management. We segregated and mitigated traffic congestion problems at different levels within the framework. We demonstrated that at each level various technologies can be used for addressing the issues relevant to that level. We proposed a smart-traffic-light algorithm in level 1 for the efficient management of congestion at intersections, tweet-classification and image-processing algorithms in level 2 for reliable and accurate decision support, and support services at level 4 of the functional model. We also evaluated the proposed smart-traffic-light algorithm in terms of its efficiency, and the tweet classification and image-processing algorithms for their accuracy. The results show that STL can minimize the average waiting times of vehicles at an intersection when two streets are crowded, and the use of social media and drones to detect traffic congestion can increase the reliability and accuracy of the STL. In our future work, we aim to implement and evaluate the third level of the framework. We also aim to investigate other challenges of the framework, such as cost-benefit analyses, the safety and security of infrastructure, and the privacy of users' data.