Review

A Review of Different Components of the Intelligent Traffic Management System (ITMS)

by Nikhil Nigam *, Dhirendra Pratap Singh and Jaytrilok Choudhary
Computer Science & Engineering Department, Maulana Azad National Institute of Technology, Bhopal 462003, Madhya Pradesh, India
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(3), 583; https://doi.org/10.3390/sym15030583
Submission received: 30 January 2023 / Revised: 9 February 2023 / Accepted: 20 February 2023 / Published: 23 February 2023
(This article belongs to the Section Computer)

Abstract:
Traffic congestion is a serious challenge in urban areas, and the intelligent traffic management system (ITMS) is used to address it by managing traffic on road networks. Managing traffic helps limit environmental impacts and handle emergency situations. However, ITMS faces many challenges in analyzing complex traffic scenes. New technologies such as computer vision (CV) and artificial intelligence (AI) are being used to solve these challenges, and they have earned a distinct identity in the surveillance industry, particularly for keeping a constant eye on traffic scenes. Many vehicle attributes and existing approaches, along with imaging technologies, are used in the development of ITMS. In this paper, we review the components of ITMS, describing existing imaging technologies and approaches in terms of their role in developing ITMS. The first component describes the traffic scene and imaging technologies. The second component covers vehicle attributes and their use in existing vehicle-based approaches. The third component explains vehicle behavior on the basis of the second component's outcome. The fourth component explains how traffic-related software applications can assist in managing and monitoring traffic flow, reducing congestion, and enhancing road safety. The fifth component describes the different types of ITMS applications. The sixth component discusses existing methods for traffic signal control systems (TSCSs). Aside from these components, we also discuss existing vehicle-related tools, such as simulators, that create realistic traffic scenes. In the final discussion section, we consider the future development of ITMS and draw conclusions.
The main objective of this paper is to bring together, in one place, possible solutions to the different problems that arise during the development of ITMS, organized into components that can help an ITMS developer achieve the goal of building an efficient ITMS.

1. Introduction

The rapid pace of urban growth is the primary cause of increasing traffic congestion on city roads. Because of this, vehicles may remain stationary for long periods. Prolonged idling harms the environment through vehicle pollution, which causes respiratory health problems, and delays emergency responses to incidents such as accidents, which can be fatal. Halting development to reduce traffic congestion is not a solution; many factors besides development contribute to congestion. One such factor is the growing number of vehicles, which can be addressed. It is therefore very important to develop an intelligent system that reduces traffic congestion by managing the number of vehicles. Nowadays, various technologies are being developed for this purpose, including the Internet of Things (IoT), machine learning (ML), microcontrollers, wireless sensor networks (WSNs), and fuzzy logic (FL), which are used to better control traffic in complex situations. ITMS has many applications, including environmental impact assessment, electronic toll collection, anomaly detection, illegal activity identification, security monitoring, and traffic signal management systems. These ITMS applications are gradually becoming a necessary part of human life and are effectively improving quality of life. There are many challenges, which are listed and discussed in Table 1; knowledge of these challenges helps in the development of a robust ITMS.
This paper discusses six major components: (I) image acquisition, (II) utilization of static and dynamic attributes, (III) vehicle behavioral understanding, (IV) traffic software applications in ITMS, (V) ITMS applications, and (VI) TSCSs. Figure 1 depicts the flow of the components, and Figure 2 depicts the taxonomy of the article.
  • Image acquisition: A digital image represents features of the vehicle in the form of pixels extracted from a real-time scene. The main aspect of this component is handling the captured vehicles before evaluating them. This component describes how and from where traffic conditions are obtained.
  • Utilization of static and dynamic attributes: This component is responsible for extracting both static and dynamic vehicle attributes from the captured real-time scene. The speed of the vehicle, its trajectory, and its direction are examples of dynamic attributes, since they change over time. Static attributes include features such as the logo, color, type, license plate number, and other details that remain constant over time. This component explains how the extracted attributes are used in developing ITMS.
  • Vehicle behavioral understanding: This component is responsible for providing an understanding of internal and external factors in vehicles via the use of static and dynamic attributes obtained from cameras placed at various points along the road network.
  • Traffic software applications in ITMS: In ITMS, traffic software applications typically use technologies such as global positioning systems (GPS), traffic sensors, and real-time traffic data to collect and analyze information about road traffic conditions. Based on this information, they can offer drivers real-time traffic updates, propose alternate routes, and assist in managing emergencies such as accidents and road closures. In summary, traffic software applications are an essential part of modern transportation systems: they improve the overall performance and sustainability of road transportation while enhancing the safety and convenience of drivers and passengers.
  • ITMS applications: This component provides a brief description of the many ITMS-related applications, including electronic toll collection, environmental impact evaluation, TSCSs, security monitoring, anomaly detection, and illegal activity identification.
  • TSCSs: This component provides a detailed description of the types of TSCSs. These are used to obtain the optimized traffic signal at the intersection, which helps reduce the average waiting time as well as other negative factors such as pollution and noise.
The remaining article is divided into nine sections. The second section explains the image capture of scenes and the imaging technologies used for ITMS. The third section discusses static and dynamic vehicle characteristics, which provide information used to better understand vehicle behavior in ITMS. The fourth section discusses how vehicle behavior is analyzed once the attributes have been extracted. The fifth section covers the real-time applications used in ITMS. The sixth section covers applications of ITMS. The seventh section addresses reducing traffic congestion, delays, and accidents by implementing traffic signal control systems at intersections. The eighth section discusses the simulators that help create a real-time environment for analyzing traffic-based methods. The ninth section discusses areas where researchers can work to develop ITMS. Finally, the tenth section presents the conclusions of the article, in which we make our closing remarks.

2. Image Acquisition Component

Image acquisition is divided into two parts: the first part is traffic scene regions for image acquisition, which discusses the various types of areas from which an image can be taken to monitor traffic; the second part is imaging technologies, which discusses the various types of technologies that can help in capturing traffic scenes along with performing many tasks such as vehicle detection, vehicle tracking, etc.

2.1. Traffic Scene Regions for Image Acquisition

ITMS is primarily used in the management of traffic in four distinct regions of traffic scenes by using imaging technology. The regions of the traffic scene are mentioned below.
  • City tunnel: A city tunnel is a completely enclosed underground passageway with entry and exit points at either end. Inside the tunnel, it is as dark during the day as it is at night, so lighting is a challenge during ITMS development. This is solved by providing extra lighting so that the ITMS can work properly.
  • Highway: A highway is a multi-lane road. On an urban highway, vehicles move fast and therefore appear only briefly in the camera's field of view. Imaging technologies must thus be advanced enough to capture the scene at a high frame rate, which can then be used for vehicle detection, vehicle classification, and vehicle tracking. This speed-capture challenge is addressed by deploying high-frame-rate video cameras in ITMS.
  • Road intersection: An intersection is a point at which two or more roads meet. At an intersection, vehicles may turn left, turn right, or go around, so they appear in many poses, making them challenging to detect and track. Many vehicle detectors and methods, such as “You Only Look Once” (YOLO) [1] and the Kalman filter [2], are used at intersections to handle multiple poses and occluded vehicles so that signal timing can be calculated accurately. The main role of a traffic signal is to provide efficient timing for the vehicles in each lane. Signals can be dynamic or fixed: a dynamic signal sets the time for vehicles to pass through the intersection based on the current number of vehicles, while a fixed signal sets the time based on historical data about the number of vehicles moving through that intersection.
  • Road section: Vehicles move more slowly on road sections during peak hours and, in heavy congestion, may even come to a complete stop, resulting in long vehicle queues. Therefore, appropriate lanes should be planned, supported by a powerful ITMS and a rerouting approach, to avoid long queues.
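Tracking approaches like the Kalman filter cited above for intersections can be illustrated with a minimal constant-velocity filter over a vehicle centroid. This is a generic NumPy sketch, not the configuration of any cited system; the noise parameters are illustrative assumptions.

```python
import numpy as np

# Minimal constant-velocity Kalman filter tracking a vehicle centroid
# (x, y) across frames. Noise settings are illustrative assumptions.
class KalmanTracker:
    def __init__(self, x, y, dt=1.0):
        self.s = np.array([x, y, 0.0, 0.0])       # state: [x, y, vx, vy]
        self.P = np.eye(4) * 100.0                # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt          # constant-velocity model
        self.H = np.zeros((2, 4))
        self.H[0, 0] = self.H[1, 1] = 1.0         # we observe position only
        self.Q = np.eye(4) * 0.01                 # process noise
        self.R = np.eye(2) * 1.0                  # measurement noise

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]

    def update(self, zx, zy):
        z = np.array([zx, zy])
        y = z - self.H @ self.s                   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.s = self.s + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

tracker = KalmanTracker(100.0, 50.0)
for frame in range(1, 6):                         # detections drift +10 px/frame in x
    tracker.predict()
    tracker.update(100.0 + 10.0 * frame, 50.0)
x, y = tracker.predict()                          # predicted next position
```

After a few updates, the filter has learned the vehicle's velocity and can predict its position even when a detection is briefly missing, which is how occlusion gaps at intersections are typically bridged.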

2.2. Imaging Technologies

This section explains various imaging technologies that help collect data from traffic scenes and communicate the collected data to the authorized authorities who manage traffic conditions through better analysis.
In recent years, advancements in imaging technologies have increased the visual quality of captured traffic scenes. As a result, vehicles and other objects are detected more accurately for further analysis. Image sensors are a primary part of developing vision-based surveillance systems for ITMS. Image sensor technologies make use of features of vehicles such as their color, edge, tracklets, and texture in order to detect, track, classify, and identify violations. The following are some examples of image technologies:
  • Sony image sensor: Many Sony image sensors are available. One of them, based on large-format CMOS technology with a global shutter [3], is used in cameras to detect objects efficiently with high image quality at 127.68 megapixels. It maintains frame rates of 12.9 frames per second at 14-bit depth, 19.6 frames per second at 12-bit depth, and 21.8 frames per second at 10-bit depth. These sensors are well suited to applications such as automotive, security, surveillance, and manufacturing.
  • Python XK generation of CMOS image sensors: This sensor family [4] has high bandwidth and therefore provides high-resolution images. It supports both pipelined and triggered global shutter readout modes and reaches an 80 FPS frame rate at 25 megapixels. Because of this, it is used for faster vehicle recording and a more accurate evaluation of vehicle motion, such as how slowly or quickly a vehicle is moving.
  • Hikvision imaging technology: This technology [5] addresses a wide range of surveillance needs, from video monitoring to video content analytics. As a result, it is used mostly in vehicle detection and incident management solutions. Advances in the IoT and video surveillance systems, both widely used in this technology, continue to make it more capable.
  • Citilog company imaging technology: It [6] offers intelligent video-based solutions for increasing the efficiency of traffic flow, managing incidents, and analyzing traffic data. It also provides tools, such as traffic optimization tools and incident management tools, for areas of interest such as smart cities, highways and expressways, tunnels and bridges, and elevated roads.
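The sensor figures quoted above imply substantial raw data rates, which matters when sizing camera links and storage for ITMS deployments. As a rough back-of-envelope check (assuming uncompressed readout, which is an assumption rather than a datasheet figure):

```python
# Back-of-envelope raw data rates for the large-format CMOS sensor
# modes quoted above (127.68 MP), assuming uncompressed readout.
def raw_rate_gbps(megapixels, fps, bits_per_pixel):
    """Raw sensor output in gigabits per second."""
    return megapixels * 1e6 * fps * bits_per_pixel / 1e9

modes = {
    "14-bit @ 12.9 fps": raw_rate_gbps(127.68, 12.9, 14),
    "12-bit @ 19.6 fps": raw_rate_gbps(127.68, 19.6, 12),
    "10-bit @ 21.8 fps": raw_rate_gbps(127.68, 21.8, 10),
}
for mode, gbps in modes.items():
    print(f"{mode}: {gbps:.1f} Gbit/s")
```

All three modes land in the tens of gigabits per second, which is why such sensors require high-bandwidth interfaces and on-camera compression before the video reaches analysis software.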

3. Utilization of Static and Dynamic Attributes

This section describes dynamic and static attributes along with information on how they are being used to help solve traffic-related issues. This section consists of three different approaches: vehicle detection, vehicle tracking, and vehicle recognition, where the attributes are used.

3.1. Vehicle Detection

The detection of vehicles is an important step in ITMS. If vehicle detection were absent from ITMS, it would be unable to operate effectively in speed measurement, vehicle counting, traffic flow forecasting, and vehicle classification. An efficient vehicle detection system is one that can detect vehicles even when they are obscured by obstacles such as bridges, trees, and other objects. To gather traffic data for effective vehicle detection, many detection methods and sensors are used. Accurate vehicle detection is essential for behavior analysis and vehicle tracking, as well as for scheduling traffic signals at intersections.
The detection of vehicles is classified into two distinct categories based on detection approaches, which are as follows:
  • Detection of vehicles based on appearance.
  • Detection of vehicles based on motion.

3.1.1. Detection of Vehicles Based on Appearance

Some features of a vehicle, such as its color, texture, and shape, are examined in order to detect it. Because these features are easily visible, they are referred to as appearance-based features. Here, we discuss different techniques that use these features; they are classified as feature descriptors, classifiers, and 3-D modeling.
  • Feature descriptors: First, the appearance of the vehicle is defined with the help of many different feature descriptors that provide a description of the features by creating a feature vector. These feature descriptors are local image patches, the edge histogram descriptor (EHD), scale-invariant feature transformations (SIFT), the histogram of oriented gradients (HOG), Haar-like features, and local binary patterns (LBP).
The local image patches are collections of pixels in an image. Patches that have a rectangular form hold information about the boundaries required to define the characteristics of the objects [7]. The local image patch feature descriptor is a simple and effective method for encoding the feature vector as the values of patch pixels. This representation, on the other hand, is very sensitive to changes in lighting as well as vehicle size. In spite of this, a local patch is used for accurate classification and regression at nighttime with the help of a nighttime vehicle detection method that integrates the attentive generative adversarial network [8]. Recently, an inventive and extremely effective method for the detection of vehicles in foggy scenes, based on the utilization of image patches with the Swin Transformer, has been developed [9].
EHDs are used to achieve a higher level of spatial invariance as a means of mitigating the effects of lighting conditions as a direct result of local patches that are particularly sensitive to variations in illumination as well as vehicle size. Basically, the edge histogram feature indicates the direction of edges in an image based on brightness changes. In order to detect vehicles for the purpose of tracking them, an edge histogram is utilized for edge processing, and a fixed threshold is applied [10]. In many applications, several feature extractors are involved in the generation of a high-dimensional model. As a result of this, the functioning of the application does not give a proper and good result. To address this problem, EHD and the Color Layout Descriptor [11] were developed as feature descriptors.
One more very popular local feature descriptor is SIFT [12]. This algorithm is invariant to image scaling and rotation, changes in illumination, and affine projection; in other words, such modifications have no impact on the functioning of scale-invariant feature transformations. Earlier surveillance systems faced difficulties such as limited image quality and size, as well as significant intraclass variation. To address these issues, Ma et al. [13] described a modified version of the SIFT descriptor along with a repeatable and discriminative feature based on edge points. It provides a comprehensive representation of vehicle images for reliable identification in congested environments in under two seconds. However, the SIFT feature matching algorithm loses some accuracy when it uses Euclidean distance. To improve matching accuracy, Zhu et al. [14] proposed an improved version of the SIFT algorithm that uses a linear combination of city block distance and chessboard distance instead of Euclidean distance. SIFT is also being used with deep learning: Djenouri et al. [15] proposed a novel deep-learning-based cleaning algorithm in which SIFT acts as an extractor to remove groups of outlier images from the gathered vehicle frames in order to detect vehicles and prevent accidents.
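The distance measures involved in the improved SIFT matcher above can be made concrete. The weights in the linear combination below are illustrative assumptions, not the values used by Zhu et al.:

```python
import numpy as np

# Distance measures discussed for SIFT descriptor matching: Euclidean
# (L2), city block (L1), and chessboard (L-infinity).
def euclidean(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def city_block(a, b):
    return float(np.sum(np.abs(a - b)))

def chessboard(a, b):
    return float(np.max(np.abs(a - b)))

def combined(a, b, w1=0.5, w2=0.5):
    # Linear combination of L1 and L-infinity; the weights w1, w2 are
    # illustrative, not taken from the cited paper.
    return w1 * city_block(a, b) + w2 * chessboard(a, b)

a = np.array([1.0, 2.0, 3.0])   # toy descriptor vectors
b = np.array([4.0, 0.0, 3.0])
```

Because L1 and L-infinity avoid the square root and squaring of L2, their combination can be cheaper to evaluate over many descriptor pairs, which is one motivation for replacing Euclidean distance in matching loops.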
Another feature descriptor is HOG, which counts the frequency of gradient orientation occurrences in defined image regions to assist with vehicle detection. HOG and classifiers improve vehicle detection performance. Lee et al. [16] proposed a method where the support vector machine (SVM) classifier was combined with HOG to accurately detect vehicles. HOG is also used to validate the vehicle that is included or not in the generated hypotheses by extracting features. Ali et al. [17] proposed a system with hypothesis generation and hypothesis verification steps that use HOG, SVM, and a decision tree to detect vehicles. It was observed that there are difficulties detecting vehicles in bad weather. By addressing this issue, Wang et al. [18] proposed a system based on pseudo-visual search and HOG and LBP feature descriptors to detect vehicles with high accuracy. Here, HOG-LBP fusion is used for classification by training the vehicle classifier.
The next feature descriptor is the Haar-like feature, which mainly aids real-time vehicle detection applications. When combined with a neural network such as an artificial neural network (ANN) [19], detection performance becomes high. The Haar-like feature [20] also enables the rapid generation of hypotheses for detecting vehicles using an algorithm known as “gentle adaptive boosting”. The Haar feature produces hypotheses relatively quickly but can identify false vehicle candidates; to filter out false detections, histogram of oriented gradient features are used to train an SVM. Arunmozhi et al. [21] compared HOG, LBP, and Haar-like features and found that HOG features performed better than the other two, with a higher detection rate on the same dataset.
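The idea behind the HOG features discussed above can be sketched as per-cell histograms of gradient orientations. The toy version below omits the block normalization and bin interpolation that full HOG includes, so it is for illustration only:

```python
import numpy as np

# Simplified HOG-style descriptor: per-cell histograms of gradient
# orientations, weighted by gradient magnitude. Unlike full HOG, this
# sketch has no block normalization or orientation interpolation.
def hog_cells(img, cell=8, bins=9):
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]        # central differences
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180    # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

img = np.zeros((16, 16))
img[:, 8:] = 1.0                                  # a vertical intensity edge
f = hog_cells(img)                                # 2x2 cells x 9 bins = 36 values
```

The vertical edge produces horizontal gradients, so all the histogram energy falls into the 0-degree bin of each cell; a classifier such as an SVM is then trained on these concatenated histograms.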
  • Classifier: The classifier is used to identify vehicles based on their features and appearance. An algorithm that transforms input features into a certain category is referred to as a classifier. Depending on their purpose, classifiers are divided into two categories: discriminative and generative. Some models use discriminative classifiers to create boundaries in the data space, whereas others use generative classifiers to represent how data are distributed across the space.
Discriminative classifiers analyze data in order to determine which aspects of the input data are the most significant for classifying objects into distinct categories. It can be accomplished by developing class decision boundaries and learning posterior classification probability, which are applied in the vehicle detection process. Different discriminative classifiers such as boosting, SVM, and deep neural networks (DNNs) are used for vehicle detection.
The SVM is a discriminative classifier used to solve classification and regression problems. SVM attempts to find the “best” margin dividing the classes, which lowers the risk of error on the data. Another significant advantage of SVMs is that they have far fewer tunable parameters, and they are frequently used for vehicle detection. Chen et al. [22] proposed a method for tracking and classifying vehicles using roadside CCTV. This was accomplished by first training an SVM with intensity-based pyramid HOG features generated after background elimination on a sample of vehicle silhouettes, and then identifying foreground blobs using majority voting. The conventional ML approach involved manually extracting features and then passing them to a variety of classifiers, such as HOG plus SVM [23] or Deformable Part Model (DPM) plus SVM [24], to carry out classification. Performance can vary depending on the environment and the features used, so picking the features best suited to characterize the objects is very difficult. Deep learning, based on convolutional neural networks (CNNs), provides a unique solution to this issue, since it extracts features automatically and optimally through learning. By combining the advantages of CNN and SVM, Karungaru et al. [25] proposed a model that further increases network accuracy by lowering the likelihood of overfitting and boosting the model's capacity for generalization.
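The margin-seeking behavior of a linear SVM can be sketched with a Pegasos-style sub-gradient trainer on toy 2-D data. This is a minimal illustration under synthetic data assumptions, not a production SVM and not the pipeline of any cited work:

```python
import numpy as np

# Toy linear SVM trained with Pegasos-style sub-gradient descent on
# linearly separable 2-D data (two synthetic clusters standing in for
# "vehicle" vs. "background" feature vectors). Illustrative only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)),   # class -1
               rng.normal(+2.0, 0.5, (50, 2))])  # class +1
y = np.array([-1] * 50 + [+1] * 50)

w = np.zeros(2)
lam = 0.01                                       # regularization strength
for t in range(1, 2001):
    i = int(rng.integers(len(X)))                # pick a random example
    eta = 1.0 / (lam * t)                        # decaying step size
    if y[i] * (X[i] @ w) < 1:                    # hinge loss is active
        w = (1 - eta * lam) * w + eta * y[i] * X[i]
    else:
        w = (1 - eta * lam) * w                  # shrink toward the margin

pred = np.sign(X @ w)
acc = float((pred == y).mean())
```

The regularized updates keep the weight vector small while pushing misclassified or margin-violating points to the correct side, which is exactly the "best margin" trade-off described above.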
Boosting is an ensemble learning approach that a discriminative classifier can use to reduce training errors and achieve high accuracy. The fundamental strategy is to repeatedly run a weak learning algorithm on different distributions of the examples in order to produce different hypotheses. This method helps reduce the high bias that is characteristic of some ML models. Using this strategy, Y. Freund [26] proposed a method for enhancing the efficacy of binary concept learning systems. Adaptive boosting is an ensemble learning method that is widely used in classification applications. When deciding which features are most useful, traditional adaptive boosting algorithms often ignore sample weights and the fact that weak classifiers frequently exhibit inconsistent performance across categories; the weighted feature selection and category classification confidence-based adaptive boosting algorithm is suggested in [27] as a result of this reasoning. A novel ensemble approach has been proposed by Chen et al. [28] that emphasizes the correlation between multiple learning algorithms and variable data distributions, in contrast to the conventional majority voting technique typically used to enhance prediction stability. Unlike bagging, boosting, and random-forest algorithms, which rely on weak learners to improve prediction accuracy, this new approach aims for superior performance. The next generation of autonomous vehicle detection systems has two key requirements: quick processing and precise detection. To meet the goals of autonomous driving, Haar-like features and histogram of oriented gradient features are combined. Based on both features, Alam et al. [20] proposed a new computer vision-based, cost-effective vehicle detection system that uses a gentle adaptive boosting algorithm; the system is trained to construct the vehicle hypothesis using Haar-like features.
Despite providing hypotheses quickly, the Haar-like feature may produce false vehicle candidates. To eliminate the false hypotheses produced in this system, the SVM is trained using histogram of oriented gradient features.
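The boosting loop described above can be sketched with decision stumps as the weak learners. The 1-D data and thresholds below are illustrative assumptions, not a cited experiment:

```python
import numpy as np

# AdaBoost with 1-D decision stumps: weak learners are re-trained on
# re-weighted examples, then combined by a weighted vote. Data are toy.
def train_adaboost(X, y, rounds=10):
    n = len(X)
    w = np.full(n, 1.0 / n)                      # example weights
    learners = []                                # (threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for thr in np.unique(X):                 # exhaustive stump search
            for pol in (+1, -1):
                pred = pol * np.where(X < thr, -1, 1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # learner weight
        pred = pol * np.where(X < thr, -1, 1)
        w *= np.exp(-alpha * y * pred)           # up-weight mistakes
        w /= w.sum()
        learners.append((thr, pol, alpha))
    return learners

def predict(learners, X):
    score = sum(alpha * pol * np.where(X < thr, -1, 1)
                for thr, pol, alpha in learners)
    return np.sign(score)

X = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1, 1, 1])
model = train_adaboost(X, y)
acc = float((predict(model, X) == y).mean())
```

Each round focuses the next weak learner on the examples the previous ones got wrong; the final strong classifier is the alpha-weighted vote of all stumps.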
Recent research has shown that techniques based on deep learning are superior to those that were used in the past, especially for CV and scene understanding tasks [29]. Deep learning-based approaches produced greater feature representation than hand-crafted features and lower processing times than the sliding window-based methods [30]. This was accomplished through the use of CNNs. Object detectors that are based on CNN may often be broken down into one of two categories: two-step and one-step. Two-step detectors, including Regions with Convolutional Neural Networks (R-CNNs) [31], Fast Regions with Convolutional Neural Networks (Fast R-CNNs) [32], Faster Regions with Convolutional Neural Networks (Faster R-CNN) [33], and Mask Regions with Convolutional Neural Networks (Mask R-CNN) [34], make use of region suggestions in order to finish the object location regression and classification procedures in two steps. On the other hand, one-step detectors predict object positions and classes concurrently in a single network. Examples of such detectors are YOLO v3 [35] and the single-shot multibox detector [36]. Therefore, detectors with just one step may have a quicker detection rate than detectors with two steps [37]. For analysis of the most recent developments in the field of deep learning algorithms for the identification of vehicles, Wang et al. [38] presented a comparative study where, on the KITTI data, deep learning algorithms such as faster R-CNN, region-based fully convolutional networks (R-FCNs), single-shot detectors (SSDs), Retina Net, and YOLOv3 were applied. 3D depth maps also play a significant role in the identification of vehicles. So, they are becoming more popular. Javadi et al. [39] proposed a method based on 3D depth maps along with DNNs to detect vehicles.
The next type of classifier is the generative classifier. It learns a model of how the data are generated and uses that model to determine the class of a new observation. The Markov Random Field (MRF) and the Gaussian Mixture Model (GMM) are both popular generative classifiers; others include deformable part-based models (DPMs), hidden Markov models (HMMs), and active basis models (ABMs). These classifiers support crucial traffic monitoring and management tasks such as detection and tracking.
MRFs can describe independence assumptions compactly in cases where directed models cannot. Because of their capacity to combine neighboring information and make local decisions, MRFs have found widespread application in image processing, namely for denoising, restoring, and segmenting images, and they enable accurate tracking even amid obstacles and traffic jams. In object recognition, HOG-based techniques have previously established their superiority. However, edge-based detection approaches (such as HOG) may produce a high number of false alarms when the object is relatively small against a complex background, such as an aerial view of a vehicle in images from an unmanned aerial system. To solve this problem, Madhogaria et al. [40] combined HOG and SVM in a pipeline technique, where the first step tunes the discriminative classifier's decision parameters to obtain high recall at the cost of a relatively high false detection rate, and the second step (a causal MRF) eliminates the majority of the false positives. D. Ashwini [41] presented a model based on an extended version of MRF, named the Spatial Temporal Markov Random Field (ST-MRF) model, that works inside a compressed region and tracks motion vectors (MVs) and block coding modes (BCMs) originating from a compressed bitstream. It employs the ST-MRF model to enhance the precision of target acquisition and continuous tracking.
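The role of an MRF as a neighborhood-aware smoother can be illustrated with iterated conditional modes (ICM) on an Ising-style prior, which removes an isolated false positive from a binary detection mask. The energy and weights below are toy assumptions, not the formulation of Madhogaria et al.:

```python
import numpy as np

# Iterated conditional modes (ICM) on an Ising-style MRF prior: each
# pixel label is flipped to agree with its observation and its four
# neighbors, cleaning isolated false positives from a detection mask.
def icm_denoise(obs, beta=1.5, iters=5):
    """obs: map of {-1, +1}; beta: neighbor-agreement weight (toy value)."""
    lab = obs.copy()
    h, w = obs.shape
    for _ in range(iters):
        for i in range(h):
            for j in range(w):
                nb = sum(lab[a, b]
                         for a, b in ((i - 1, j), (i + 1, j),
                                      (i, j - 1), (i, j + 1))
                         if 0 <= a < h and 0 <= b < w)
                # local energy favors agreeing with observation and neighbors
                lab[i, j] = 1 if obs[i, j] + beta * nb > 0 else -1
    return lab

clean = -np.ones((8, 8), dtype=int)
clean[2:6, 2:6] = 1                              # one "vehicle" blob
noisy = clean.copy()
noisy[0, 7] = 1                                  # isolated false positive
out = icm_denoise(noisy)
```

The lone positive pixel has no supporting neighbors, so the neighborhood term outvotes its observation and it is relabeled as background, while the coherent blob survives; this mirrors the false-positive pruning role of the causal MRF step described above.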
A hidden Markov model (HMM) is a kind of generative classifier in which the distribution that produces an observation depends on the state of an underlying, unobserved Markov process. HMMs are used for the detection and counting of vehicles. Using HMM features for vehicle detection, Yin et al. [42] proposed a technique in which they first extracted features from input images using Principal Component Analysis and Multiple Discriminant Analysis and then used an HMM to classify each image into one of three categories, or states (road, head, or body); vehicles are then detected from the collected state sequences. Accurate traffic measurements, such as per-lane vehicle counts, are needed to ease congestion and enhance traffic safety, yet most current infrastructure, including loop detectors and many video detectors, cannot provide precise vehicle counts. Miller et al. [43] therefore suggested a unique HMM-based approach for detecting and counting vehicles. Tao et al. [44] proposed an automatic smoke-vehicle identification approach based on multi-feature fusion and HMMs, where HMMs characterize and categorize the smoke-colored block sequences and area sequences found in continuous frames.
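The state-sequence idea behind the HMM-based detector can be illustrated with Viterbi decoding over the three states named above (road, head, body). All probabilities below are made-up illustrative numbers, not values from the cited work:

```python
import numpy as np

# Viterbi decoding for a 3-state HMM over image observations, mirroring
# the road/head/body labeling idea. All probabilities are illustrative.
states = ["road", "head", "body"]
start = np.log(np.array([0.8, 0.1, 0.1]))
trans = np.log(np.array([[0.7, 0.2, 0.1],        # road -> road/head/body
                         [0.1, 0.3, 0.6],        # head -> ...
                         [0.2, 0.1, 0.7]]))      # body -> ...
emit = np.log(np.array([[0.8, 0.1, 0.1],         # P(obs symbol | road)
                        [0.1, 0.7, 0.2],         # P(obs symbol | head)
                        [0.2, 0.2, 0.6]]))       # P(obs symbol | body)

def viterbi(obs):
    v = start + emit[:, obs[0]]                  # log-prob of best path so far
    back = []
    for o in obs[1:]:
        scores = v[:, None] + trans              # every state-to-state move
        back.append(scores.argmax(axis=0))       # best predecessor per state
        v = scores.max(axis=0) + emit[:, o]
    path = [int(v.argmax())]
    for bp in reversed(back):                    # trace predecessors backward
        path.append(int(bp[path[-1]]))
    return path[::-1]

obs = [0, 0, 1, 2, 2]                            # road-like, head-like, body-like symbols
path = [states[s] for s in viterbi(obs)]
```

Decoding recovers the most likely hidden state sequence; runs of "head" followed by "body" states in a scan of the image are then counted as vehicles.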
A GMM uses a probabilistic approach to represent normally distributed subpopulations within a larger population. GMMs were used in [45] to automatically detect vehicles based on local characteristics within the bounds of three primary image subregions. Indrabayu et al. [2] also proposed a method based on the GMM and the Kalman filter for vehicle detection and tracking, where the GMM was used for detection and the Kalman filter for tracking. The GMM struggles with fluctuations in lighting, background clutter, occlusion, and so on, and vehicle identification remains a significant obstacle in computer vision. To address these issues, P. Jagannathan et al. [46] proposed a method based on the GMM and an ensemble deep learning technique for the detection and classification of moving vehicles. Detecting vehicles both efficiently and accurately is one of the difficult problems of complex urban traffic monitoring during the day and at night. In consideration of this, Song [47] presented a new method for detecting vehicles that uses spatial relationship modeling (GMM) during both day and night and is based on a high-resolution camera. For vehicle detection in complex urban traffic scenes, the GMM with confidence measurement [48] has been proposed as a solution to the problem that the background subtraction model is easily contaminated by vehicles that move slowly or stop temporarily.
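A simplified stand-in for the per-pixel GMM background subtraction discussed above is a running single-Gaussian model per pixel. This sketch illustrates the mechanism only; real systems keep several Gaussian modes per pixel, and all parameters here are illustrative assumptions:

```python
import numpy as np

# Running single-Gaussian background model: each pixel keeps a mean and
# variance; pixels far from their model become "foreground" (vehicle).
# A simplified stand-in for per-pixel GMM background subtraction.
class BackgroundModel:
    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mu = first_frame.astype(float)
        self.var = np.full(first_frame.shape, 25.0)
        self.alpha, self.k = alpha, k            # learning rate, match threshold

    def apply(self, frame):
        frame = frame.astype(float)
        d2 = (frame - self.mu) ** 2
        fg = d2 > (self.k ** 2) * self.var       # foreground mask
        upd = ~fg                                # update only background pixels,
        self.mu[upd] += self.alpha * (frame - self.mu)[upd]   # so stopped
        self.var[upd] += self.alpha * (d2 - self.var)[upd]    # vehicles are
        return fg                                # not absorbed immediately

bg = np.full((10, 10), 50.0)                     # flat gray background
model = BackgroundModel(bg)
for _ in range(20):                              # let the model settle on noise
    model.apply(bg + np.random.default_rng(0).normal(0, 1, bg.shape))
frame = bg.copy()
frame[4:7, 4:7] = 200.0                          # a bright "vehicle" appears
mask = model.apply(frame)
```

Updating only matched pixels is a crude version of the selective-update idea behind the confidence-measurement variant above: it delays slow or temporarily stopped vehicles from being absorbed into the background model.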
The active basis model (ABM) is a model for detecting and identifying objects that is composed of a limited number of Gabor wavelet elements positioned at predetermined places and orientations. Yao et al. [49] employed deformable ABMs learned from a series of training images to identify vehicles in novel images by template matching. Throughout the learning, detection, and categorization process, they used a guiding framework based on the active basis model [50], adjusting it by using the edge and color symmetry properties of the vehicle in the template matching algorithms so as to increase the performance of vehicle recognition.
Part-Based Model: This model is utilized in vehicle-related applications and is based on breaking an object down into its component parts and analyzing the spatial connections between those parts. Recognizing vehicles at a finer granularity level is difficult due to the large number of subclasses and the small distance between classes. To address this challenge, Zhou et al. [51] recommended a part-based model for fine-grained vehicle identification in a weakly supervised manner, employing saliency maps, which are simple to create in a single pass of backpropagation and can be used to locate the discriminative areas of the image. The majority of systems only attempt to identify vehicles that are visible in a single view, and their effectiveness is readily hindered by partial occlusion. A newly designed multi-view vehicle detection system [52] has been developed that uses a part model to address partial occlusion as well as the high variance that exists between different kinds of vehicles.
3-D modeling: Recently, technologies such as Faster R-CNN have driven remarkable growth in 2D object detection, resulting in numerous new techniques and ideas. However, standard 2D detection cannot provide all the information required to perceive the surroundings in application scenarios such as drones, robots, and augmented reality. Technology for 3D target detection is undergoing rapid development, and the major sensors used are the monocular camera, the binocular camera, and multi-line LiDAR. Apart from these sensors, there are also substitutes, such as 3-D modeling. Many researchers have worked on vehicle detection, which is essential for safety in autonomous driving systems because it gives drivers or an intelligent control system reliable relative location and speed information to prevent collisions. 3D object detection algorithms can reveal more specific information about objects, such as their size, position relative to other objects, and direction. These data are essential for intelligent driving features such as behavior analysis, route planning, navigational control, and collision avoidance. 3D object detection techniques are divided into three categories: approaches based on images, point clouds, and fusion.
Image-based approaches perform 2D detection on the image plane before extrapolating the results to 3D space using bounding boxes, regression, or reprojection constraints. Many researchers have developed models along these lines. Chen et al. [53] proposed CNNs that use context and depth information to jointly regress to 3D bounding box coordinates. For many-task vehicle analysis from a single image, Chabot et al. [54] introduced a novel method named Deep MANTA (Deep Many-Tasks), a reliable convolutional network that simultaneously recognizes vehicles, locates parts, characterizes visibility, and estimates 3D dimensions.
The point-cloud-based approaches developed so far can be divided into three subcategories: projection-based, voxel-based representation, and raw point cloud techniques. Projection-based approaches typically detect objects by first projecting a raw point cloud into fictitious images and then applying a 2D detection framework extended to regress 3D bounding boxes. Simon et al. [55] suggested a specific Euler-Region-Proposal Network that estimates the pose of the object by adding an imaginary and a real fraction to the regression network, producing a 3D bounding box regression and an estimate of the object's orientation. These two fractions are added to the state-of-the-art real-time 3D object detection network known as Complex-YOLO, which is designed exclusively for point clouds. In voxel representation techniques, the raw point cloud is typically divided into voxels of varying sizes; features are gathered from each voxel to create a pseudo image, and a 2D convolution backbone with a 3D detection head then outputs the detection results. Some approaches conduct a 3D convolution operation to detect objects directly in 3D space: the point clouds are represented as a binary volumetric grid in a 3D fully convolutional network [56], which then predicts the 3D bounding boxes and the orientation estimates of the vehicles. The majority of researchers convert these kinds of data into standard three-dimensional voxel grids or collections of images; however, this makes the data needlessly bulky, which in turn leads to problems. To address this issue, raw point cloud techniques take the raw point cloud as their input and predict the 3D bounding box regression and the orientation estimate of the object, ensuring that no information is lost in the process.
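The voxel-representation step described above, binning raw points into a grid to form a pseudo image that a 2D backbone could consume, can be sketched as follows. The grid extent, cell count, and the 2-D slice are arbitrary illustrative choices, not those of any cited network:

```python
import numpy as np

def voxelize(points, extent=10.0, cells=5):
    """points: (N, 2) array of x, y coordinates in [0, extent).
    Returns a (cells, cells) grid of point counts per cell (a pseudo image)."""
    idx = np.floor(np.asarray(points) / extent * cells).astype(int)
    idx = np.clip(idx, 0, cells - 1)        # keep boundary points inside the grid
    grid = np.zeros((cells, cells), dtype=int)
    for x, y in idx:
        grid[y, x] += 1                     # row = y bin, column = x bin
    return grid

pts = np.array([[0.5, 0.5], [1.9, 0.1], [9.0, 9.5], [9.9, 9.9]])
pseudo_image = voxelize(pts)
```

Real voxel pipelines store learned per-voxel features rather than raw counts, but the spatial discretization is the same.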
Using raw point cloud techniques, Qi et al. [57] developed PointNet, a novel type of neural network that directly consumes point clouds and appropriately respects the permutation invariance of the input points. However, PointNet's inability to discern fine-grained patterns and generalize to complicated scenes stems from the fact that it is not designed to capture local structures created by the metric space in which the points live. By applying PointNet recursively on a nested partitioning of the input point set, Qi et al. [58] developed a hierarchical neural network that addresses this limitation.
Images and point clouds are combined in fusion-based approaches, enabling interaction and complementarity between modalities. A region-proposal network is typically used in such architectures to produce reliable proposals from each feature view. Early fusion, late fusion, and deep fusion are the three types of further fusion performed on the respective features.

3.1.2. Detection of Vehicles Based on Motion

The installation of video surveillance cameras on highways and road crossings helps to capture events such as vehicle accidents, traffic jams, near misses, lane crossings, and unexpected halts. In fields such as computer vision, motion detection is an essential component for identifying moving vehicles against a still background. Temporal frame differencing, background subtraction, and optical flow are the three approaches used to identify motion.
The frame differencing method produces difference images by subtracting two or three neighboring frames of a time-series image sequence; a threshold is then applied to obtain moving-target information. This method is straightforward and simple to use, but it is difficult to obtain a moving target's full contour, and the method easily produces the "double", "gap", and "hole" phenomena inside the target, which results in false target information. Using three-frame differencing, Srivastav et al. [59] proposed methods capable of reducing the hole issue in videos with dynamic backgrounds that are constantly updated, so that objects are identified more accurately when the dynamic background changes. In order to suppress unrelated motion, Saur et al. [60] developed an efficient image differencing technique that makes use of the smallest variations between individual pixels in small neighborhoods. When Keck et al. [61] used a box filter to locally average the difference image and then subtracted the filtered value from the original difference image, they were able to somewhat reduce the number of false positive detections in aerial photography; however, noise and illumination changes still caused multiple false alarms. Because of this, frame-difference techniques are applied in a somewhat varying selection of situations, and Sommer et al. [62] provided an overview of frame-difference approaches across such situations.
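The three-frame differencing idea can be sketched with synthetic frames: AND-ing two consecutive thresholded difference masks suppresses the "double" artifact that simple two-frame differencing produces. Frames and threshold here are illustrative:

```python
import numpy as np

def three_frame_diff(f1, f2, f3, thresh=20):
    """Return the motion mask for the middle frame f2."""
    d1 = np.abs(f2.astype(int) - f1.astype(int)) > thresh
    d2 = np.abs(f3.astype(int) - f2.astype(int)) > thresh
    return d1 & d2   # AND keeps only pixels that changed in both differences

# A bright 1-pixel "vehicle" moving right across a dark background
f1 = np.zeros((1, 5), dtype=np.uint8); f1[0, 0] = 255
f2 = np.zeros((1, 5), dtype=np.uint8); f2[0, 1] = 255
f3 = np.zeros((1, 5), dtype=np.uint8); f3[0, 2] = 255
mask = three_frame_diff(f1, f2, f3)
```

Note that the two-frame mask `d1` alone would fire at both the old and new positions (the "double"), while the AND-ed mask fires only at the vehicle's current position.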
The background subtraction technique is the next technique based on the motion feature. It is a simple strategy that is easy to apply and operates extremely well in real time. During background subtraction, the current frame of the video is subtracted from a reference background frame to extract foreground objects, with the background image and the current image compared pixel by pixel [63]. Background subtraction can be quite sensitive in conditions such as fog. Siddharth et al. [64] proposed a lightweight background subtraction approach that identifies the frames containing motion and discards the remaining frames in order to reduce both the computational costs incurred by cloud customers and the amount of storage space they need. Many different background subtraction algorithms can be used, such as pixel-based adaptive segmentation, the Haar cascade classifier, or a Gaussian mixture. Azeez et al. [65] provided an overview of recent developments in background subtraction-based object detection techniques.
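A minimal running-average background subtraction sketch follows: the background model is an exponential moving average of past frames, and foreground pixels are those deviating from it by more than a threshold. The learning rate and threshold are illustrative and not taken from any cited method:

```python
import numpy as np

class BackgroundSubtractor:
    def __init__(self, first_frame, alpha=0.05, thresh=30):
        self.bg = first_frame.astype(float)   # initial background model
        self.alpha, self.thresh = alpha, thresh

    def apply(self, frame):
        """Return a boolean foreground mask and update the background model."""
        fg = np.abs(frame.astype(float) - self.bg) > self.thresh
        self.bg = (1 - self.alpha) * self.bg + self.alpha * frame
        return fg

bg_frame = np.full((2, 2), 50, dtype=np.uint8)    # empty road
sub = BackgroundSubtractor(bg_frame)
frame = bg_frame.copy(); frame[0, 0] = 200        # a "vehicle" appears
fg_mask = sub.apply(frame)
```

The slow update rate lets the model absorb gradual illumination changes while still flagging fast-moving objects, which is the trade-off all adaptive background models tune.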
The optical flow method also depends on motion. The term "optical flow" refers to the apparent motion velocity of the individual pixels that make up moving objects in a video. Optical flow assumes that each pixel from the preceding frame has moved to a corresponding location in the current frame of the image sequence, which means that a velocity vector is associated with every pixel in every frame. In a perfect scenario, the background would remain consistent at all times: the optical flow of its pixels is zero, and the portion containing pixels whose optical flow is not zero is the moving target to be located. The optical flow approach is very effective in locating and evaluating moving objects [66]. However, because of the shadows created in complicated natural environments, the conventional optical flow approach is unable to precisely determine the boundary of a moving vehicle. Additionally, typical vehicle tracking algorithms are frequently obscured by obstructions such as trees and buildings, and particle filters are vulnerable to particle degeneracy. Sun et al. [67] suggested a method of moving-vehicle detection and tracking based on the immune particle filter algorithm and the optical flow approach to address these issues.
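The velocity-vector idea can be illustrated with a single-window Lucas-Kanade sketch on synthetic frames. Under brightness constancy, Ix·u + Iy·v + It = 0 holds per pixel, and (u, v) is recovered by least squares over the window. The frames below are a Gaussian blob shifted one pixel along x, so the estimated flow should be close to (1, 0); this is a simplified illustration, not a production flow estimator:

```python
import numpy as np

# Synthetic pair of frames: a smooth blob translated right by 1 pixel.
y, x = np.mgrid[0:64, 0:64]
blob = np.exp(-((x - 30) ** 2 + (y - 32) ** 2) / 50.0)
frame1 = blob
frame2 = np.roll(blob, 1, axis=1)           # move blob right by 1 pixel

Iy, Ix = np.gradient(frame1)                # gradients along rows (y), cols (x)
It = frame2 - frame1                        # temporal derivative
A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
b = -It.ravel()                             # Ix*u + Iy*v = -It
(u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
```

Real dense-flow methods solve this system per local window (or regularize globally) rather than over the whole image, but the brightness-constancy equation is the same.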

3.2. Vehicle Tracking

One of the most important and active fields of research in CV is multi-object tracking. It finds considerable application in robotic vision, surveillance systems, and other commercial applications, such as the synthesis of surveillance video synopses. Luo et al. [68] provided details regarding multi-object tracking. Single objects have been tracked using probabilistic trackers such as the Kalman filter and the particle filter, as well as metaheuristic and deterministic trackers. The Kalman filter and the particle filter are two of the most popular tracking techniques.
The Kalman filter significantly improves the accuracy and reliability of tracking when vehicle motion is blocked by other objects, which can otherwise result in tracking failure [69]. It is a form of linear quadratic estimation that estimates the state of a linear system from a set of measurements taken over time, enabling more efficient and stable tracking. The use of sensors to locate objects and determine distances does not always provide correct results; this deficiency can be compensated for by sensor fusion, which lowers the error rate. Building on sensor fusion, Kim et al. [70] developed a strategy that makes use of an Extended Kalman Filter and takes into account the methodologies that lidar and radar sensors use to compute distance. The distance-dependent properties of the lidar and radar sensors were analyzed, and a reliability function was developed in order to incorporate those properties into the Kalman filter. A number of issues often arise during tracking, including the fact that sensors do not always provide correct readings and that the motion model of the target may shift as a result of changes in the environment or interactions with other targets. Using the concept of interactions, Khalkhali et al. [71] developed an approach known as the interactive Kalman filter.
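A minimal constant-velocity Kalman filter for 1-D vehicle position tracking illustrates the predict/update cycle described above. The state is [position, velocity], only position is measured, and the noise levels are illustrative rather than tuned for any real sensor:

```python
import numpy as np

dt = 1.0
F = np.array([[1, dt], [0, 1]])     # constant-velocity state transition
H = np.array([[1.0, 0.0]])          # we observe position only
Q = 0.01 * np.eye(2)                # process noise covariance
R = np.array([[1.0]])               # measurement noise covariance

x = np.array([[0.0], [0.0]])        # initial state estimate [pos, vel]
P = np.eye(2) * 10.0                # initial (uncertain) covariance

# Noisy position readings of a car moving at roughly 1 m/s
for z in [1.0, 2.1, 2.9, 4.2, 5.0]:
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update with the new measurement
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    x = x + K @ (np.array([[z]]) - H @ x)
    P = (np.eye(2) - K @ H) @ P

position, velocity = float(x[0, 0]), float(x[1, 0])
```

Even though velocity is never measured, the filter infers it from the position sequence, which is exactly what makes it useful for bridging short occlusions.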
A particle filter's structure is based on the Bayesian formulation, which acts as its foundation. A stochastic motion model is utilized in this formulation to estimate the states at the subsequent time instant, and samples are iterated through time to maintain multiple hypotheses. Keeping track of several hypotheses allows the tracker to deal with background clutter and partial and complete occlusions, and to recover from failure or momentary distraction. Particle filters have therefore seen wide application in tracking systems because they can manage non-linear target motion and may be utilized with a variety of object models. A novel and efficient approach to tracking multiple vehicles is proposed by Abdelali et al. [72] based on deep learning and particle filtering algorithms; the primary emphasis is on establishing effective connections between vehicles for online and real-time applications. Some models achieve accurate tracking of vehicles by using a strategy that incorporates individual parts with a particle filter to deal with occlusion and changes in aspect ratio. The majority of systems in use today only detect vehicles using a bounding box representation and do not provide vehicle position information, even though location information is valuable for certain real-time applications, such as the motion estimation and trajectory of moving vehicles on the road. The multi-type and multiple vehicles in an input video are detected using the enhanced YOLOv3 and improved visual background extractor techniques that Sudha et al. [73] proposed; tracking entails locating the traces of approaching vehicles using a Kalman filtering algorithm and particle filtering techniques. They also suggested a technique called the "multiple vehicle tracking algorithm" to further improve the tracking outcomes.
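The bootstrap particle filter loop sketched above (predict with a stochastic motion model, weight by measurement likelihood, resample) can be written compactly for 1-D position tracking. All noise levels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles = 2000
# Initial belief: the vehicle is somewhere on a 10 m road segment.
particles = rng.uniform(0, 10, n_particles)

def step(particles, z, motion_std=0.5, meas_std=1.0):
    """One predict-weight-resample cycle for measurement z."""
    # predict: random-walk motion model
    particles = particles + rng.normal(0, motion_std, particles.size)
    # weight: Gaussian measurement likelihood
    w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
    w /= w.sum()
    # resample: draw particles proportionally to their weights
    idx = rng.choice(particles.size, particles.size, p=w)
    return particles[idx]

for z in [2.0, 2.5, 3.1, 3.4]:      # noisy position measurements
    particles = step(particles, z)

estimate = particles.mean()          # posterior mean position
```

Because each particle is a full hypothesis, the same loop handles multi-modal beliefs (e.g. during occlusion) that a single-Gaussian Kalman filter cannot represent.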

3.3. Vehicle Recognition

The process of identifying the types of vehicles that are present on the road is referred to as “vehicle recognition”. The following section discusses the numerous vehicle recognition-based techniques that make use of vehicle color, vehicle logo, vehicle license plate numbers, vehicle shape, and appearance.

3.3.1. Vehicular License Plate Recognition

The framework of vehicular license plate recognition has become an essential method for traffic applications including monitoring of parking lot access, surveillance of vehicles, automatic collection of vehicle tolls, monitoring of road traffic, enforcement of vehicular law, calculation of traffic volume, analysis of vehicle activity, tracking of vehicles, and the pursuit of criminals. This recognition relies on a number of different methods, including vehicular plate detection, character segmentation, and character recognition.
The accuracy of a vehicle license plate recognition system is directly correlated with the performance of the vehicle plate detection step. Existing detection approaches are classified based on attributes such as texture, edge, color, etc. The color of the vehicle's license plate is regarded as one of its crucial qualities because different states, provinces, or nations have different standards for license plate colors [74]. The texture of a vehicle's license plate refers to how the colors of the background and the characters vary across the plate. Because color characteristics are extremely sensitive to illumination and noise, Shemarry et al. [75] proposed an approach for detecting vehicle license plates in low-quality images that uses attributes based on the underlying texture rather than color; it utilizes a large number of adaptive boosting cascades with three steps of pre-processing LBP classifiers. A variety of methods employ edge features to obtain the proper positioning of the vehicle's registration plate: some are based on the enhanced Prewitt arithmetic operator [76], and others on the Sobel operator [77].
Character segmentation-based techniques locate the positions of the characters in an image to determine the likely plate area: the image is first searched for characters, and if any are found, the area that matches those characters is identified as the most likely plate region. Various ways of segmenting each character have been presented after plate localization; however, some of them have issues with deteriorated vehicle license plates, complex backgrounds, and skewed plates. Gao [78] suggested a segmentation approach to handle the problem of badly degraded vehicle license plates, in which the preprocessing employed color collocation and dimension, improving the accuracy of segmentation and localization. Wu et al. [79] proposed a system that depends mainly on the completeness of a character; it is therefore tolerant of complex backgrounds and skewed plates, and its processing time is fast enough for various applications. Character segmentation and license plate localization are both important components of a license plate recognition system: in [80], license plate localization and character segmentation based on hybrid binarization techniques and feedback self-learning are used to localize Taiwanese vehicle registration plates. The sliding-window bounding box is also a solution for plate segmentation; Musaddid et al. [81] developed a technique using a CNN and a sliding-window bounding box to separate the characters of an Indonesian license plate.
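A simple vertical-projection sketch illustrates the character segmentation idea on a synthetic binarized plate: columns with no foreground pixels separate character candidates. Real systems must additionally handle skew, noise, and touching characters, which this toy example ignores:

```python
import numpy as np

def segment_columns(binary):
    """Return (start, end) column ranges of character candidates in a
    binary image (1 = character pixel, 0 = background)."""
    proj = binary.sum(axis=0)            # foreground pixel count per column
    segments, start = [], None
    for col, count in enumerate(proj):
        if count > 0 and start is None:  # entering a character run
            start = col
        elif count == 0 and start is not None:  # leaving a character run
            segments.append((start, col))
            start = None
    if start is not None:                # character touching the right edge
        segments.append((start, len(proj)))
    return segments

plate = np.zeros((5, 10), dtype=int)
plate[:, 1:3] = 1                        # first "character"
plate[:, 5:8] = 1                        # second "character"
chars = segment_columns(plate)
```

Each returned range can then be cropped and passed to a character recognizer.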
Character recognition is a technique that transforms scanned images of handwritten or printed text into machine-readable text. CNNs, K-means, and DNNs are some of the classifiers that may be used to recognize characters. Bismantoko et al. [82] proposed a method based on a CNN and image enhancement for ALPR character identification. Ariff et al. [83] proposed a method for segmenting license plate images that incorporated an enhanced version of K-means, fast K-means clustering; when K-means and fuzzy C-means are compared, fast K-means is shown to be a far more effective, quick, and practical method. Li et al. [84] proposed a DNN-based approach in which a jointly trained network performs simultaneous license plate localization and character recognition in a single forward pass, achieving both goals with a single network. The whole network can be trained end-to-end without any intermediate processing such as character separation or image cropping.

3.3.2. Vehicle Shape and Appearance

Vehicle shape and appearance are crucial vehicle characteristics for vehicle recognition. Accurate detection and recognition of vehicles could help traffic control authorities identify prohibited vehicles during traffic monitoring. The same shape and appearance of a vehicle might be erroneously classified into several categories in traffic surveillance videos due to complicated backgrounds, illumination variations, varying road conditions, and varied camera perspectives.
There are methods for recognizing vehicles based on their shapes, such as their longitudinal length [85] and three-dimensional curve probes [86]. Vehicle recognition also relies on shape templates: Tan et al. [87] proposed a template-recognition algorithm to create a method that can identify cars seen from the side, with the model created by combining information from various car shapes.
Vehicles are also recognized using appearance-based techniques such as edge, corner, and gradient characteristics. Petrovic et al. [88] showed that a relatively modest collection of characteristics taken from frontal images of a car may be employed to provide high-performance vehicle type recognition and verification (both class and car model) for traffic monitoring applications and secure access; the system has also been demonstrated to be reliable in a variety of weather and lighting circumstances. Lu et al. [89] proposed a novel vehicle make and model recognition system based on images of the vehicles' front ends. The vehicle make and model recognition system comprises two subsystems that can function independently of one another: the training subsystem and the testing and classification subsystem. Various vehicle make and model recognition systems based on feature detection algorithms have the potential to perform less well in conditions of poor illumination and/or occlusion. Zafar et al. [90] proposed a real-time, appearance-based make and model recognition method for automobiles that addresses poor illumination and/or occlusion through the use of two-dimensional linear discriminant analysis; the study also concludes that the two-dimensional linear discriminant analysis-based algorithm outperforms the principal component analysis-based approach. Huang et al. [91] proposed a method in which high recognition was achieved for twenty different categories of license plates by extracting regions of interest with respect to the license plate in a particular location and applying the two-dimensional linear discriminant analysis technique to the region-of-interest gradients. Because region-of-interest gradients are selected as the feature for two-dimensional linear discriminant analysis, the method is not sensitive to color or light, and its computational cost is low enough that it can be implemented in real time.

3.3.3. Vehicle Color

Color spaces are very important in color identification applications such as vehicle color recognition, and the chosen color space affects how well the recognition system performs. The most common color space is RGB, although it is poorly suited to color recognition: discriminating between colors is more challenging in RGB because each channel contributes equally and mixes chromatic and lighting information. Most of the time, researchers therefore transform data from the RGB color space into a color space that separates color from lighting, such as CIE Lab or HSV, rather than using RGB directly. Rachmadi et al. [92] proposed a technique that converts the input image to two distinct color spaces, HSV and CIE Lab, before giving it to a CNN architecture for vehicle color recognition; they demonstrated that a CNN is capable of learning classification based not only on shape information but also on color distribution.
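The RGB-to-HSV conversion that motivates this choice can be illustrated with the Python standard library: a bright and a dark red pixel differ strongly in RGB but share the same hue, which is why hue-based features are more stable under illumination changes:

```python
import colorsys

def rgb_to_hsv_deg(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value)."""
    # colorsys expects channels in [0, 1] and returns hue in [0, 1)
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

# A brightly lit and a shadowed red vehicle pixel
hue_bright, _, _ = rgb_to_hsv_deg(220, 20, 20)
hue_dark, _, _ = rgb_to_hsv_deg(110, 10, 10)
```

Both pixels map to the same hue angle (red), so a classifier working on hue sees one color where an RGB classifier sees two very different vectors.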

3.3.4. Vehicle Logo

The logo of a vehicle is also an essential component of vehicle identification because it cannot be easily altered. Recognizing the vehicle's logo plays a significant role in assessing the behavior of the vehicle. Over the last decade, several vehicle logo-based approaches have been suggested. Early studies utilized handcrafted descriptors for the logo identification task. However, handcrafted descriptors have the following drawbacks: (1) their design requires substantial prior knowledge, and such descriptors are heuristic in nature; and (2) their generalization ability is poor for complex object recognition tasks. Researchers examined several learning approaches in an effort to solve this problem; one of these is the deep learning strategy used by Yuxin et al. [93] for vehicle recognition.

4. Vehicle Behavior Understanding

Once both the dynamic and static characteristics of the vehicle have been gathered, the next step is to examine the vehicle's behavior. Generally, behavior understanding in traffic surveillance describes how a vehicle's location or speed changes in space and time throughout a video. This section demonstrates how to perform motion analysis on a moving vehicle using a single camera as well as multiple cameras.

4.1. Track Vehicles Using Single Camera

The positions of the cameras installed on the road network provide accurate coordinates. Vishwakarma et al. [94] concentrated on understanding human behavior through gestures, actions, interactions, and activities; these concepts can also be used to interpret vehicle behavior in traffic surveillance. The main challenge in behavior understanding is building training and matching algorithms that can successfully handle subtle variations in feature data within each motion pattern class [95]. Two key steps must be taken in order to comprehend behavior: first, a reference behavior dictionary must be created, and second, each observation must be checked to see whether a match can be found in the dictionary. In the context of traffic surveillance systems, "slow motion", "fast motion", "turning left", "U-turn", and "moving right" are examples of behaviors that could be termed reference behaviors; they are useful for identifying explicit events and detecting anomalies. There are two approaches in traffic surveillance systems for analyzing vehicle behavior. The first is known as behavior understanding via trajectory analysis, which entails analyzing a vehicle's motion trajectory data. The second approach, referred to as "behavior understanding without trajectory", involves determining crucial details such as the vehicle's size, direction, and speed.

4.1.1. Behavior Understanding Using Trajectory Analysis

Videos taken during surveillance operations can be used to characterize the motion trajectories of moving dynamic objects (such as vehicles and people) in a given geographic scene. A trajectory is a broad generalization of the path taken by a moving object and contains numerous spatiotemporal details such as location and direction. A motion trajectory is created by tracking an object from one frame to the next and connecting its positions in subsequent frames. Today, the majority of traffic surveillance systems focus on learning-based motion trajectory analysis for understanding vehicle behavior. Three processes are most critical for learning and understanding trajectories: retrieval, clustering, and modeling.
Trajectory retrieval is the process of obtaining a trajectory. The query trajectory is connected to the relevant trajectory pattern using posterior probability estimation, and the retrieved video is then ranked using the posterior probability calculated via Bayes' theorem. In contrast to video image retrieval, which produces a predetermined collection of images, video trajectory retrieval produces a predetermined collection of dynamic object trajectories [96]. The efficiency of querying and analyzing video objects can be improved by trajectory retrieval in order to find the data required for some applications, and trajectory retrieval has attracted a lot of attention for managing and analyzing spatiotemporal data. Two widely used methods for describing trajectories are string matching [97] and sketch matching algorithms [98]. The string-based approach transforms a trajectory into a string and then finds similarities with other strings based on semantic meaning. The sketch-based procedure, on the other hand, creates a trajectory from fundamental functions and matches geometrical properties at a low level. The ability to obtain images that match a query with a hand-drawn sketch is a much-wanted feature, particularly with the huge trend of devices having touch screens.
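The string-based matching idea can be sketched by quantizing each trajectory into direction symbols and comparing the resulting strings with edit (Levenshtein) distance. The 4-way quantization and symbol set here are illustrative choices, not those of [97]:

```python
import math

DIRS = "ENWS"  # 4-way quantization: East, North, West, South

def to_string(points):
    """Quantize consecutive displacements of a trajectory into direction symbols."""
    s = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ang = math.atan2(y1 - y0, x1 - x0)
        s.append(DIRS[int(round(ang / (math.pi / 2))) % 4])
    return "".join(s)

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance between two strings."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,               # deletion
                          d[i][j - 1] + 1,               # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[len(a)][len(b)]

straight = to_string([(0, 0), (1, 0), (2, 0), (3, 0)])      # drives east
right_turn = to_string([(0, 0), (1, 0), (2, 0), (2, -1)])   # east, then turns
dist = edit_distance(straight, right_turn)
```

A small distance means the query trajectory matches a stored pattern, which is how string-based retrieval ranks candidate trajectories.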
There are two stages of trajectory clustering: (1) the partitioning phase, in which each trajectory is divided into a series of line segments that are delivered to the subsequent phase; and (2) the clustering phase, in which similar line segments are grouped together. The primary objective of the process is to choose the appropriate number of trajectory clusters, after which grouping occurs automatically; erroneous trajectory clustering can occur when the number of clusters is misconfigured. Common clustering approaches include spectral clustering, agglomerative clustering, the K-means method, and density-based spatial clustering of applications with noise (DBSCAN). Spectral clustering is the first widely employed clustering approach, and it has performed better than various traditional clustering techniques in a variety of situations. Considering each data point as a graph node, spectral clustering was used by Wang et al. [99] to perform trajectory clustering and aid in anomaly identification and prediction. Agglomerative clustering is the most common kind of hierarchical clustering, in which items are grouped into clusters depending on their similarity; it is used, for example, in vehicle scheduling for cross-regional collections. Wei et al. [100] proposed a cross-regional collection schedule using garbage collection route optimization and improved hierarchical agglomerative clustering algorithms to accomplish smart allocation of garbage and route planning for trucks. Zhang et al. [101] explore and discuss clustering algorithms such as the K-means method and DBSCAN for trajectory clustering.
Each trajectory cluster corresponds to a distinct collection of trajectory patterns, which are used to develop a model of the trajectory based on the statistical distribution seen in each cluster. The technique of trajectory cluster modeling, often referred to as "trajectory pattern learning", includes both the hierarchical Dirichlet process and the Dirichlet process mixture model. Scenes can be comprehended on the basis of their trajectories by utilizing the Dirichlet process mixture model [102]. In order to learn trajectory patterns into its parameter variables in an unsupervised manner, Bastani et al. [103] employed a modified hierarchical Dirichlet process; thanks to the Bayesian structure inherited from its predecessor, this model overcomes certain constraints of the trajectory clustering problem, such as sequential analysis, incremental learning, and non-uniform sampling.

4.1.2. Behavior Understanding without Trajectory Analysis

The alternative method for understanding behavior is based on non-trajectory data, for example, direction, velocity, size, flow, and queue length. The basic concept is to identify anomalous events based on rapid changes in the target's velocity, position, or direction, or when a specific behavior feature fails to meet a predetermined threshold rule. Saligrama et al. [104] presented a set of unsupervised video anomaly detection algorithms.
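A threshold-rule sketch of this non-trajectory approach follows: frames are flagged as anomalous when speed exceeds a limit or the heading changes abruptly. Thresholds and units are purely illustrative:

```python
import numpy as np

def flag_anomalies(speeds, headings, max_speed=30.0, max_turn=45.0):
    """Flag frames where speed exceeds max_speed (m/s) or the heading
    changes by more than max_turn (degrees) between consecutive frames."""
    speeds = np.asarray(speeds, dtype=float)
    headings = np.asarray(headings, dtype=float)
    speeding = speeds > max_speed
    turn = np.zeros_like(speeding)
    turn[1:] = np.abs(np.diff(headings)) > max_turn   # abrupt direction change
    return speeding | turn

# Per-frame speed (m/s) and heading (degrees) of one tracked vehicle
flags = flag_anomalies(speeds=[10, 12, 40, 11], headings=[0, 5, 10, 80])
```

Here frame 2 is flagged for speeding and frame 3 for an abrupt turn; richer systems replace the fixed thresholds with learned statistics of normal behavior.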

4.2. Tracking Vehicles Using Multiple Cameras

Video surveillance with multiple cameras is an excellent way to keep an eye on things and is often accomplished using five essential processes: multi-camera calibration, camera view topology computation, multi-camera activity analysis, object reidentification, and multi-camera tracking. In video surveillance systems, object tracking accuracy and robustness are enhanced by combining information about objects gathered from various camera positions. Objects are handed off from one camera to another, enabling uninterrupted observation over long distances. Most published multi-camera surveillance results rely on small camera networks and concentrate on tracking particular objects and examining activity, such as unpredictable motion trajectories and routine vehicle activity. Networked surveillance also keeps an eye on object activity and supports conclusions such as forecasting the road network’s traffic. The aforementioned aspects are covered by Wang et al. [105] in detail. The following subsections cover two aspects: movement prediction and anomaly detection.
To effectively analyze the future trajectory of moving objects on road networks, it is necessary to consider both the position and the movement characteristics of the vehicle: not only where the vehicle will be in the future, but also its future heading angle and speed. To achieve this, advanced predictive models and algorithms can be utilized that model the complex dynamics of road networks and account for the various factors that affect vehicle movement, such as traffic flow, road geometry, and weather conditions. Incorporating such models into the trajectory analysis of moving objects yields a more accurate and comprehensive understanding of vehicle movement patterns, which can inform decision-making and improve traffic management strategies. In particular, these models include physics-based prediction models based on kinematic models [106], trajectory prediction modules based on the combination of an interactive multiple-model-based motion model and maneuver-specific variational GMMs [107], trajectory prediction models based on an HMM [108], a recurrent neural network model [109], and a long short-term memory model [110].
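The simplest physics-based predictor in this family is constant-velocity extrapolation, the baseline that kinematic models [106] refine. A minimal sketch (illustrative only; real models also estimate heading and acceleration):

```python
import numpy as np

def predict_constant_velocity(track, horizon, dt=1.0):
    """Predict future positions of a tracked vehicle by extrapolating
    the average velocity estimated from its recent trajectory.
    track: (N, 2) array of observed (x, y) positions sampled every dt."""
    velocity = (track[-1] - track[0]) / ((len(track) - 1) * dt)
    steps = np.arange(1, horizon + 1)[:, None] * dt
    return track[-1] + steps * velocity

# a vehicle observed moving 2 m/s along x
observed = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
future = predict_constant_velocity(observed, horizon=3)
# future -> [[6, 0], [8, 0], [10, 0]]
```

Learned models (HMM, RNN, LSTM) replace this fixed motion assumption with patterns extracted from data, which is what allows them to anticipate turns and lane changes.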
An anomaly detection process is a systematic approach to identifying unusual or unexpected behavior or patterns in a dataset. The general process is as follows. Data collection begins with the deployment of vision sensors in the surveillance zone. The raw visual data obtained from these sensors are then pre-processed to prepare them for feature extraction. Key features are then extracted from the processed data to form a representation of the surveillance targets. The extracted information is fed into a modeling algorithm, which uses a learning method to model the normal behavior of the targets. This model is then used to evaluate the behavior of the targets and determine whether it is abnormal. The goal of this process is to detect any activity or behavior that deviates from the expected norm. Numerous researchers have utilized different methods to detect anomalies. These approaches include the histogram of oriented gradients (HOG), the histogram of optical flow [111], trajectory-based methods, deep sparse rectifier neural networks [112], convolutional neural networks [113], recurrent neural networks [114], long short-term memory [115], and generative adversarial networks [116].
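The learn-normal-then-flag-deviation pipeline described above can be reduced to a few lines. This toy sketch models “normal behavior” as the mean and standard deviation of an extracted feature (here, speed) and flags large deviations; it stands in for the far richer learned models cited in the text:

```python
import numpy as np

def learn_normal_model(speeds):
    """Model normal behavior as the mean and standard deviation of
    observed speeds (a stand-in for the learning step)."""
    return float(np.mean(speeds)), float(np.std(speeds))

def is_anomalous(speed, model, k=3.0):
    """Flag a measurement deviating more than k standard deviations
    from the learned normal behavior."""
    mean, std = model
    return abs(speed - mean) > k * std

normal_speeds = np.array([48, 50, 52, 49, 51, 50], dtype=float)
model = learn_normal_model(normal_speeds)
print(is_anomalous(50.5, model))   # typical speed
print(is_anomalous(120.0, model))  # sudden change in velocity
```

The deep methods ([112–116]) replace the hand-picked feature and threshold with learned representations, but the evaluate-against-normal structure is the same.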

5. Traffic Software Applications in ITMS

The next component is traffic software applications in ITMS. There are various traffic software applications, such as Waze, Google Maps, Navigator, TomTom GO, HERE WeGo, MapQuest, INRIX, Citymapper, Waze for Cities, TransNav, OptiMap, TransModeler, Vissim, Aimsun Next, PTV Visum, PTV Vistro, PTV Map&Guide, PTV xServer, TomTom Traffic, TomTom Maps, HERE HD Live Map, and so on, that employ the generated data in real time. These applications provide navigation, real-time traffic information, route optimization, and other features to the intelligent traffic management system (ITMS) to help drivers make informed decisions on the road. They are constantly updated to provide the latest information and new features to improve the driving experience. It is challenging to determine which traffic software application is “better” because this primarily depends on personal demands and preferences. Some of the previously mentioned applications, which are covered in the next section, have received a lot of positive feedback for the precision of their data, their real-time traffic updates, and their user-friendly interfaces.
Waze is crowdsourced GPS navigation software that employs user-generated data to give drivers real-time traffic updates and navigational assistance. It can be used to provide data on traffic flow and congestion as part of an intelligent traffic management system (ITMS). Waze data may be evaluated and utilized to optimize traffic signals, enhance road layouts, and inform other traffic management choices. Waze is a useful tool for ITMS to increase traffic efficiency and safety since it can inform users about road closures, accidents, and other occurrences. Z. Lenkei [117] presented a case study that investigated the opportunities and limitations of using spatial crowdsourcing technologies to detect non-recurring occurrences and offered insight into the geographical and temporal aspects of Waze data. Zhuhua Zhang et al. [118] discussed the exploration and assessment of crowdsourced probe-based Waze traffic speeds.
Both Google Maps [119] and HERE WeGo [120] have garnered a lot of positive feedback for the comprehensive mapping and routing features that they offer. Both can be employed as components of an intelligent traffic management system (ITMS) to deliver real-time traffic information and assist in improving traffic flow. Each app receives its traffic updates, which may include information on congestion levels, accidents, road closures, and other occurrences, from GPS-enabled devices that provide the data in real time. These data may be analyzed and used to enhance road designs, traffic lights, and other aspects of traffic management. In summary, Google Maps provides valuable data and technologies that can be used to improve the efficiency and safety of traffic systems, making it an essential component of an ITMS. Supiya Naiudomthum et al. [121] used the Google Maps Distance Matrix API, which returns the duration and distance between two points on any road section (starting node and ending node), to estimate the average speed on various road sections in Bangkok. Together with a traffic model, the near real-time traffic data provided by the Google API was used to predict traffic volume and to aid in selecting speed-related emission factors suitable for developing a near real-time traffic emission inventory in Bangkok. In addition, the map data provided by HERE WeGo can be utilized to support advanced traffic management systems, such as dynamic routing, incident management, and traveler information services.
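The average-speed computation used in [121] is simple once the API has returned distance and duration. This sketch mirrors the general shape of a Distance Matrix element (distance in meters, duration in seconds), but the dictionary and its values here are made up for illustration, not a live API response:

```python
def average_speed_kmh(element):
    """Compute average speed over a road section from a Distance
    Matrix-style element reporting distance (m) and duration (s)."""
    meters = element["distance"]["value"]
    seconds = element["duration"]["value"]
    return (meters / 1000.0) / (seconds / 3600.0)

# hypothetical element for one road section (start node to end node)
element = {"distance": {"value": 5000}, "duration": {"value": 600}}
speed = average_speed_kmh(element)  # 5 km in 10 minutes -> 30 km/h
```

Applied per road section and refreshed periodically, this is how near real-time speeds feed the emission-factor selection described above.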
Because of the app’s real-time traffic statistics, as well as its API and mapping capabilities, ITMS can make better use of it to increase both the flow of traffic and the efficiency of its operations.
TomTom [122] also provides companies and government agencies with a variety of mapping and traffic services, such as data on real-time traffic conditions, historical traffic trends, and traffic forecasts. This information may be included in ITMS in order to enable more advanced traffic management systems, such as dynamic routing, incident management, and traveler information services.
INRIX also provides companies and government agencies with a package of traffic analytics and management services, such as traffic prediction and simulation, dynamic routing, and incident management. This information may be included in ITMS in order to enable advanced traffic management systems, enhance traffic flow, and make traffic management more efficient. A. Sharma et al. [123] discussed the potential and limitations of using INRIX data for real-time performance monitoring and historical trend assessment.
Traffic software applications face a number of difficulties as well. The accuracy and dependability of technologies such as GPS, traffic sensors, and real-time traffic data are essential to the operation of traffic software systems. It is possible that the efficacy of traffic software applications will suffer if these technologies do not work as expected or are not widely available. There are privacy issues that might arise as a result of certain traffic software applications’ collection and usage of personally identifiable information such as location data. The precision of traffic software applications may be contingent on the data provided by users, which is not guaranteed to be current or correct in every instance.

6. ITMS Applications

This section covers a wide range of ITMS applications that all serve to highlight the effects of video-based network vehicle monitoring systems, including environmental impact assessment, safety monitoring, and TSCS.
  • Anomaly detection: Traffic congestion and vehicle accidents are both made more likely by driving in a way that is against the law. The use of video surveillance allows for the detection and enforcement of a variety of driving offenses, including taking an incorrect turn and failing to stop at a red light. Rajeshwari et al. [124] presented a survey article in which numerous techniques to manage and locate vehicle accidents on a street using a surveillance camera are examined, and the research also includes a brief assessment of various autonomous road and street accident detection methodologies.
  • Security: A network-based surveillance system can record a vehicle’s trajectory across the road network to track a specific vehicle of interest. In combination with online streaming of real-time video, this technology helps law enforcement agencies benefit from monitoring and preventing criminal activity. By addressing the issue related to security, Fedotov et al. [125] discussed a method based on the processing of video and audio that is going to be used in an effort to determine whether people have committed crimes. This will trigger an alarm at the nearby surveillance station, which may already be in control of a large number of CCTV images from neighboring places. The security personnel who are responsible for keeping an eye on multiple screens at the same time will find this to be beneficial.
  • The collection of vehicle tolls: The planning, execution, and dissemination of information concerning the autonomous operation of the vehicle selection system are the primary focuses of the ITMS. The vehicle toll collection device locates passing vehicles and compiles toll data that can be read by video sensors that detect the features of a vehicle, particularly its license plate number, as it passes through charging ports such as highway exits and entrances or parking lots. Nowadays, RFID-based toll systems [126] are used for collecting tolls without human intervention.
  • Road construction and transportation planning: A monitoring system can detect bottlenecks and other anomalies in traffic flow by tracking traffic patterns. Real-time traffic data, the existing road network, and the planned road network are the three components used to develop an intelligent transportation system for roads. There are many examples of intelligent planning and research on urban congestion in the transportation sector; Zhu et al. [127], for instance, addressed the congestion issue.
  • Environmental impact assessment: Since quite a few years ago, the development of environmentally friendly transportation systems has been seen as one of the most significant approaches to addressing environmental issues such as climate change and the impacts of greenhouse gases. On a worldwide basis, the transportation industry has emerged as one of the most significant contributors to the aforementioned environmental issues. Hassouna et al. [128] discussed the environmental implications that are already happening and those that will happen in the future because of the transportation sector.
  • TSCSs: Traffic information in real time should be made available to drivers as quickly as possible to manage congestion efficiently. Because of the significant progress made in recent years in the fields of CV and ML, it is now technically possible to design intelligent traffic signaling systems that function without human monitoring or intervention. Programming traffic lights requires knowledge of the factors that characterize the traffic, such as the number of vehicles and how frequently they enter and exit the area. This specific aspect, which underlies the design of traffic signal control systems in all the aforementioned applications, is discussed in detail in Section 7.

7. Traffic Signal Control Systems (TSCSs)

People are leaving their hometowns in search of places that provide greater employment opportunities and a higher quality of life than what they can find in their current locations. As a direct consequence of the fast urbanization that is taking place, cities are seeing growth in the total amount as well as the variety of traffic. The problems caused by traffic are as follows:
  • Increased total travel time;
  • Increased fuel consumption between intersections;
  • Increased air pollution caused by emissions;
  • More accidents;
  • Difficulty in managing emergencies.
The result is the need for an effective system of managing and controlling traffic to reduce road traffic congestion through the transportation system. Several initiatives have been taken by the government and researchers to alleviate traffic congestion.
As a result, every country has implemented its own traffic management system (TMS) [129] to deal with traffic-related issues. The majority of the work that went into developing TMS was made feasible by advances in technology, which made transportation more effective. A TMS identifies issues on the road, directs vehicles to their destinations, manages parking lots and lanes, and monitors real-time traffic flow. One of its components is the TSCS, which is responsible for ensuring proper operation in order to reduce the number of enforcement issues. Some common terms in TSCSs are:
  • An intersection: An intersection is a place where two or more roads come together. This area is set up so that vehicles can go where they want to go in many different ways.
  • A lane: A route may be divided into many lanes, each of which may be used by a single line of vehicles. The width of the road and the volume of traffic on it determine the number of lanes that are present on the road.
  • Movement signal: This is a traffic light that indicates the flow of traffic.
  • Phase: Phases are the order in which the traffic lights are set to allow only specific traffic flows to pass the intersection at a specific time in the administration of the traffic signal timing plan. This restricts the volume of vehicles that can pass through the intersection at once.
  • Cycle length: This is the total time required for all phases, each with its green time, to be served once in a cyclic sequence.
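The terms above fit together in a fixed-time plan: phases are served in order, and the cycle length is the sum of their durations. A minimal sketch (phase names and durations are made up for illustration):

```python
def phase_at(t, phases):
    """Given elapsed time t (seconds) and a list of (phase_name,
    green_seconds) pairs, return the phase active at time t in a
    fixed-time cyclic plan. Cycle length = sum of phase durations."""
    cycle = sum(g for _, g in phases)
    t = t % cycle  # wrap into the current cycle
    for name, green in phases:
        if t < green:
            return name
        t -= green
    return phases[-1][0]  # unreachable, kept for safety

plan = [("north-south", 30), ("east-west", 25)]  # 55 s cycle length
print(phase_at(10, plan))  # north-south
print(phase_at(40, plan))  # east-west
print(phase_at(65, plan))  # wraps into the next cycle: north-south
```

The six TSCS approaches below differ mainly in how they choose and adapt these phase durations rather than fixing them in advance.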
The following list provides descriptions of the six different kinds of TSCSs.

7.1. Metaheuristics-Based Traffic Signal Control System Approach

This section focuses on the metaheuristic techniques applied in the optimization of signal systems. These approaches often draw inspiration from natural phenomena such as evolutionary theory, physical processes, and bird and insect swarming behaviors to solve numerical optimization problems. The objective of using metaheuristics is to determine the optimal values or ranges of multiple signal parameters that impact the performance of signalized intersections, such as cycle duration, green splits, phase sequence, offsets, change interval, etc.
A variety of metaheuristic optimization methods have been developed, inspired by natural or physical events. These include genetic algorithms (GAs), cultural algorithms (CAs), simulated annealing (SA), ant colony optimization (ACO), differential evolution (DE), particle swarm optimization (PSO), and tabu search (TS). The approach based on metaheuristics for traffic signal control system and its comparison with a similar method is listed in Table 2.
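To make the optimization concrete, here is a deliberately small GA sketch that evolves two green times toward a demand-proportional split. The delay function is a toy penalty invented for illustration; real formulations in the literature optimize simulated delay, throughput, or queue length:

```python
import random

def delay(splits, demand=(0.6, 0.4)):
    """Toy delay measure: penalize mismatch between each approach's
    green-time share and its traffic demand share (illustrative only)."""
    total = sum(splits)
    return sum((g / total - d) ** 2 for g, d in zip(splits, demand))

def genetic_green_splits(pop_size=30, gens=100, seed=1):
    """Minimal GA: selection keeps the fittest half, averaging crossover
    plus Gaussian mutation produces children."""
    rng = random.Random(seed)
    pop = [[rng.uniform(5, 55) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=delay)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [(x + y) / 2 + rng.gauss(0, 1) for x, y in zip(a, b)]
            children.append([max(1.0, g) for g in child])
        pop = survivors + children
    return min(pop, key=delay)

best = genetic_green_splits()
share = best[0] / sum(best)  # should approach the 0.6 demand share
```

The other metaheuristics in Table 2 (PSO, SA, ACO, etc.) swap out the search operators but optimize the same kind of signal-parameter objective.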

7.2. Hybrid Metaheuristics-Based Traffic Signal Control System Approach

The hybrid metaheuristics technique is employed to determine the optimal values or ranges of multiple signal characteristics affecting the performance of signalized intersections. This approach utilizes two or more distinct metaheuristics methodologies. The details of the hybrid metaheuristics-based traffic signal control system and a comparison to a similar method can be found in Table 3.

7.3. Fuzzy-Logic-Based Traffic Signal Control System Approach

A fuzzy logic (FL)-based traffic light control system is a more flexible option compared to traditional traffic light management, offering the ability to handle a wider range of traffic patterns at an intersection. The aim of this approach is to increase vehicle throughput and reduce delay times. The benefits and key features of the FL-based system are listed in Table 4, providing a comparison to other methods.
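A sketch of the core FL mechanism: triangular membership functions map a crisp queue length into fuzzy categories, rules attach a green-time extension to each category, and weighted-average defuzzification produces a crisp output. The breakpoints and outputs here are illustrative, not values from any cited controller:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b,
    falling to zero at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_green_extension(queue):
    """Map queue length (vehicles) to a green-time extension (seconds)
    via a tiny rule base and weighted-average defuzzification."""
    rules = [
        (tri(queue, -1, 0, 8), 0.0),     # short queue  -> no extension
        (tri(queue, 4, 10, 16), 10.0),   # medium queue -> moderate
        (tri(queue, 12, 20, 40), 20.0),  # long queue   -> large
    ]
    weight = sum(w for w, _ in rules)
    return sum(w * out for w, out in rules) / weight if weight else 0.0

print(fuzzy_green_extension(2))   # mostly "short": no extension
print(fuzzy_green_extension(18))  # mostly "long": large extension
```

Because inputs can belong partially to several categories at once, the output varies smoothly with traffic conditions, which is the flexibility advantage noted above.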

7.4. Convolutional Neural Network (CNN)-Based Traffic Signal Control System Approach

The field of intelligent traffic management has seen the use of IoT, time series forecasting, and digital image processing in previous research. An emerging area is the application of computer vision to intelligent traffic management. One such algorithm has been proposed that utilizes machine learning and deep learning techniques, specifically convolutional neural networks (CNNs), for real-time traffic signal optimization. The algorithm forecasts the optimal amount of time needed for vehicles to clear the lane. Table 5 highlights the CNN-based traffic signal control system approach and a comparison to other methods.
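The building block of such CNNs is the convolution that turns a traffic image into feature maps. This sketch implements that single operation directly on a toy lane-occupancy image; it is not the proposed algorithm, whose kernels would be learned and whose output layer would regress the clearance time:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution (the core operation of a CNN layer),
    written out explicitly for illustration."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# toy 6x6 "lane image": two 2x2 blobs standing in for vehicles
lane = np.zeros((6, 6))
lane[0:2, 0:2] = 1.0
lane[4:6, 3:5] = 1.0
feature_map = conv2d(lane, np.ones((2, 2)) / 4.0)  # averaging kernel
occupancy = feature_map.sum() / feature_map.size   # crude density feature
```

In the real-time systems surveyed, stacks of learned convolutions extract vehicle counts and densities from camera frames, and those features drive the predicted green time.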

7.5. Reinforcement-Learning-Based Traffic Signal Control System Approach

The reinforcement learning approach is a type of machine learning that focuses on how intelligent agents can make actions in their environment to maximize the accumulated reward. The reinforcement-learning-based traffic signal control system approach and a comparison to similar methods are outlined in Table 6.
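A tabular Q-learning sketch shows the agent-environment loop in miniature. The state, action, and reward definitions here are deliberately simplified stand-ins (real controllers use queue lengths, waiting times, and simulated intersections):

```python
import random

def train_q_controller(episodes=2000, seed=0):
    """Tabular Q-learning for a toy two-phase intersection.
    State: which approach currently has the longer queue (0 or 1).
    Action: which phase to serve. Reward: +1 for serving the longer
    queue, -1 otherwise (a stand-in for negative waiting time)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    alpha, gamma, eps = 0.1, 0.9, 0.1
    state = rng.randint(0, 1)
    for _ in range(episodes):
        if rng.random() < eps:           # epsilon-greedy exploration
            action = rng.randint(0, 1)
        else:
            action = max((0, 1), key=lambda a: q[state][a])
        reward = 1.0 if action == state else -1.0
        next_state = rng.randint(0, 1)   # queues evolve randomly here
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action])
        state = next_state
    return q

q = train_q_controller()
# after training, serving the congested approach scores higher
policy = [max((0, 1), key=lambda a: q[s][a]) for s in (0, 1)]
```

The methods in Table 6 scale this idea up with function approximation (deep Q-networks and variants) over much richer state spaces.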

7.6. Hybrid-Based Traffic Signal Control System Approach

This hybrid method combines two separate approaches or systems to create a new and improved model. The hybrid-based traffic signal control system approach is applied and its highlights are presented in Table 7, along with a comparison to a similar method.

8. Simulator

Modeling and simulation can provide valuable insights into the behavior of traffic systems. Simulation tools are important for evaluating the performance of traffic systems under various scenarios. Simulation replicates real-world systems and processes to obtain information faster using models of traffic movement, which describe the physical propagation of traffic flow. Models are categorized into macroscopic, microscopic, and mesoscopic models based on their level of detail. Macroscopic modeling is a mathematical modeling approach that analyzes correlations between traffic stream characteristics such as density, flow, mean speed, and other traffic flow parameters. Examples of macroscopic modeling tools include Saturn, Visum, and TRANSYT.
Agent-based simulation uses microscopic modeling which explicitly simulates the behavior of individual vehicles and drivers. This makes it suitable for the study of complex traffic issues, including intelligent transportation systems, complex intersections, traffic waves, and event impacts. Examples of microscopic modeling software include Simulation of Urban Mobility (SUMO), MATSim, Quadstone (Q) Paramics, Corsim, Vissim, Mainsim, Dracula, and MITSIMLab.
MESO stands for “mesoscopic simulation model”, and it is a type of simulation that utilizes the same input data as the primary SUMO model. This type of simulation is faster and can be executed up to 100 times quicker than the microscopic model of SUMO. It calculates vehicle movements using queues and is more tolerant of network modeling errors because it uses a coarser model for intersections and lane changes than SUMO. Some examples of mesoscopic modeling software include Aimsun and TransModeler. Table 8 summarizes the features of various simulation types.
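As an illustration of how such a simulation is configured, a minimal SUMO configuration file ties a network file and a route file together and sets the simulation window (the file names here are placeholders; the same input also drives the mesoscopic model when SUMO is started with its meso option):

```xml
<configuration>
    <input>
        <net-file value="city.net.xml"/>
        <route-files value="city.rou.xml"/>
    </input>
    <time>
        <begin value="0"/>
        <end value="3600"/>
    </time>
</configuration>
```

The network and route files themselves are typically generated with SUMO's own tooling from map and demand data.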

9. Discussion

In the previous sections, we presented the techniques for monitoring vehicles in ITMS. Although there are still open questions and areas for improvement, future research will continue to advance the capabilities of video-based traffic surveillance systems. In this section, we present our outlook on the potential future developments of ITMS: enhancing system efficiency, networked surveillance systems, the integration of online weather data, weather forecasts, and incident reports into ITMS, comprehensive knowledge of traffic scenes, the role of vehicle spatial occupancy, and the strategies needed for developing an efficient ITMS. These topics highlight the need for continued research and development in ITS to fully realize its potential for improving traffic management and safety.

9.1. Enhancing System Efficiency

The performance of current surveillance systems often decreases in complex traffic situations, such as when vehicles are partially obscured, their position or orientation changes, or lighting conditions fluctuate. We have outlined the difficulties faced in each component of video surveillance systems and the related existing solutions in previous sections. In this section, we highlight some particularly challenging issues.

9.1.1. Dealing with Occlusions

Vehicle occlusion occurs when 3D traffic scenes are transformed into 2D images, resulting in the loss of visual information about the vehicle. This can lead to incorrect detection of obscured objects. To address this, some methods focus on using the visual information of the visible portions of the object while disregarding the occluded parts.
Dealing with occlusions can be approached in the following ways:
  • Detecting the presence of occlusion: The presence of occlusion can be determined by observing previous detection results or by evaluating the response of an object detection model.
  • Handling the occlusion: There are several methods for handling occlusions, including using machine learning to learn a model of occluded objects and detect them using the learned model, or learning the object model without occlusion and detecting it with a designated mask. These methods aim to make use of the visual information of the visible portions of the object, while disregarding the occluded parts.
  • Advanced image processing techniques: Techniques such as image enhancement, segmentation, and restoration can be used to extract additional information from partially obscured images, reducing the impact of occlusions.
  • Multi-camera systems: Using multiple cameras in a surveillance system can provide a wider field of view, allowing for a more comprehensive view of the traffic scene and reducing the impact of occlusions.
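The first step above, detecting that occlusion is present, is often approximated by checking how heavily detection boxes overlap. A minimal intersection-over-union (IoU) sketch (the boxes and threshold are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def flag_occlusions(boxes, threshold=0.3):
    """Mark detections whose boxes overlap heavily with another box."""
    return [any(i != j and iou(b, o) > threshold
                for j, o in enumerate(boxes))
            for i, b in enumerate(boxes)]

boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (30, 30, 40, 40)]
flags = flag_occlusions(boxes)  # first two overlap, third is clear
```

Once occlusion is flagged, the handling strategies listed above (occlusion-aware models, masks, multi-camera fusion) take over.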

9.1.2. Dealing with Types of Vehicle and Their Position Changes

The challenge posed by changing vehicle poses during road travel can be problematic for video surveillance systems. Vehicles often change their appearance, such as by changing lanes or making turns, resulting in completely different images of the same object. This creates difficulties for appearance-based algorithms, which can struggle with the wide variability in intra-vehicle appearance and the lack of inter-vehicle differentiation. One solution is to use motion-based detection, which is not affected by pose variations. However, this approach can be susceptible to the shadow problem and may not accurately identify vehicles, as the detected moving object may not necessarily be a vehicle. In such cases, it may be necessary to explicitly detect and remove shadows to improve the performance of the system.

9.1.3. Adapting to Variations in Illumination

Different climatic patterns and times of day cause changes in light, resulting in significant variations in object appearance. The vehicle’s texture is apparent in bright lighting conditions, but most of the vehicle’s visual information is lost in dim lighting, such as at night. Features that are robust to illumination changes, such as SIFT and HOG, are commonly employed to reduce the impact of illumination variation. At night, one option is to employ extra camera illumination equipment or to focus on the vehicle’s visible elements only, such as the headlights and taillights.

9.1.4. Telematics and CVISs Are Being Used for Surveillance Variations

Telematics refers to the integration of telecommunications and informatics to provide real-time information to vehicles. In the context of traffic management, telematics can be used to provide drivers with real-time information about traffic conditions, road closures, and other important updates. Cooperative vehicle-infrastructure systems (CVISs) are systems that allow vehicles and infrastructure to communicate with each other to improve traffic flow and reduce accidents. For example, a CVIS could allow a vehicle to communicate its speed and position to the traffic management system, which could then use that information to optimize traffic flow and reduce congestion. Both telematics and CVISs play a critical role in modern traffic management systems by providing real-time information and enabling two-way communication between vehicles and infrastructure. This helps to improve safety, reduce congestion, and enhance the overall driving experience. Future studies should look at similar techniques.

9.2. Surveillance System on a Network

The utilization of a single-camera-based surveillance system only allows for monitoring of traffic within the field of view of the camera, hindering overall awareness. With advancements in network technology and the growth of the Internet of Things, there is a trend toward the interconnectivity of cameras on the road. This new system not only observes a vehicle’s behavior at a single camera node, but also analyzes it across the road network. The cameras, due to their fixed physical location on the network, act as a location-based service (LBS). Currently, the most commonly used sensors for obtaining object trajectories over a wide range are RFID and GPS, with GPS being the primary means of extracting vehicle trajectories. While many GPS-based trajectory analyses have been conducted, they tend to focus on fleet vehicles such as taxis or trucks, which may not accurately represent typical driving patterns. In contrast, the networked surveillance system, while still collecting location information, offers additional features and capabilities.
  • Like a GPS sensor, a networked camera can indicate the location of a vehicle on the road network. The networked system collects only discrete locations within deployed camera views, whereas GPS can acquire a continuous journey on the road network. However, vehicle behavior is generally evaluated per individual road section, so the camera network’s granularity is suitable for analyzing the behavior of the network.
  • The surveillance system may also detect the vehicle’s specific characteristics, such as the vehicle logo, vehicle color, license plate number, etc. As a result, trajectory analysis may be performed based on these characteristics, such as evaluating bus trajectories, vehicle trajectories, and even the trajectories of cars of various colors and manufacturers. In this aspect, the networked system outperforms the GPS-based system, making interest in anomaly detection, motion prediction, trajectory pattern discovery, and other areas desirable.
  • Multicamera tracking has been studied, but it typically relies on cameras that have overlapping or close proximity, which is not always feasible in road networks due to camera distance. In such cases, vehicle reidentification algorithms can be used to track the same vehicle over long distances. However, reidentification requires the camera to keep track of the way different cameras have seen the same object. The networked traffic camera topology in road networks is challenging to obtain and maintain due to the large number of camera nodes, making it difficult to monitor object models. An alternative approach is to perform multicamera tracking within the vicinity of each camera to accommodate regular vehicle movement from one camera node to another.

9.3. Comprehensive Knowledge of Traffic Scenes

The study and explanation of individual interactions and behavior between objects for visual surveillance are characterized by “behavior understanding”. Traffic surveillance, in our opinion, entails monitoring the static and dynamic properties of traffic and then examining how they influence traffic situations in real time. With a networked surveillance system, it is possible to better understand traffic situations. The dynamic and static properties of all types of vehicles moving on the highway and road, and their qualities on the road network, should be retrieved and evaluated. It is necessary to evaluate the entire transportation and traffic scenario. Other traffic objects, such as traffic lights, signs, and people, can be identified for traffic surveillance to better understand vehicle behavior. For instance, by looking at both the traffic signal status and the vehicle trajectory, a vehicle running a red light could be located.
One of the benefits of a networked surveillance system is the ability to perform higher-level transportation analysis. By combining information from vehicle tracking and vehicle type classification, the system can estimate the environmental impact of transportation in terms of emissions from the consumption of petroleum and oil. Additionally, the analysis of vehicle trajectories can provide insights into traffic patterns and identify congested areas or bottlenecks.

9.4. Role of Vehicle Spatial Occupancy

Using vehicles as queueing system elements might be misleading. If we suppose that a car’s length is half that of a bus’s, the time it takes the bus to cross the signal will be double that of the car if both are moving at the same speed, which is usually the case at traffic intersections. This means that the time it takes to clear the backlog is not exactly proportional to the number of vehicles. If, instead, the spatial occupancy of vehicles is used, treating one bus as equivalent to two cars, the calculated departure rate gives a better result than simply counting vehicles. In the future, this approach could help develop accurate signal timing.
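The bus-equals-two-cars argument amounts to weighting each vehicle by a passenger-car equivalent (PCE). A small sketch (the headway and PCE values are illustrative, not calibrated):

```python
def clearance_time(vehicles, headway_car=2.0,
                   pce={"car": 1.0, "bus": 2.0}):
    """Estimate queue clearance time by weighting each vehicle with a
    passenger-car equivalent: a bus, about twice a car's length,
    counts as two cars."""
    return sum(pce[v] for v in vehicles) * headway_car

queue = ["car", "bus", "car"]
naive = len(queue) * 2.0                  # counting vehicles: 6.0 s
occupancy_aware = clearance_time(queue)   # PCE-weighted: 8.0 s
```

The two-second gap between the estimates is exactly the error the text attributes to counting vehicles without regard to their spatial occupancy.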

9.5. ITMS Integrated with Online Weather Data

“Online weather data” refers to weather information that can be accessed over the internet. It often originates from government weather agencies, private weather organizations, and weather monitoring stations, and it details present weather conditions as well as forecasts and historical weather data. When integrated with online weather data using a fuzzy neural network (FNN) prediction system [164], a long short-term memory model [165], GRU-based deep learning [166], or a deep bi-directional long short-term memory (LSTM) stacked autoencoder (SAE) architecture [167], an ITMS can give drivers and transportation authorities real-time information on the state of both the roads and the weather. This information can assist drivers in making educated decisions about their journeys, such as avoiding areas impacted by adverse weather, and it can also enable transportation authorities to put measures in place to ensure road safety. Even well-known mobile weather applications, such as The Weather Channel, WeatherProHD, and Yahoo! Weather, provide weather information in an hourly observation style.

9.6. ITMS Integrated with Weather Forecasts

The term “weather forecasting” refers to the process of predicting future weather conditions by analyzing both current and historical data. It is a useful instrument that helps individuals and organizations prepare for, and respond to, probable weather-related disruptions. When integrated with weather forecasts, an ITMS can offer transportation authorities information that assists in planning and preparing for upcoming weather-related problems. With forecasts, transportation officials can get a head start on preparing for potential disruptions to the road network, such as rain, snow, or high winds expected in the near future. For instance, if a forecast predicts heavy snowfall in a certain region, the local transportation authorities can send out snow plows and other equipment in advance to keep the roads safe. Big data analytics can further improve forecast accuracy, assisting forecasters in making more precise predictions. To this end, Fathi et al. [168] propose a number of big data approaches and technologies to handle and evaluate the massive volume of meteorological data coming from a variety of sources.
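The forecast-driven preparation described above can be sketched as a simple threshold rule that maps a forecast to preparation actions. The thresholds, field names, and actions are illustrative assumptions for the example.

```python
# Hedged sketch of a forecast-driven preparation rule: pre-deploy
# resources when the forecast exceeds a severity threshold. The
# thresholds, field names, and actions are illustrative assumptions.

def plan_response(forecast):
    """Map a 24 h forecast dict to a list of preparation actions."""
    actions = []
    if forecast.get("snow_cm", 0) >= 5:
        actions.append("dispatch snow plows")
    if forecast.get("wind_kmh", 0) >= 90:
        actions.append("restrict high-sided vehicles on bridges")
    if forecast.get("rain_mm", 0) >= 50:
        actions.append("pre-position drainage crews")
    return actions or ["no action"]

plan = plan_response({"snow_cm": 12, "wind_kmh": 40})
```

Real deployments would weigh forecast uncertainty and resource cost rather than hard thresholds, but the decision structure is the same.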

9.7. ITMS Integrated with Incident Reports

Incident reports are written summaries of events that have already taken place and been documented. In transportation systems, they typically cover accidents, traffic disruptions, or other incidents that affect the flow of traffic or the safety of travelers. Law enforcement agencies, emergency responders, and other groups tasked with responding to transportation-related incidents are usually responsible for compiling them. These reports may include details such as the time and location of the event, the nature of the incident, the number of people involved, and any road closures or detours put in place as a result. When an ITMS is integrated with incident reports, drivers and transportation authorities can obtain real-time information about road events such as accidents, road closures, and construction. Incident reports can help transportation authorities respond to events more quickly and efficiently, thereby mitigating the negative effects that incidents have on the road transportation system. For example, when an accident occurs, the ITMS can offer real-time information on road closures and recommend alternate routes to drivers, which helps to minimize congestion and improve traffic flow. Klinjun et al. [169] evaluate risk factor patterns for road traffic injuries using forensic road traffic investigation reports from Thailand, collecting detailed forensic reports for 25 significant traffic accident incidents.
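The incident-aware rerouting described above can be illustrated with a shortest-path search that skips edges closed by an incident report. The road graph, travel times, and function name below are made up for the example; a real ITMS would operate on a full road network model.

```python
# Illustrative sketch of rerouting around a reported incident:
# Dijkstra shortest path over a toy road graph, ignoring edges
# listed as closed. Graph and travel times are invented.
import heapq

def shortest_path(graph, start, goal, closed=frozenset()):
    """Dijkstra over {node: [(neighbor, minutes), ...]}, skipping
    edges listed in `closed` as (u, v) pairs."""
    queue, seen = [(0, start, [start])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph.get(node, []):
            if (node, nxt) not in closed:
                heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
    return float("inf"), []

roads = {
    "A": [("B", 4), ("C", 7)],
    "B": [("D", 5)],
    "C": [("D", 3)],
}
normal = shortest_path(roads, "A", "D")                       # via B
detour = shortest_path(roads, "A", "D", closed={("B", "D")})  # incident on B-D
```

When the incident report closes the B–D link, the recommended route shifts to the slightly longer path through C, which is exactly the alternate-route behavior the ITMS would broadcast.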

9.8. Strategies Needed for Developing Efficient TSCSs

Designing and implementing a traffic signal control system involves several challenges, including traffic volume variability, complex traffic patterns, coordination with other systems, limited data availability, cost and budget constraints, aging infrastructure, and integration with the ITMS. Improving the efficiency of a traffic signal control system involves several strategies that address these challenges, and a range of performance metrics can be used to compare different traffic signal control systems and to evaluate the effectiveness of changes made to existing ones.
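As a sketch of the performance metrics mentioned above, the snippet below computes three common quantities (average delay, average number of stops, and throughput) from per-vehicle records, so two signal plans can be compared on the same traffic. The record format and values are synthetic examples.

```python
# Sketch of common performance metrics for comparing traffic signal
# control plans; the per-vehicle records are synthetic examples.

def evaluate_plan(records, period_s):
    """records: list of dicts with 'delay_s' and 'stops' per vehicle
    observed during an evaluation period of `period_s` seconds."""
    n = len(records)
    return {
        "avg_delay_s": sum(r["delay_s"] for r in records) / n,
        "avg_stops": sum(r["stops"] for r in records) / n,
        "throughput_veh_per_h": n * 3600 / period_s,
    }

vehicles = [
    {"delay_s": 12, "stops": 1},
    {"delay_s": 30, "stops": 2},
    {"delay_s": 18, "stops": 1},
]
metrics = evaluate_plan(vehicles, period_s=300)
```

Running the same evaluation on traces collected under two candidate signal plans (for instance, in a simulator) gives a direct, like-for-like comparison of their effectiveness.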

10. Conclusions

Intelligent traffic management systems (ITMS) make use of video-based traffic monitoring technology, which has advanced significantly. This technology captures images of traffic scenes, analyzes traffic information, and interprets vehicle activities and behaviors. In this study, we presented a comprehensive overview of the ITMS components, including vehicle surveillance and imaging technologies, vehicle detection, attribute extraction methods, tracking and identification on road networks, behavior understanding, and ITMS applications. Together, these components aim to provide a complete solution to traffic control problems and to aid traffic management. The study also covers traffic signal control systems and simulators in which problem-solving strategies can be tested in action. The goal is to synthesize the existing studies and to bring together, in one place, the most effective strategies and solutions for managing traffic in urban and rural environments, while also providing insight into future directions of research in traffic management. The ultimate objective of this review is to contribute to the advancement of the field of traffic management and to inform the development of more effective strategies for addressing the challenges faced by urban and rural communities. The future scope of traffic management systems is vast and promising: as the underlying technologies mature, many new opportunities will arise for improving the efficiency and effectiveness of traffic management systems.

Author Contributions

Conceptualization, N.N., D.P.S. and J.C.; methodology, N.N., D.P.S. and J.C.; investigation, N.N., D.P.S. and J.C.; writing—original draft preparation, N.N.; writing—review and editing, D.P.S. and J.C.; supervision, D.P.S. and J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zuraimi, M.A.B.; Zaman, F.H.K. Vehicle Detection and Tracking Using YOLO and DeepSORT. In Proceedings of the 2021 IEEE 11th IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang, Malaysia, 3–4 April 2021; pp. 23–29. [Google Scholar]
  2. Indrabayu; Bakti, R.Y.; Areni, I.S.; Prayogi, A.A. Vehicle Detection and Tracking Using Gaussian Mixture Model and Kalman Filter. In Proceedings of the 2016 International Conference on Computational Intelligence and Cybernetics, Makassar, Indonesia, 22–24 November 2016; pp. 115–119. [Google Scholar]
  3. Sony. Available online: https://www.sony.com (accessed on 2 October 2022).
  4. Onsemi. Available online: https://www.onsemi.com (accessed on 2 October 2022).
  5. Hikvision. Available online: https://www.hikvision.com (accessed on 2 October 2022).
  6. Citilog. Available online: https://www.citilog.com (accessed on 2 October 2022).
  7. Opelt, A.; Pinz, A.; Zisserman, A. Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection. Int. J. Comput. Vis. 2008, 80, 16–44. [Google Scholar] [CrossRef] [Green Version]
  8. Liu, Y.; Qiu, T.; Wang, J.; Qi, W. A Nighttime Vehicle Detection Method with Attentive GAN for Accurate Classification and Regression. Entropy 2021, 23, 1490. [Google Scholar] [CrossRef]
  9. Sun, Z.; Liu, C.; Qu, H.; Xie, G. A Novel Effective Vehicle Detection Method Based on Swin Transformer in Hazy Scenes. Mathematics 2022, 10, 2199. [Google Scholar] [CrossRef]
  10. Rani, N.S.; Ponnath, N. Automatic Vehicle Tracking System Based on Fixed Thresholding and Histogram Based Edge Processing. Int. J. Electr. Comput. Eng. 2015, 5, 869–878. [Google Scholar]
  11. Vikhar, P.; Rane, K.; Chaudhari, B. A Novel Method for Feature Extraction Using Color Layout Descriptor (CLD) and Edge Histogram Descriptor (EHD). IJITEE 2020, 9, 2147–2151. [Google Scholar] [CrossRef]
  12. Lowe, D.G. Object Recognition from Local Scale-Invariant Features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; Volume 2, pp. 1150–1157. [Google Scholar]
  13. Ma, X.; Grimson, W.E.L. Edge-Based Rich Representation for Vehicle Classification. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05) Volume 1, Beijing, China, 17–21 October 2005; Volume 2, pp. 1185–1192. [Google Scholar]
  14. Zhu, D.; Wang, X. A Method of Improving SIFT Algorithm Matching Efficiency. In Proceedings of the 2009 2nd International Congress on Image and Signal Processing, Tianjin, China, 17–19 October 2009; pp. 1–5. [Google Scholar]
  15. Djenouri, Y.; Belhadi, A.; Srivastava, G.; Djenouri, D.; Chun-Wei Lin, J. Vehicle Detection Using Improved Region Convolution Neural Network for Accident Prevention in Smart Roads. Pattern Recognit. Lett. 2022, 158, 42–47. [Google Scholar] [CrossRef]
  16. Lee, S.-H.; Bang, M.; Jung, K.-H.; Yi, K. An Efficient Selection of HOG Feature for SVM Classification of Vehicle. In Proceedings of the 2015 International Symposium on Consumer Electronics (ISCE), Madrid, Spain, 24–26 June 2015; IEEE: Madrid, Spain, 2015; pp. 1–2. [Google Scholar]
  17. Ali, A.M.; Eltarhouni, W.I.; Bozed, K.A. On-Road Vehicle Detection Using Support Vector Machine and Decision Tree Classifications. In Proceedings of the 6th International Conference on Engineering & MIS 2020, Almaty, Kazakhstan, 14–16 September 2020; ACM: Almaty, Kazakhstan, 2020; pp. 1–5. [Google Scholar]
  18. Wang, Z.; Zhan, J.; Duan, C.; Guan, X.; Yang, K. Vehicle Detection in Severe Weather Based on Pseudo-Visual Search and HOG–LBP Feature Fusion. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2022, 236, 1607–1618. [Google Scholar] [CrossRef]
  19. Mohamed, A.; Issam, A.; Mohamed, B.; Abdellatif, B. Real-Time Detection of Vehicles Using the Haar-like Features and Artificial Neuron Networks. Procedia Comput. Sci. 2015, 73, 24–31. [Google Scholar] [CrossRef] [Green Version]
  20. Alam, A.; Jaffery, Z.A.; Sharma, H. A Cost-Effective Computer Vision-Based Vehicle Detection System. Concurr. Eng. 2022, 30, 148–158. [Google Scholar] [CrossRef]
  21. Arunmozhi, A.; Park, J. Comparison of HOG, LBP and Haar-like Features for on-Road Vehicle Detection. In Proceedings of the 2018 IEEE International Conference on Electro/Information Technology (EIT), Rochester, MI, USA, 3–5 May 2018; pp. 362–367. [Google Scholar]
  22. Chen, Z.; Ellis, T.; Velastin, S.A. Vehicle Detection, Tracking and Classification in Urban Traffic. In Proceedings of the 2012 15th International IEEE Conference on Intelligent Transportation Systems, Anchorage, AK, USA, 16–19 September 2012; pp. 951–956. [Google Scholar]
  23. Xu, Y.; Yu, G.; Wang, Y.; Wu, X.; Ma, Y. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG+ SVM from UAV Images. Sensors 2016, 16, 1325. [Google Scholar] [CrossRef] [Green Version]
  24. Kurniawan, A.; Saputra, R.; Marzuki, M.; Febrianti, M.S.; Prihatmanto, A.S. The Implementation of Object Recognition Using Deformable Part Model (DPM) with Latent SVM on Lumen Robot Friend. In Proceedings of the International Conference on Engineering and Technology Development (ICETD), Lampung, Indonesia, 24–25 October 2017. [Google Scholar]
  25. Karungaru, S.; Dongyang, L.; Terada, K. Vehicle Detection and Type Classification Based on CNN-SVM. Int. J. Mach. Learn. Comput. 2021, 11, 304–310. [Google Scholar] [CrossRef]
  26. Freund, Y. Boosting a Weak Learning Algorithm by Majority. Inf. Comput. 1995, 121, 256–285. [Google Scholar] [CrossRef] [Green Version]
  27. Wang, Y.; Feng, L. An Adaptive Boosting Algorithm Based on Weighted Feature Selection and Category Classification Confidence. Appl. Intell. 2021, 51, 6837–6858. [Google Scholar] [CrossRef]
  28. Chen, C.-H.; Hsu, C.-C. Boosted Voting Scheme on Classification. In Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications, Kaohsuing, Taiwan, 26–28 November 2008; IEEE: Kaohsuing, Taiwan, 2008; pp. 573–577. [Google Scholar]
  29. Li, Q.; Mou, L.; Xu, Q.; Zhang, Y.; Zhu, X.X. R3-Net: A Deep Network for Multi-Oriented Vehicle Detection in Aerial Images and Videos. arXiv 2018, arXiv:1808.05560. [Google Scholar]
  30. Tayara, H.; Soo, K.G.; Chong, K.T. Vehicle Detection and Counting in High-Resolution Aerial Images Using Convolutional Regression Neural Network. IEEE Access 2017, 6, 2220–2230. [Google Scholar] [CrossRef]
  31. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  32. Girshick, R. Fast R-Cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  33. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-Cnn: Towards Real-Time Object Detection with Region Proposal Networks. Adv. Neural Inf. Process. Syst. 2015, 39, 1137–1149. [Google Scholar] [CrossRef] [Green Version]
  34. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-Cnn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
  35. Redmon, J.; Farhadi, A. Yolov3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  36. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. Ssd: Single Shot Multibox Detector. In Computer Vision–ECCV 2016, Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
  37. Liang, X.; Zhang, J.; Zhuo, L.; Li, Y.; Tian, Q. Small Object Detection in Unmanned Aerial Vehicle Images Using Feature Fusion and Scaling-Based Single Shot Detector with Spatial Context Analysis. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 1758–1770. [Google Scholar] [CrossRef]
  38. Wang, H.; Yu, Y.; Cai, Y.; Chen, X.; Chen, L.; Liu, Q. A Comparative Study of State-of-the-Art Deep Learning Algorithms for Vehicle Detection. IEEE Intell. Transp. Syst. Mag. 2019, 11, 82–95. [Google Scholar] [CrossRef]
  39. Javadi, S.; Dahl, M.; Pettersson, M.I. Vehicle Detection in Aerial Images Based on 3D Depth Maps and Deep Neural Networks. IEEE Access 2021, 9, 8381–8391. [Google Scholar] [CrossRef]
  40. Madhogaria, S.; Baggenstoss, P.M.; Schikora, M.; Koch, W.; Cremers, D. Car Detection by Fusion of HOG and Causal MRF. IEEE Trans. Aerosp. Electron. Syst. 2015, 51, 575–590. [Google Scholar] [CrossRef]
  41. Kusuma, T.; Ashwini, K. Multiple Object Tracking Using STMRF and YOLOv4 Deep SORT in Surveillance Video; IJRTI: Gandhinagar, India, 2022; Volume 7. [Google Scholar]
  42. Yin, M.; Zhang, H.; Meng, H.; Wang, X. An HMM-Based Algorithm for Vehicle Detection in Congested Traffic Situations. In Proceedings of the 2007 IEEE Intelligent Transportation Systems Conference, Bellevue, WA, USA, 30 September 2007–3 October 2007; pp. 736–741. [Google Scholar]
  43. Miller, N.; Thomas, M.A.; Eichel, J.A.; Mishra, A. A Hidden Markov Model for Vehicle Detection and Counting. In Proceedings of the 2015 12th Conference on Computer and Robot Vision, Halifax, NS, Canada, 3–5 June 2015; pp. 269–276. [Google Scholar]
  44. Tao, H.; Lu, X. Smoke Vehicle Detection Based on Multi-Feature Fusion and Hidden Markov Model. J. Real-Time Image Process. 2020, 17, 745–758. [Google Scholar] [CrossRef]
  45. Wang, C.-C.R.; Lien, J.-J.J. Automatic Vehicle Detection Using Local Features—A Statistical Approach. IEEE Trans. Intell. Transp. Syst. 2008, 9, 83–96. [Google Scholar] [CrossRef]
  46. Jagannathan, P.; Rajkumar, S.; Frnda, J.; Divakarachari, P.B.; Subramani, P. Moving Vehicle Detection and Classification Using Gaussian Mixture Model and Ensemble Deep Learning Technique. Wirel. Commun. Mob. Comput. 2021, 2021, 5590894. [Google Scholar] [CrossRef]
  47. Song, J. Vehicle Detection Using Spatial Relationship GMM for Complex Urban Surveillance in Daytime and Nighttime. Int. J. Parallel Program. 2018, 46, 859–872. [Google Scholar] [CrossRef]
  48. Zhang, Y.; Zhao, C.; He, J.; Chen, A. Vehicles Detection in Complex Urban Traffic Scenes Using Gaussian Mixture Model with Confidence Measurement. IET Intell. Transp. Syst. 2016, 10, 445–452. [Google Scholar] [CrossRef]
  49. Yao, Y.; Xiong, G.; Wang, K.; Zhu, F.; Wang, F.-Y. Vehicle Detection Method Based on Active Basis Model and Symmetry in ITS. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), The Hague, The Netherlands, 6–9 October 2013; pp. 614–618. [Google Scholar]
  50. Wu, Y.N.; Si, Z.; Gong, H.; Zhu, S.-C. Learning Active Basis Model for Object Detection and Recognition. Int. J. Comput. Vis. 2010, 90, 198–235. [Google Scholar] [CrossRef] [Green Version]
  51. Zhou, Y.; Yuan, J.; Tang, X. A Novel Part-Based Model for Fine-Grained Vehicle Recognition. In Cloud Computing and Security, Proceedings of the International Conference on Cloud Computing and Security, Haikou, China, 8–10 June 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 647–658. [Google Scholar]
  52. Li, D.L.; Prasad, M.; Liu, C.-L.; Lin, C.-T. Multi-View Vehicle Detection Based on Fusion Part Model with Active Learning. IEEE Trans. Intell. Transp. Syst. 2020, 22, 3146–3157. [Google Scholar] [CrossRef]
  53. Chen, X.; Kundu, K.; Zhu, Y.; Ma, H.; Fidler, S.; Urtasun, R. 3d Object Proposals Using Stereo Imagery for Accurate Object Class Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 1259–1272. [Google Scholar] [CrossRef] [Green Version]
  54. Chabot, F.; Chaouch, M.; Rabarisoa, J.; Teuliere, C.; Chateau, T. Deep Manta: A Coarse-to-Fine Many-Task Network for Joint 2d and 3d Vehicle Analysis from Monocular Image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2040–2049. [Google Scholar]
  55. Simon, M.; Amende, K.; Kraus, A.; Honer, J.; Samann, T.; Kaulbersch, H.; Milz, S.; Michael Gross, H. Complexer-Yolo: Real-Time 3d Object Detection and Tracking on Semantic Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 15–20 June 2019; pp. 1190–1199. [Google Scholar]
  56. Li, B. 3d Fully Convolutional Network for Vehicle Detection in Point Cloud. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 1513–1518. [Google Scholar]
  57. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep Learning on Point Sets for 3d Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
  58. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Adv. Neural Inf. Process. Syst. 2017, 30, 5105–5114. [Google Scholar]
  59. Srivastav, N.; Agrwal, S.L.; Gupta, S.K.; Srivastava, S.R.; Chacko, B.; Sharma, H. Hybrid Object Detection Using Improved Three Frame Differencing and Background Subtraction. In Proceedings of the 2017 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence, Noida, India, 12–13 January 2017; pp. 613–617. [Google Scholar]
  60. Saur, G.; Krüger, W.; Schumann, A. Extended Image Differencing for Change Detection in UAV Video Mosaics. In Proceedings of the Video Surveillance and Transportation Imaging Applications 2014, San Francisco, CA, USA, 2–6 February 2014; SPIE: Bellingham, WA, USA, 2014; Volume 9026, pp. 128–137. [Google Scholar]
  61. Keck, M.; Galup, L.; Stauffer, C. Real-Time Tracking of Low-Resolution Vehicles for Wide-Area Persistent Surveillance. In Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision (WACV), Clearwater Beach, FL, USA, 15–17 January 2013; pp. 441–448. [Google Scholar]
  62. Sommer, L.W.; Teutsch, M.; Schuchert, T.; Beyerer, J. A Survey on Moving Object Detection for Wide Area Motion Imagery. In Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, 7–10 March 2016; pp. 1–9. [Google Scholar]
  63. Gupte, S.; Masoud, O.; Martin, R.F.; Papanikolopoulos, N.P. Detection and Classification of Vehicles. IEEE Trans. Intell. Transp. Syst. 2002, 3, 37–47. [Google Scholar] [CrossRef] [Green Version]
  64. Siddharth, R.; Aghila, G. A Light Weight Background Subtraction Algorithm for Motion Detection in Fog Computing. IEEE Lett. Comput. Soc. 2020, 3, 17–20. [Google Scholar] [CrossRef]
  65. Azeez, B.; Alizadeh, F. Review and Classification of Trending Background Subtraction-Based Object Detection Techniques. In Proceedings of the 2020 6th International Engineering Conference “Sustainable Technology and Development” (IEC), Erbil, Iraq, 26–27 February 2020; pp. 185–190. [Google Scholar]
  66. Yuan, G.-W.; Gong, J.; Deng, M.-N.; Zhou, H.; Xu, D. A Moving Objects Detection Algorithm Based on Three-Frame Difference and Sparse Optical Flow. Inf. Technol. J. 2014, 13, 1863. [Google Scholar] [CrossRef] [Green Version]
  67. Sun, W.; Sun, M.; Zhang, X.; Li, M. Moving Vehicle Detection and Tracking Based on Optical Flow Method and Immune Particle Filter under Complex Transportation Environments. Complexity 2020, 2020, 1–15. [Google Scholar] [CrossRef]
  68. Luo, W.; Zhao, X.; Kim, T.-K. Multiple object tracking: A literature review. Artif. Intell. 2021, 293, 103448. [Google Scholar] [CrossRef]
  69. Rin, V.; Nuthong, C. Front Moving Vehicle Detection and Tracking with Kalman Filter. In Proceedings of the 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), Singapore, 23–25 February 2019; pp. 304–310. [Google Scholar]
  70. Kim, T.; Park, T.-H. Extended Kalman Filter (EKF) Design for Vehicle Position Tracking Using Reliability Function of Radar and Lidar. Sensors 2020, 20, 4126. [Google Scholar] [CrossRef]
  71. Khalkhali, M.B.; Vahedian, A.; Yazdi, H.S. Multi-Target State Estimation Using Interactive Kalman Filter for Multi-Vehicle Tracking. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1131–1144. [Google Scholar] [CrossRef]
  72. Abdelali, H.A.; Bourja, O.; Haouari, R.; Derrouz, H.; Zennayi, Y.; Bourzex, F.; Thami, R.O.H. Visual Vehicle Tracking via Deep Learning and Particle Filter. In Advances on Smart and Soft Computing 517, Proceedings of ICACIn 2020; Springer: Singapore, 2021; pp. 517–526. [Google Scholar]
  73. Sudha, D.; Priyadarshini, J. An Intelligent Multiple Vehicle Detection and Tracking Using Modified Vibe Algorithm and Deep Learning Algorithm. Soft Comput. 2020, 24, 17417–17429. [Google Scholar] [CrossRef]
  74. Shi, X.; Zhao, W.; Shen, Y. Automatic License Plate Recognition System Based on Color Image Processing. In Computational Science and Its Applications-ICCSA 2005, Proceedings of the International Conference on Computational Science and Its Applications, Singapore, 9–12 May 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 1159–1168. [Google Scholar]
  75. Al-Shemarry, M.S.; Li, Y.; Abdulla, S. Ensemble of Adaboost Cascades of 3L-LBPs Classifiers for License Plates Detection with Low Quality Images. Expert Syst. Appl. 2018, 92, 216–235. [Google Scholar] [CrossRef]
  76. Chen, R.; Luo, Y. An Improved License Plate Location Method Based on Edge Detection. Phys. Procedia 2012, 24, 1350–1356. [Google Scholar] [CrossRef] [Green Version]
  77. Zheng, D.; Zhao, Y.; Wang, J. An Efficient Method of License Plate Location. Pattern Recognit. Lett. 2005, 26, 2431–2438. [Google Scholar] [CrossRef]
  78. Gao, Q.; Wang, X.; Xie, G. License Plate Recognition Based on Prior Knowledge. In Proceedings of the 2007 IEEE International Conference on Automation and Logistics, Jinan, China, 18–21 August 2007; pp. 2964–2968. [Google Scholar]
  79. Wu, B.-F.; Lin, S.-P.; Chiu, C.-C. Extracting Characters from Real Vehicle Licence Plates Out-of-Doors. IET Comput. Vis. 2007, 1, 2–10. [Google Scholar] [CrossRef] [Green Version]
  80. Guo, J.-M.; Liu, Y.-F. License Plate Localization and Character Segmentation with Feedback Self-Learning and Hybrid Binarization Techniques. IEEE Trans. Veh. Technol. 2008, 57, 1417–1424. [Google Scholar]
  81. Musaddid, A.T.; Bejo, A.; Hidayat, R. Improvement of Character Segmentation for Indonesian License Plate Recognition Algorithm Using CNN. In Proceedings of the 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), Yogyakarta, Indonesia, 5–6 December 2019; pp. 279–283. [Google Scholar]
  82. Bismantoko, S.; Rosyidi, M.; Chasanah, U.; Suksmono, A.; Widodo, T. Character recognition for indonesian license plate by using image enhancement and convolutional neural network. Maj. Ilm. Pengkaj. Ind. 2020, 14, 145–152. [Google Scholar] [CrossRef]
  83. Ariff, F.N.M.; Nasir, A.S.A.; Jaafar, H.; Zulkifli, A.N. Character Segmentation for Automatic Vehicle License Plate Recognition Based on Fast K-Means Clustering. In Proceedings of the 2020 IEEE 10th International Conference on System Engineering and Technology (ICSET), Shah Alam, Malaysia, 9 November 2020; pp. 228–233. [Google Scholar]
  84. Li, H.; Wang, P.; Shen, C. Toward End-to-End Car License Plate Detection and Recognition with Deep Neural Networks. IEEE Trans. Intell. Transp. Syst. 2018, 20, 1126–1136. [Google Scholar] [CrossRef]
  85. Avery, R.P.; Wang, Y.; Rutherford, G.S. Length-Based Vehicle Classification Using Images from Uncalibrated Video Cameras. In Proceedings of the 7th International IEEE Conference on Intelligent Transportation Systems (IEEE Cat. No. 04TH8749), Washington, DC, USA, 3–6 October 2004; pp. 737–742. [Google Scholar]
  86. Han, D.; Leotta, M.J.; Cooper, D.B.; Mundy, J.L. Vehicle Class Recognition from Video-Based on 3d Curve Probes. In Proceedings of the 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Beijing, China, 15–16 October 2005; pp. 285–292. [Google Scholar]
  87. Tan, F.; Li, L.; Cai, B.; Zhang, D. Shape Template Based Side-View Car Detection Algorithm. In Proceedings of the 2011 3rd International Workshop on Intelligent Systems and Applications, Wuhan, China, 28–29 May 2011; pp. 1–4. [Google Scholar]
  88. Petrovic, V.S.; Cootes, T.F. Analysis of Features for Rigid Structure Vehicle Type Recognition. In Proceedings of the BMVC, Kingston, UK, 7–9 September 2004; Kingston University: London, UK, 2004; Volume 2, pp. 587–596. [Google Scholar]
  89. Lu, L.; Huang, H. A Hierarchical Scheme for Vehicle Make and Model Recognition from Frontal Images of Vehicles. IEEE Trans. Intell. Transp. Syst. 2018, 20, 1774–1786. [Google Scholar] [CrossRef]
  90. Zafir, I. Two Dimensional Statistical Linear Discriminant Analysis for Real-Time Robust Vehicle Type Recognition. In Real-Time Image Processing 2007, Proceedings of the SPIE-IS&T Electronic Imaging, San Jose, CA, USA, 28 January–1 February 2007; Kehtarnavaz, N., Carlsohn, M.F., Eds.; SPIE: Bellingham, WA, USA, 2007; Volume 6496, p. 649602. [Google Scholar]
  91. Huang, H.; Zhao, Q.; Jia, Y.; Tang, S. A 2dlda Based Algorithm for Real Time Vehicle Type Recognition. In Proceedings of the 2008 11th International IEEE Conference on Intelligent Transportation Systems, Washington, WA, USA, 3–6 October 2004; pp. 298–303. [Google Scholar]
  92. Rachmadi, R.F.; Purnama, I. Vehicle Color Recognition Using Convolutional Neural Network. arXiv 2015, arXiv:1510.07391. [Google Scholar]
  93. Yuxin, M.; Peifeng, H. A Highway Entrance Vehicle Logo Recognition System Based on Convolutional Neural Network. In Proceedings of the 2019 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 25–28 May 2019; pp. 282–285. [Google Scholar]
  94. Vishwakarma, S.; Agrawal, A. A Survey on Activity Recognition and Behavior Understanding in Video Surveillance. Vis. Comput. 2013, 29, 983–1009. [Google Scholar] [CrossRef]
  95. Hu, W.; Tan, T.; Wang, L.; Maybank, S. A Survey on Visual Surveillance of Object Motion and Behaviors. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2004, 34, 334–352. [Google Scholar] [CrossRef]
  96. Xiu, W.; Gao, Z.; Liang, W.; Qi, W.; Peng, X. Information Management and Target Searching in Massive Urban Video Based on Video-GIS. In Proceedings of the 2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC), Beijing, China, 15–17 June 2018; pp. 228–232. [Google Scholar]
  97. Vlachos, M.; Kollios, G.; Gunopulos, D. Discovering Similar Multidimensional Trajectories. In Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, 26 February–1 March 2002; pp. 673–684. [Google Scholar]
  98. Olsen, L.; Samavati, F.F.; Sousa, M.C.; Jorge, J.A. Sketch-Based Modeling: A Survey. Comput. Graph. 2009, 33, 85–103. [Google Scholar] [CrossRef]
  99. Wang, X.; Tieu, K.; Grimson, E. Learning Semantic Scene Models by Trajectory Analysis. In Computer Vision—ECCV 2006, Proceedings of the European Conference on Computer Vision, Graz, Austria, 7–13 May 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 110–123. [Google Scholar]
  100. Wei, Z.; Liang, C.; Tang, H. Research on Vehicle Scheduling of Cross-Regional Collection Using Hierarchical Agglomerative Clustering and Algorithm Optimization. In Journal of Physics: Conference Series, Proceedings of the 2021 2nd International Conference on Applied Physics and Computing (ICAPC), Ottawa, ON, Canada, 8–10 September 2021; IOP Publishing: Bristol, UK, 2021; Volume 2083, p. 042022. [Google Scholar]
  101. Zhang, Z.; Ni, G.; Xu, Y. Comparison of Trajectory Clustering Methods Based on K-Means and DBSCAN. In Proceedings of the 2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China, 6–8 November 2020; Volume 1, pp. 557–561. [Google Scholar]
  102. Santhosh, K.K.; Dogra, D.P.; Roy, P.P.; Chaudhuri, B.B. Trajectory-Based Scene Understanding Using Dirichlet Process Mixture Model. IEEE Trans. Cybern. 2019, 51, 4148–4161. [Google Scholar] [CrossRef] [Green Version]
  103. Bastani, V.; Marcenaro, L.; Regazzoni, C. Unsupervised Trajectory Pattern Classification Using Hierarchical Dirichlet Process Mixture Hidden Markov Model. In Proceedings of the 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Reims, France, 21–24 September 2014; pp. 1–6. [Google Scholar]
  104. Saligrama, V.; Konrad, J.; Jodoin, P.-M. Video Anomaly Identification. IEEE Signal Process. Mag. 2010, 27, 18–33. [Google Scholar] [CrossRef]
  105. Wang, X. Intelligent Multi-Camera Video Surveillance: A Review. Pattern Recognit. Lett. 2013, 34, 3–19. [Google Scholar] [CrossRef]
  106. Xie, G.; Gao, H.; Qian, L.; Huang, B.; Li, K.; Wang, J. Vehicle Trajectory Prediction by Integrating Physics-and Maneuver-Based Approaches Using Interactive Multiple Models. IEEE Trans. Ind. Electron. 2017, 65, 5999–6008. [Google Scholar] [CrossRef]
  107. Deo, N.; Rangesh, A.; Trivedi, M.M. How Would Surround Vehicles Move? A Unified Framework for Maneuver Classification and Motion Prediction. IEEE Trans. Intell. Veh. 2018, 3, 129–140. [Google Scholar] [CrossRef] [Green Version]
  108. Ye, N.; Zhang, Y.; Wang, R.; Malekian, R. Vehicle Trajectory Prediction Based on Hidden Markov Model. KSII Trans. Internet Inf. Syst. 2016, 10, 3150–3170. [Google Scholar]
  109. Choi, S.; Kim, J.; Yeo, H. Attention-Based Recurrent Neural Network for Urban Vehicle Trajectory Prediction. Procedia Comput. Sci. 2019, 151, 327–334. [Google Scholar] [CrossRef]
  110. Ondruska, P.; Posner, I. Deep Tracking: Seeing beyond Seeing Using Recurrent Neural Networks. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016. [Google Scholar]
  111. Kaltsa, V.; Briassouli, A.; Kompatsiaris, I.; Hadjileontiadis, L.J.; Strintzis, M.G. Swarm Intelligence for Detecting Interesting Events in Crowded Environments. IEEE Trans. Image Process. 2015, 24, 2153–2166. [Google Scholar] [CrossRef] [PubMed]
  112. Kumar, D.; Bezdek, J.C.; Rajasegarar, S.; Leckie, C.; Palaniswami, M. A Visual-Numeric Approach to Clustering and Anomaly Detection for Trajectory Data. Vis. Comput. 2017, 33, 265–281. [Google Scholar] [CrossRef]
  113. Hamdi, S.; Bouindour, S.; Snoussi, H.; Wang, T.; Abid, M. End-to-End Deep One-Class Learning for Anomaly Detection in Uav Video Stream. J. Imaging 2021, 7, 90. [Google Scholar] [CrossRef]
  114. Zhang, J.; Xu, C.; Gao, Z.; Rodrigues, J.J.; de Albuquerque, V.H.C. Industrial Pervasive Edge Computing-Based Intelligence IoT for Surveillance Saliency Detection. IEEE Trans. Ind. Inform. 2020, 17, 5012–5020. [Google Scholar] [CrossRef]
  115. Zhou, J.T.; Du, J.; Zhu, H.; Peng, X.; Liu, Y.; Goh, R.S.M. Anomalynet: An Anomaly Detection Network for Video Surveillance. IEEE Trans. Inf. Forensics Secur. 2019, 14, 2537–2550. [Google Scholar] [CrossRef]
  116. Samuel, D.J.; Cuzzolin, F. SVD-GAN for Real-Time Unsupervised Video Anomaly Detection; Oxford Brookes University: Oxford, UK, 2021. [Google Scholar]
  117. Lenkei, Z. Crowdsourced Traffic Information in Traffic Management: Evaluation of Traffic Information from Waze. Master’s Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2018. [Google Scholar]
  118. Zhang, Z.; Han, L.D.; Liu, Y. Exploration and Evaluation of Crowdsourced Probe-Based Waze Traffic Speed. Transp. Lett. 2022, 14, 546–554. [Google Scholar] [CrossRef]
  119. Google-Developers. Developer Guide Distance Matrix API. Available online: https://developers.google.com/maps/documentation/distance-matrix/overview (accessed on 4 February 2023).
  120. Develop Location-Based Services. Available online: www.here.com (accessed on 4 February 2023).
  121. Naiudomthum, S.; Winijkul, E.; Sirisubtawee, S. Near Real-Time Spatial and Temporal Distribution of Traffic Emissions in Bangkok Using Google Maps Application Program Interface. Atmosphere 2022, 13, 1803. [Google Scholar] [CrossRef]
  122. TomTom Car GPS. Latest TomTom GO Series for Drivers. Available online: https://www.tomtom.com/ (accessed on 4 February 2023).
  123. Sharma, A.; Ahsani, V.; Rawat, S. Evaluation of Opportunities and Challenges of Using INRIX Data for Real-Time Performance Monitoring and Historical Trend Assessment; Nebraska, Department of Roads: Lincoln, NE, USA, 2017. [Google Scholar]
  124. Rajeshwari, M.; Rao, C.M. Road Traffic Anamoly Detection Using AI Approach: Survey Paper. In Proceedings of the 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2–4 December 2021; pp. 845–848. [Google Scholar]
  125. Fedotov, V.; Komarov, Y.; Ganzin, S. Optimization of Using Fixed Route Taxi-Buses with Account of Security of Road Traffic and Air Pollution in Big Cities. Transp. Res. Procedia 2018, 36, 173–178. [Google Scholar] [CrossRef]
  126. Shobana, K.; Sait, A.N.; Haq, A.N. RFID Based Vehicle Toll Collection System for Toll Roads. Int. J. Enterp. Netw. Manag. 2010, 4, 3–15. [Google Scholar] [CrossRef]
  127. Zhu, Q.; Liu, Y.; Liu, M.; Zhang, S.; Chen, G.; Meng, H. Intelligent Planning and Research on Urban Traffic Congestion. Future Internet 2021, 13, 284. [Google Scholar] [CrossRef]
  128. Hassouna, F.M.A.; Al-Sahili, K. Environmental Impact Assessment of the Transportation Sector and Hybrid Vehicle Implications in Palestine. Sustainability 2020, 12, 7878. [Google Scholar] [CrossRef]
  129. Rath, M. Smart Traffic Management System for Traffic Control Using Automated Mechanical and Electronic Devices. In IOP Conference Series: Materials Science and Engineering, Proceedings of the International Conference on Mechanical, Materials and Renewable Energy, Sikkim, India, 8–10 December 2017; IOP Publishing: Bristol, UK, 2018; Volume 377, p. 012201. [Google Scholar]
  130. Liang, X.J.; Guler, S.I.; Gayah, V.V. A Heuristic Method to Optimize Generic Signal Phasing and Timing Plans at Signalized Intersections Using Connected Vehicle Technology. Transp. Res. Part C Emerg. Technol. 2020, 111, 156–170. [Google Scholar] [CrossRef]
  131. Li, X.; Sun, J.-Q. Multi-Objective Optimal Predictive Control of Signals in Urban Traffic Network. J. Intell. Transp. Syst. 2019, 23, 370–388. [Google Scholar] [CrossRef]
  132. Armas, R.; Aguirre, H.; Daolio, F.; Tanaka, K. Evolutionary Design Optimization of Traffic Signals Applied to Quito City. PLoS ONE 2017, 12, e0188757. [Google Scholar] [CrossRef] [PubMed]
  133. Ghanim, M.S.; Abu-Lebdeh, G. Real-Time Dynamic Transit Signal Priority Optimization for Coordinated Traffic Networks Using Genetic Algorithms and Artificial Neural Networks. J. Intell. Transp. Syst. 2015, 19, 327–338. [Google Scholar] [CrossRef]
  134. Jia, H.; Lin, Y.; Luo, Q.; Li, Y.; Miao, H. Multi-Objective Optimization of Urban Road Intersection Signal Timing Based on Particle Swarm Optimization Algorithm. Adv. Mech. Eng. 2019, 11, 1687814019842498. [Google Scholar] [CrossRef] [Green Version]
  135. Zhao, H.; He, R.; Su, J. Multi-Objective Optimization of Traffic Signal Timing Using Non-Dominated Sorting Artificial Bee Colony Algorithm for Unsaturated Intersections. Arch. Transp. 2018, 46. [Google Scholar] [CrossRef] [Green Version]
  136. Park, B.; Lee, J. Optimization of Coordinated–Actuated Traffic Signal System: Stochastic Optimization Method Based on Shuffled Frog-Leaping Algorithm. Transp. Res. Rec. 2009, 2128, 76–85. [Google Scholar] [CrossRef]
  137. Hu, T.-Y.; Chen, L.-W. Traffic Signal Optimization with Greedy Randomized Tabu Search Algorithm. J. Transp. Eng. 2012, 138, 1040–1050. [Google Scholar] [CrossRef]
  138. Shi, W.; Yu, C.; Ma, W.; Wang, L.; Nie, L. Simultaneous Optimization of Passive Transit Priority Signals and Lane Allocation. KSCE J. Civ. Eng. 2020, 24, 624–634. [Google Scholar] [CrossRef]
  139. Gao, K.; Zhang, Y.; Sadollah, A.; Lentzakis, A.; Su, R. Jaya, Harmony Search and Water Cycle Algorithms for Solving Large-Scale Real-Life Urban Traffic Light Scheduling Problem. Swarm Evol. Comput. 2017, 37, 58–72. [Google Scholar] [CrossRef]
  140. Thaher, T.; Abdalhaq, B.; Awad, A.; Hawash, A. Whale Optimization Algorithm for Traffic Signal Scheduling Problem. In Intelligent Computing Paradigm and Cutting-edge Technologies, Proceedings of the International Conference on Information, Communication and Computing Technology, Istanbul, Turkey, 30–31 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 167–176. [Google Scholar]
  141. Srivastava, S.; Sahana, S.K. Nested Hybrid Evolutionary Model for Traffic Signal Optimization. Appl. Intell. 2017, 46, 113–123. [Google Scholar] [CrossRef]
  142. Li, Z.; Schonfeld, P. Hybrid Simulated Annealing and Genetic Algorithm for Optimizing Arterial Signal Timings under Oversaturated Traffic Conditions. J. Adv. Transp. 2015, 49, 153–170. [Google Scholar] [CrossRef]
  143. Zeng, K.; Gong, Y.J.; Zhang, J. Real-Time Traffic Signal Control with Dynamic Evolutionary Computation. In Proceedings of the 2014 IIAI 3rd International Conference on Advanced Applied Informatics, Kokura, Japan, 31 August 2014–4 September 2014; pp. 493–498. [Google Scholar]
  144. Vogel, A.; Oremović, I.; Šimić, R.; Ivanjko, E. Improving Traffic Light Control by Means of Fuzzy Logic. In Proceedings of the 2018 International Symposium ELMAR, Zadar, Croatia, 16–19 September 2018; pp. 51–56. [Google Scholar]
  145. Wang, M.; Wu, X.; Tian, H.; Lin, J.; He, M.; Ding, L. Efficiency and Reliability Analysis of Self-Adaptive Two-Stage Fuzzy Control System in Complex Traffic Environment. J. Adv. Transp. 2022, 2022, 6007485. [Google Scholar] [CrossRef]
  146. Jiang, T.; Wang, Z.; Chen, F. Urban Traffic Signals Timing at Four-Phase Signalized Intersection Based on Optimized Two-Stage Fuzzy Control Scheme. Math. Probl. Eng. 2021, 2021, 6693562. [Google Scholar] [CrossRef]
  147. Zaatouri, K.; Ezzedine, T. A Self-Adaptive Traffic Light Control System Based on YOLO. In Proceedings of the 2018 International Conference on Internet of Things, Embedded Systems and Communications (IINTEC), Hamammet, Tunisia, 20–21 December 2018; pp. 16–19. [Google Scholar]
  148. Al-qaness, M.A.; Abbasi, A.A.; Fan, H.; Ibrahim, R.A.; Alsamhi, S.H.; Hawbani, A. An Improved YOLO-Based Road Traffic Monitoring System. Computing 2021, 103, 211–230. [Google Scholar] [CrossRef]
  149. Dave, P.; Chandarana, A.; Goel, P.; Ganatra, A. An Amalgamation of YOLOv4 and XGBoost for Next-Gen Smart Traffic Management System. PeerJ Comput. Sci. 2021, 7, e586. [Google Scholar] [CrossRef]
  150. Sowmya, B. Adaptive Traffic Management System Using CNN (YOLO). IJRASET 2021, 9, 3726–3732. [Google Scholar] [CrossRef]
  151. Gaonkar, N.U. Road Traffic Analysis Using Computer Vision. IJRASET 2021, 9, 2002–2006. [Google Scholar] [CrossRef]
  152. Liu, S.; Wu, G.; Barth, M. A Complete State Transition-Based Traffic Signal Control Using Deep Reinforcement Learning. In Proceedings of the 2022 IEEE Conference on Technologies for Sustainability (SusTech), Corona, CA, USA, 21–23 April 2022; pp. 100–107. [Google Scholar]
  153. Kumar, N.; Mittal, S.; Garg, V.; Kumar, N. Deep Reinforcement Learning-Based Traffic Light Scheduling Framework for SDN-Enabled Smart Transportation System. IEEE Trans. Intell. Transp. Syst. 2021, 23, 2411–2421. [Google Scholar] [CrossRef]
  154. Anirudh, R.; Krishnan, M.; Kekuda, A. Intelligent Traffic Control System Using Deep Reinforcement Learning. In Proceedings of the 2022 International Conference on Innovative Trends in Information Technology (ICITIIT), Kottayam, India, 12–13 February 2022; pp. 1–8. [Google Scholar]
  155. Guo, J.; Cheng, L.; Wang, S. CoTV: Cooperative Control for Traffic Light Signals and Connected Autonomous Vehicles Using Deep Reinforcement Learning. arXiv 2022, arXiv:2201.13143. [Google Scholar]
  156. Chen, C.; Wei, H.; Xu, N.; Zheng, G.; Yang, M.; Xiong, Y.; Xu, K.; Li, Z. Toward a Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI’20), New York, NY, USA, 7–12 February 2020. [Google Scholar]
  157. Chu, T.; Wang, J.; Codecà, L.; Li, Z. Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1086–1095. [Google Scholar] [CrossRef] [Green Version]
  158. Wang, Y.; Xu, T.; Niu, X.; Tan, C.; Chen, E.; Xiong, H. STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control. IEEE Trans. Mob. Comput. 2020, 21, 2228–2242. [Google Scholar] [CrossRef]
  159. Mittal, U.; Chawla, P. Neuro–Fuzzy Based Adaptive Traffic Light Management System. In Proceedings of the 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 4–5 June 2020; pp. 396–402. [Google Scholar]
  160. Dong, C.; Yang, K.; Guo, J.; Chen, X.; Dong, H.; Bai, Y. Analysis and Control of Intelligent Traffic Signal System Based on Adaptive Fuzzy Neural Network. In Proceedings of the 2019 5th International Conference on Transportation Information and Safety (ICTIS), Liverpool, UK, 14–17 July 2019; pp. 1352–1357. [Google Scholar]
  161. Mir, A.; Hassan, A. Fuzzy Inference Rule Based Neural Traffic Light Controller. In Proceedings of the 2018 IEEE International Conference on Mechatronics and Automation (ICMA), Changchun, China, 5–8 August 2018; pp. 816–820. [Google Scholar]
  162. Dampage, S.U.; Munasingha, T.D.; Gunathilake, W.D.K.; Weerasundara, A.G.; Udugahapattuwa, D.P.D. Adaptive & Coordinated Traffic Signal System. In Proceedings of the 2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON), Greater Noida, India, 2–4 October 2020; pp. 63–68. [Google Scholar]
  163. Bouktif, S.; Cheniki, A.; Ouni, A. Traffic Signal Control Using Hybrid Action Space Deep Reinforcement Learning. Sensors 2021, 21, 2302. [Google Scholar] [CrossRef] [PubMed]
  164. Xue, Y.; Feng, R.; Cui, S.; Yu, B. Traffic Status Evolution Trend Prediction Based on Congestion Propagation Effects under Rainy Weather. J. Adv. Transp. 2020, 2020, 8850123. [Google Scholar] [CrossRef]
  165. Chen, Y.; Lv, Y.; Li, Z.; Wang, F.-Y. Long Short-Term Memory Model for Traffic Congestion Prediction with Online Open Data. In Proceedings of the 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Rio de Janeiro, Brazil, 1–4 November 2016; pp. 132–137. [Google Scholar]
  166. Zhang, D.; Kabuka, M.R. Combining Weather Condition Data to Predict Traffic Flow: A GRU-Based Deep Learning Approach. IET Intell. Transp. Syst. 2018, 12, 578–585. [Google Scholar] [CrossRef]
  167. Essien, A.; Petrounias, I.; Sampaio, P.; Sampaio, S. A Deep-Learning Model for Urban Traffic Flow Prediction with Traffic Events Mined from Twitter. World Wide Web 2020, 24, 1345–1368. [Google Scholar] [CrossRef] [Green Version]
  168. Fathi, M.; Haghi Kashani, M.; Jameii, S.M.; Mahdipour, E. Big Data Analytics in Weather Forecasting: A Systematic Review. Arch. Comput. Methods Eng. 2022, 29, 1247–1275. [Google Scholar] [CrossRef]
  169. Klinjun, N.; Kelly, M.; Praditsathaporn, C.; Petsirasan, R. Identification of Factors Affecting Road Traffic Injuries Incidence and Severity in Southern Thailand Based on Accident Investigation Reports. Sustainability 2021, 13, 12467. [Google Scholar] [CrossRef]
Figure 1. Intelligent-traffic-management-system-based components.
Figure 2. Taxonomy of the article.
Table 1. Challenges in ITMS development.
Challenges | Description
Vehicle shadow | The vehicle blocks the ambient light (sunlight and skylight), projecting a shadow beneath it. The shadow adds extraneous pixels to the necessary information, making it difficult to extract information about moving vehicles and to locate and recognize them.
Vehicle occlusion | Mapping a three-dimensional traffic scene onto a two-dimensional image at acquisition time loses visual information about the vehicles, which causes occlusion. As a result, accurate analysis of a complex traffic scene may be challenging.
Resolution change | The pixel size of a moving vehicle varies as the camera captures it in real time, making it challenging to analyze the vehicle correctly.
Camera coordination | Multi-camera coordination exploits a traffic scene to enhance the output image quality, typically by combining features from multiple cameras.
Illumination changes, wind, and weather changes | Three factors that challenge ITMS development are shifts in lighting conditions (twilight, night, day, and sunny), wind (which shakes the camera), and changes in weather (rain, snow, and fog).
Intra-class variations of vehicles | Properly analyzing a vehicle is very challenging because of the many variations among vehicles, including length, width, size, and color.
Table 2. Metaheuristics-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[130] | Heuristics (single-objective optimization) | Complete enumeration approach
  • Operational performance: average delay
  • Computation requirements: time per control action
  • Simulator: micro-simulation
  • These heuristic solution methods achieve the same result while saving up to 98% of the processing time compared to the complete enumeration approach.
[131] | Novel dynamic multi-objective optimization method with a traditional genetic algorithm | —
  • Performance metrics: maximizing system throughput, minimizing vehicle delay, and avoiding spillbacks.
  • Tool: cell transmission model simulation
  • This research involved the examination of three networks with varying levels of complexity. The findings revealed that dynamic control can be effectively employed in various scenarios to attain optimal traffic performance.
[132] | Evolutionary algorithm and ML | —
  • Simulator: microscopic multi-agent transport simulator (MatSim)
  • Performance metrics: travel time, emissions, and fuel consumption
  • The study intends to enhance traffic flow by coordinating a large number of traffic lights throughout a large area of the city. The algorithm’s parameters were evaluated using the sequential model-based algorithm configuration (SMAC) method.
[133] | Genetic algorithm and ANNs | Real-time genetic-algorithm-based control with advanced transit signal priority logic; real-time genetic-algorithm-based control without transit signal priority; actuated signal control with and without standard transit signal priority; and fixed-time control with and without standard transit signal priority
  • Simulator: VISSIM
  • A real-time traffic control algorithm, referred to as D-SPORT (dynamic signal priority optimization), has been developed with the aim of minimizing transit vehicle delays and increasing schedule adherence. According to simulation results, the D-SPORT signal control system reduces traffic delays and stops by 5–90% (varies with congestion and control type) in most scenarios.
[134] | Improved particle swarm optimization algorithm for multi-objective signal optimization | Genetic algorithm direct search toolbox (GADST) and non-dominated sorting genetic algorithm II (NSGA-II)
  • Performance metrics: per capita delay, vehicle emissions, and intersection capacity
  • Environment: real-world intersection
  • Their proposed method provides more diverse and uniform Pareto solutions than NSGA-II and GADST and computes faster on the same hardware. It is a realistic and successful strategy for optimizing signal delays at urban intersections.
[135] | Non-dominated sorting artificial bee colony algorithm | Weighted combination methods, Webster timing, and non-dominated sorting genetic algorithm II
  • Performance metrics: vehicle delay and stops
  • The non-dominated sorting algorithm for artificial bee colonies has a higher chance of convergence than the other methods tested. To test how well the proposed method works, a typical intersection in the city of Lanzhou has been chosen.
[136] | Stochastic optimization method based on shuffled frog-leaping algorithm | —
  • The modified stochastic optimization method, based on the shuffled frog-leaping algorithm, improved network travel times by 3.5% during the middle of the day and by 2.1% during the afternoon peak. These findings come from a case study of an arterial network with 16 signalized junctions.
[137] | Greedy randomized tabu search | Genetic algorithm
  • Numerical analysis in two networks—a test network and a real city network
  • Two main processes are considered: (1) search direction and (2) performance evaluation
  • The results of the comparison between the greedy randomized tabu search algorithm and the genetic algorithm showed that trip times could be reduced by over 25% for medium and high demand levels.
[138] | Simulated annealing algorithm | —
  • The simulated annealing approach solved a mixed-integer nonlinear programming formulation. The sensitivity analysis shows that the recommended approach may provide suboptimal solutions for a range of vehicle demand, bus demand, and left-turn ratio combinations.
[139] | Modified Jaya and water cycle algorithms with a feature-based search strategy | Existing traffic control systems
  • Environment: real traffic data of Singapore for evaluation.
  • Instance: 11 cases of traffic networks
  • The achieved optimization results demonstrated that the applied metaheuristics are superior to the current traffic control system. The improvements ranged from over 26% to 28% in terms of the lowest and highest total delay durations, respectively. The study also shows that the WCA algorithm outperformed the HS and Jaya algorithms in terms of statistical optimization results for large-scale urban traffic light scheduling problems. The HS and Jaya algorithms were more effective for smaller scenarios. The WCA also required less computational time than the HS and Jaya algorithms.
[140] | Whale optimization algorithm | Genetic algorithm
  • Three case studies showed that the whale optimization algorithm outperforms the genetic algorithm in terms of estimated average travel time.
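The metaheuristic approaches in Table 2 share a common pattern: a population-based search over green-time splits that minimizes a delay objective subject to cycle-length and minimum-green constraints. As an illustration only (not the method of any cited paper), the sketch below runs a minimal genetic algorithm over a four-phase cycle; the cycle length, minimum green, demand shares, and delay surrogate are all hypothetical values chosen for demonstration.

```python
import random

CYCLE = 90           # total cycle length in seconds (hypothetical)
N_PHASES = 4
MIN_GREEN = 10       # minimum green per phase in seconds (hypothetical)
ARRIVALS = [0.30, 0.15, 0.40, 0.15]   # assumed per-phase demand shares

def delay(greens):
    """Toy delay surrogate: demand share divided by green-time share."""
    return sum(a / (g / CYCLE) for a, g in zip(ARRIVALS, greens))

def random_plan():
    """Random green split that respects the minimum green and sums to CYCLE."""
    free = CYCLE - N_PHASES * MIN_GREEN
    cuts = sorted(random.uniform(0, free) for _ in range(N_PHASES - 1))
    parts = [b - a for a, b in zip([0.0] + cuts, cuts + [free])]
    return [MIN_GREEN + p for p in parts]

def crossover(p1, p2):
    """Averaging two feasible plans preserves both constraints."""
    return [(a + b) / 2 for a, b in zip(p1, p2)]

def mutate(plan):
    """Move some green time from one phase to another (stays feasible)."""
    plan = plan[:]
    i, j = random.sample(range(N_PHASES), 2)
    shift = random.uniform(0, plan[i] - MIN_GREEN)
    plan[i] -= shift
    plan[j] += shift
    return plan

def ga(pop_size=30, generations=80):
    random.seed(1)
    pop = [random_plan() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=delay)                    # rank by fitness
        elite = pop[: pop_size // 2]           # keep the better half
        pop = elite + [mutate(crossover(*random.sample(elite, 2)))
                       for _ in elite]
    return min(pop, key=delay)

best = ga()
print([round(g, 1) for g in best], round(delay(best), 3))
```

In practice, the objective would come from a micro-simulator (e.g., VISSIM or SUMO, as the cited papers do) rather than a closed-form surrogate, and the same skeleton accommodates the other metaheuristics in the table by swapping the variation operators.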
Table 3. Hybrid metaheuristics-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[141] | Hybrid ant colony optimization and genetic algorithm methods | Conventional ant colony optimization and genetic algorithm approaches
  • The implementation was carried out in two stages, the first with only Layer 1, and the second with a combination of Layers 1 and 2.
  • Numerical experiments show that the hybrid model outperforms ant colony optimization and genetic algorithms in terms of wait time for different test cases.
[142] | Hybrid simulated annealing and genetic algorithm | Conventional simulated annealing and genetic algorithm approaches
  • Performance comparison: CPU time vs. objective function value.
  • The method for decoding signal timing is based on the NEMA phase structure. Results from experimentation showed that combining simulated annealing and genetic algorithms improved performance compared to using each method alone, in terms of both solution quality and convergence speed.
[143] | Collaborative evolutionary-swarm optimization | Particle swarm optimization method
  • Simulator: simulator of urban mobility (SUMO)
  • The experiments showed that the collaborative evolutionary-swarm optimization algorithm is better than the particle-swarm optimization approach in terms of average delay time.
Table 4. Fuzzy-logic-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[144] | Adaptive traffic light controller based on fuzzy logic | Fixed signal program
  • Their proposed fuzzy control system has two parts: one for the primary driveway, where there are a lot of vehicles, and one for the secondary driveway, where there are not as many vehicles.
  • Traffic parameters: average queue length, average maximum queue length, average number of vehicle stops.
  • Microscopic simulator: VISSIM
  • The proposed fuzzy control system is compared with a fixed signal program in three traffic situations. The results show how well the decision rules perform.
[145] | Self-adaptive, two-stage fuzzy controller | Traditional control method
  • Simulation platform utilizing VISSIM and the Python language.
  • The system consists of two stages: (1) selecting the red phase with the highest traffic urgency as the next green phase, and (2) deciding whether to extend or end the current signal phase.
  • Results show an improvement in efficiency compared to traditional control, with a 14.59% decrease in average delay time per vehicle and a 0.7% decrease in the average number of stops. The controller effectively increases the capacity of the intersection and works well in medium traffic density and fluctuating flow conditions.
[146] | Two-stage FL controller | Traditional fuzzy controller, fixed-time controller, and fuzzy controller without flow prediction
  • The simulation used information from a single intersection on Huagang Road in Nanjing, Jiangsu Province, China.
  • MATLAB is used for conducting simulations.
  • The proposed method significantly reduced vehicle delays.
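The fuzzy controllers in Table 4 all follow the same core mechanism: fuzzify crisp traffic measurements, fire a small rule base, and defuzzify into a green-time decision. The following sketch shows that mechanism with a two-input, single-output controller deciding a green extension; the membership ranges, rule base, and output values are hypothetical and not taken from any cited paper.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_green_extension(queue, arrivals):
    """Green extension (s) from queue length (veh) and arrival rate (veh/interval)."""
    # Fuzzification with hypothetical membership ranges
    q_low, q_high = tri(queue, -1, 0, 10), tri(queue, 5, 15, 100)
    a_low, a_high = tri(arrivals, -1, 0, 6), tri(arrivals, 3, 8, 100)
    # Rule base: consequents are short (5 s), medium (10 s), or long (20 s) extensions
    rules = [
        (min(q_low, a_low), 5.0),
        (min(q_low, a_high), 10.0),
        (min(q_high, a_low), 10.0),
        (min(q_high, a_high), 20.0),
    ]
    # Weighted-average defuzzification
    num = sum(w * v for w, v in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0

print(fuzzy_green_extension(2, 1))    # light traffic: short extension
print(fuzzy_green_extension(20, 10))  # heavy traffic: long extension
```

The two-stage controllers of [145,146] layer a phase-selection step on top of a controller of this shape, but the fuzzify/rule/defuzzify cycle is the same.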
Table 5. Convolutional neural network (CNN)-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[147] | Deep CNNs called YOLO | —
  • Parameters: queue length and waiting time per vehicle.
  • YOLO, a real-time object recognition system based on deep CNNs, is used to optimize traffic signals so that as many vehicles as possible pass safely with the least amount of waiting time.
[148] | Combination of a neural network, image-based tracking, and YOLOv3 | Traditional method
  • Estimations on daytime video, winter video, and night video based on detections in each frame, classification of vehicles, vehicles counted, and intersection over union.
  • Their proposed model, combining a neural network, image-based tracking, and YOLOv3, is a cost-effective and hardware-efficient alternative to the previous model.
[149] | An amalgamation of YOLOv4 and XGBoost | —
  • Analysis of crossroads in Vadodara City
  • In this study, four regression models are compared: elastic net, support vector machine regression (SVR), random forest regression, and extreme gradient boosting tree-based (XGBoost GBT).
  • The combined YOLOv4 and XGBoost algorithms balance inference time and accuracy to give the most accurate results, working in any weather and under low street lighting, day and night.
[150] | Video-based counting technique using YOLO | Inductive loops (impractical method)
  • Pygame was used to build the simulation from the ground up.
  • Their proposed approach simplifies the system and provides early, highly accurate detection of traffic congestion. It performs excellently on Indian roads and offers cost, time, and infrastructure savings compared to the costly and unrealistic inductive-loop method.
[151] | YOLO and simple online and real-time tracking (SORT) algorithm | —
  • The study demonstrates a video-based vehicle counting method used on a highway captured by a CCTV camera. The approach involves detecting vehicles using YOLO and tracking them using the SORT algorithm.
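In the CNN-based systems of Table 5, the detector (YOLO) produces per-lane vehicle counts, and a downstream step converts those counts into signal timings. The detector itself is out of scope here, but the allocation step can be sketched as below; the function name, bounds, and budget are illustrative assumptions, not the scheme of any cited paper.

```python
def green_times(lane_counts, min_green=10, max_green=60, cycle_budget=120):
    """Split a green-time budget (s) across lanes in proportion to detected counts."""
    total = sum(lane_counts) or 1          # avoid division by zero on empty roads
    raw = [cycle_budget * c / total for c in lane_counts]
    # Clamp to safety bounds; note clamping can shift the total away from the budget
    return [max(min_green, min(max_green, round(t))) for t in raw]

print(green_times([12, 3, 20, 5]))   # busier lanes receive proportionally more green
```

A lane with no detected vehicles still receives the minimum green, mirroring the safety floor that adaptive controllers typically enforce.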
Table 6. Reinforcement-learning-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[152] | Deep reinforcement learning-based traffic signal control method | Fixed-time and actuated traffic signal control
  • The positions and speeds of vehicles, obtained from either V2I, roadside sensing, or drone-based surveillance, are analyzed by a convolutional neural network (CNN). In this study, the processed information is then used as inputs in the reinforcement learning (RL) system.
  • A dual-ring mechanism has been introduced to allow for flexible traffic signal control through a complete state transition process.
  • Simulator: SUMO (simulation of urban mobility)
  • The study found that the deep reinforcement learning technique has the potential to reduce average wait times by 34.7% and decrease pollutant emissions by 18.5%.
[153] | SDDRL (deep reinforcement learning + software-defined networking) | Deep Q network; fuzzy-inference-based dynamic traffic light control (fixed traffic light control system and a novel fuzzy model); max-pressure-based dynamic traffic light control (max-pressure algorithm); and fixed-time-based dynamic traffic light control (fixed-time algorithm)
  • The results of the comparison show that the proposed solution improves a number of performance metrics such as average waiting time, throughput, average queue length, and average speed by a range of 28.34% to 66.62%, 24.76% to 66.60%, 30.89% to 69.80%, and 16.62% to 43.67%, respectively, over other methods that are considered to be state-of-the-art.
[154] | Distributional reinforcement learning with quantile regression (QR-DQN) algorithm | Static signaling, longest queue first, and n-step SARSA
  • A new control strategy is put in place that gives different weights to the risk of a decision depending on how busy the system is. This makes the network less crowded.
  • Performance metrics: queue length, vehicle waiting time, and journey time loss
  • They applied the recently developed deep reinforcement learning method to the problem of managing traffic and showed that it worked much better than more traditional ways of controlling traffic lights.
[155] | A multi-agent deep reinforcement learning system called CoTV | Flow connected autonomous vehicles, PressLight, baseline
  • Simulation: SUMO
  • Performance matrix: travel time and delay, environmental indicators, and traffic safety
  • CoTV has been evaluated using grid maps and realistic urban areas. CoTV may save 28% on fuel and CO2 emissions and 30% on travel time compared to the baseline. The CoTV system effectively reduces travel time, lowers fuel consumption and CO2 emissions, and increases time-to-collision.
[156] | MPLight as a typical deep Q-network agent | MaxPressure, FixedTime, graph reinforcement learning, graph convolutional neural network, PressLight, NeighborRL, FRAP
  • In a real-world scenario with 2510 traffic signals in Manhattan, New York City, MPLight performed better in terms of travel time and throughput.
  • Performance metrics: travel time and throughput, along with four configurations
  • This study evaluates the performance of various reinforcement learning (RL)-based methods in the context of a Manhattan network, both with and without the presence of “pressure”.
[157] | Multi-agent advantage actor critic | Greedy, independent advantage actor critic, independent Q-learning (reinforcement learning), independent Q-learning (deep neural networks)
  • MARL-based ATSC is evaluated in two SUMO-simulated traffic environments.
  • The comparison is conducted on both a synthetic traffic grid and a real-world traffic network in Monaco City during simulated peak-hour traffic conditions.
  • Performance metrics: reward, avg. queue length [veh], avg. intersection delay [s/veh], avg. vehicle speed [m/s], trip completion flow [veh/s], and trip delay [s]
  • The results show that the proposed multi-agent A2C method is optimal, robust, and efficient in comparison to other state-of-the-art decentralized Multi-Agent Reinforcement Learning (MARL) algorithms.
[158] | A spatio-temporal multi-agent reinforcement learning approach | Max-Plus, neighbor reinforcement learning, graph convolutional neural-lane, graph convolutional neural-inter, CoLight, MaxPressure
  • Statistics of the real-world traffic datasets: arrival rate (vehicles/300 s) and time range
  • Experiments on synthetic and real-world data show that the spatio-temporal multi-agent reinforcement learning approach improves multi-intersection traffic signal control compared to existing methods.
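All of the RL approaches in Table 6 cast signal control as the same loop: observe a traffic state, pick a phase, receive a reward that penalizes queues, and update a value estimate. As a much-simplified stand-in for the deep RL methods above, the sketch below runs tabular Q-learning on a toy two-phase intersection; the environment dynamics (discharge rate, arrivals, queue cap) and hyperparameters are hypothetical.

```python
import random

random.seed(0)
ACTIONS = (0, 1)   # which of the two phases gets green this step
Q = {}             # state (queue_a, queue_b) -> [value of action 0, value of action 1]

def step(queues, action):
    """Toy environment: the green phase discharges; both approaches get arrivals."""
    q = list(queues)
    q[action] = max(0, q[action] - 3)                 # saturation flow of the green phase
    for i in (0, 1):
        q[i] = min(9, q[i] + random.choice((0, 1)))   # capped random arrivals
    return tuple(q), -sum(q)                          # reward penalizes total queue

def greedy(state):
    qa = Q.get(state, [0.0, 0.0])
    return 0 if qa[0] >= qa[1] else 1

alpha, gamma = 0.1, 0.9
state = (0, 0)
for t in range(20000):
    eps = max(0.05, 1.0 - t / 10000)                  # decaying exploration
    a = random.choice(ACTIONS) if random.random() < eps else greedy(state)
    nxt, r = step(state, a)
    q_s = Q.setdefault(state, [0.0, 0.0])
    q_n = Q.setdefault(nxt, [0.0, 0.0])
    q_s[a] += alpha * (r + gamma * max(q_n) - q_s[a])  # Q-learning update
    state = nxt

def rollout(policy, steps=2000, seed=42):
    """Average total queue per step under a fixed policy."""
    random.seed(seed)
    s, total = (0, 0), 0
    for _ in range(steps):
        s, r = step(s, policy(s))
        total += -r
    return total / steps

print(round(rollout(greedy), 2))          # learned policy
print(round(rollout(lambda s: 0), 2))     # always serve phase 0, for contrast
```

The deep methods in the table replace the Q table with a neural network (and, in the multi-agent case, coordinate many such learners), but the update rule and reward shaping follow this pattern.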
Table 7. Hybrid-based traffic signal control system approach along with paper highlights.
Ref | Proposed Method That Is Used in Paper | Compared Method | Highlights
[159] | Combined neural networks and FL systems | Fuzzy inference system and fixed timer-based system
  • Parameters: inflow rate, number of waiting vehicles at current lane, number of waiting vehicles at adjacent lane, priority vehicle present (flag), lane on which priority vehicle is present, fixed timer system output (in seconds), fuzzy system output (in seconds), ANFIS output (in seconds), lane to be served by fixed timer system, lane to be served by ANFIS.
  • MATLAB was used for simulation.
  • Their proposed technique reduced vehicle wait times at the intersection and improved traffic flow.
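The fuzzy stage of such neuro-fuzzy signal controllers can be sketched as a small Sugeno-style inference: fuzzify the queue length with membership functions, then take a weighted average of the rule outputs to get a green time. The membership breakpoints and green times below are illustrative, not the paper’s tuned values.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_green_time(queue_len):
    """Map queue length (vehicles) to green time (seconds) by a
    weighted average of rule outputs (Sugeno-style defuzzification)."""
    rules = [
        (tri(queue_len, -1, 0, 8), 10.0),    # short queue  -> short green
        (tri(queue_len, 4, 10, 16), 25.0),   # medium queue -> medium green
        (tri(queue_len, 12, 20, 40), 40.0),  # long queue   -> long green
    ]
    num = sum(w * g for w, g in rules)
    den = sum(w for w, _ in rules) or 1.0
    return num / den
```

In an ANFIS, the membership parameters and rule outputs would be learned by the neural network rather than hand-set as here.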
[160] | Adaptive fuzzy neural network | Traditional fixed-timing signal
  • Parameters: transmission range; the proportion of vehicles turning left, going straight, and turning right; the proportion of small, medium, and oversize vehicles; the weight of vehicles; the length of vehicles; the shortest green light time; the longest green light time; vehicle safety distance; the maximum speed; the maximum acceleration
  • Performance metrics: average number of stops, average delay time, average queue length, and average fuel consumption
  • According to the study, adaptive control reduced the average delay time by 8.45% and fuel consumption by 24.0%.
[161] | Combined neural networks and FL systems | –
  • Environment: MATLAB simulations.
  • The trained neural traffic controller was tested with a data set that included arrival and queue indexes.
  • Their proposed neural traffic light controller is capable of managing congestion far better than a conventional traffic light control system.
[162] | YOLOv3-tiny, OpenCV, and deep Q network-based coordinated system | Static traffic light system
  • Their proposed system, which is both adaptive and coordinated in nature, aims to reduce traffic congestion by increasing the mean vehicular speed.
  • Compared to a traditional traffic light system across multiple intersections, the average speed increases by 18%.
[163] | Customized parameterized deep Q-network (P-DQN) architecture | Fixed-time; discrete approach; continuous approach
  • Performance metrics: average travel time (ATT), queue length (QL), and average waiting time of vehicles (AWT)
  • The study addresses the problem with a hybrid reinforcement learning variant, a multi-pass hybrid parameterized deep Q-network tailored to jointly control the TSC phase and timing, resulting in a 22.20% decrease in average queue length and a 5.78% decrease in travel time.
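The hybrid action space a P-DQN works with pairs a discrete choice (which phase) with a continuous parameter (how long its green lasts). A minimal sketch of that selection step, in which the tiny linear “networks” are deliberate placeholders for the paper’s learned parameter and Q-networks and all numbers are illustrative:

```python
import random

PHASES = ["NS_through", "EW_through", "NS_left", "EW_left"]

def duration_param(phase_idx, state):
    """Actor-like branch: proposes a green duration (s) for each phase.
    Toy linear rule standing in for a learned parameter network."""
    return 10.0 + 2.0 * state[phase_idx]

def q_value(phase_idx, duration, state):
    """Critic-like branch: scores (state, phase, duration) triples.
    Toy stand-in for the learned Q-network."""
    return state[phase_idx] * duration

def select_action(state, epsilon=0.0):
    """P-DQN-style selection: compute each phase's continuous parameter,
    then pick the phase with the highest Q (epsilon-greedy exploration)."""
    if random.random() < epsilon:
        i = random.randrange(len(PHASES))
    else:
        i = max(range(len(PHASES)),
                key=lambda k: q_value(k, duration_param(k, state), state))
    return PHASES[i], duration_param(i, state)

# Per-phase queue lengths; the busiest movement wins, with a longer green.
phase, green = select_action([3.0, 9.0, 1.0, 0.0])
```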
Table 8. Features of simulators.
Software/Feature | Macro/Micro/Meso | Pedestrians and Vehicle Types | Very Large Networks | GUI | Simulation Output | Creating Network | Documentation and UI | OS Portability | Open Source and Free Use
Visum | Macro | No | No | 3D | Tools | Graphical editor | Yes | No | Yes
Vissim | Micro | Yes | Yes | 3D | Tools | Graphical editor | Yes | No | Yes
Sumo | Micro | Yes | Yes | 2D | Files | XML files | Yes | Yes | Yes
Saturn | Macro | No | No | 2D | Files | OD matrix | Yes | Yes | No
Transmodeller | Meso | Yes | Yes | Both | Real-time tools | Limited flexibility | Yes | No | No
Mitsimlab | Micro | No | No | 2D | Files | Manual | Yes | Yes | Yes
Aimsun | Meso | Yes | Yes | 3D | Real-time tools | Graphical editor | Yes | No | No
Corsim | Micro | Yes | Yes | 2D | Real statistical tools | No | Yes | No | No
Dracula | Micro | Yes | No | 2D | Files | Graphical | Yes | No | No
Mainsim | Micro | Yes | Yes | 2D | Files | OD matrix | Yes | Yes | Yes
Matsim | Micro | Yes | No | 2D | Files | Manual XML file | Yes | Yes | Yes
Q-Paramics | Micro | Yes | Yes | 3D | Tools | Wizard | Yes | No | No
Transims | Micro | No | Yes | 2D | Files | Manual | Limited | No | Yes
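As an example of the “Creating Network: XML files” entry for SUMO in Table 8, a minimal simulation configuration ties an XML network and route definition together. The file names below are hypothetical; the network and routes would be produced with SUMO’s own tools (e.g., netconvert/netedit).

```xml
<!-- Minimal SUMO .sumocfg sketch; net.net.xml and routes.rou.xml are
     hypothetical files generated beforehand with SUMO's network tools. -->
<configuration>
    <input>
        <net-file value="net.net.xml"/>
        <route-files value="routes.rou.xml"/>
    </input>
    <time>
        <begin value="0"/>
        <end value="3600"/>
    </time>
</configuration>
```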