Comparative Traffic Flow Prediction of a Heuristic ANN Model and a Hybrid ANN-PSO Model in the Traffic Flow Modelling of Vehicles at a Four-Way Signalized Road Intersection

Abstract: The accurate and effective prediction of the traffic flow of vehicles plays a significant role in the construction and planning of signalized road intersections. The application of artificially intelligent predictive models in the prediction of the performance of traffic flow has yielded positive results. However, much uncertainty still exists in the determination of which artificial intelligence methods effectively resolve traffic congestion issues, especially from the perspective of the traffic flow of vehicles at a four-way signalized road intersection. A hybrid algorithm, an artificial neural network trained by particle swarm optimization (ANN-PSO), and a heuristic artificial neural network (ANN) model were compared in the prediction of the traffic flow of vehicles using the South African transportation system as a case study. Two hundred and fifty-nine (259) traffic datasets were obtained from the South African road network using inductive loop detectors, video cameras, and GPS-controlled equipment. For the ANN-PSO model, 219 traffic datasets were used for training and 40 for testing, while the ANN model used 160 for training, 40 for testing, and 59 for validation. The ANN result presented a logistic sigmoid transfer function with a 13–6–1 model and a testing R² of 0.99169, compared to the ANN-PSO result, which showed a testing R² of 0.99710. This result shows that the ANN-PSO model is more efficient and effective than the ANN model in the prediction of the traffic flow of vehicles at a four-way signalized road intersection. Furthermore, the ANN and ANN-PSO models are robust enough to predict traffic flow due to their better testing performance. The modelling approaches proposed in this study will assist transportation engineers and urban planners in designing a traffic control system for traffic lights at four-way signalized road intersections. Finally, the results of this research will assist transportation engineers and traffic controllers in providing traffic flow information and travel guidance for motorists and pedestrians in the optimization of their travel time decision-making.


Introduction
In developed and developing countries, traffic congestion at signalized road intersections has become a central issue. Efficient and effective traffic flow prediction in road transportation is one of the most fundamental characteristics of smart cities and intelligent transportation systems [1]. It is important to both transportation researchers and pedestrians [1]. Having up-to-date traffic flow information on traffic congestion on freeways and knowing the traffic volume of vehicles at road intersections in advance plays an important role in assisting transportation and civil engineers in developing and

Motivation and Contribution of the Research
The primary reason for using a hybrid ANN-PSO in this study is that PSO is an optimization technique that converges rapidly to optimal performance [12]. This characteristic is desirable when evaluating different traffic conditions (traffic flow, traffic density, and vehicular speed). In addition, a canonical PSO algorithm is easy to use and requires very few adjustment parameters. The main objective of this study was to carry out a comprehensive analysis of traffic flow prediction performance by comparing a heuristic ANN model with a hybrid ANN-PSO model. Another objective was to examine the emerging role of soft computing techniques (ANN and ANN-PSO) in the context of traffic flow modelling at a signalized road intersection. This study provides an opportunity to advance the understanding of the application of an artificial neural network (ANN) model and an artificial neural network trained by particle swarm optimization (ANN-PSO) in the modelling of the traffic flow of vehicles at a four-way signalized road intersection. A further contribution is that it will assist developed and developing countries in advancing their traffic management techniques to curb traffic congestion at road intersections.

Organization of the Research
This paper has been divided into five parts. The first part deals with the overview and significance of the study. The second part begins by laying out the theoretical dimensions of the research and looks at related studies on ANN and ANN-PSO. The third part is concerned with the methodology used in this research study and a detailed explanation of the traffic control delay at a four-way signalized road intersection. The fourth part presents the findings of the research. The final part draws upon the entire paper, tying up the various theoretical and empirical findings to contribute to the field of transportation, and areas for further research are identified.

Related Studies
In the last few years, transportation researchers have carried out considerable research on the occurrence of traffic congestion in road transportation and the prediction of traffic flow on various road networks. However, few researchers have conducted structured research into traffic flow prediction at signalized road intersections using hybrid and heuristic predictive models. D'Andrea and Marcelloni created an expert system for detecting traffic congestion on various road networks using traffic data comprising past and current vehicular speeds [13]. Research related to [13] was proposed by [14], in which a "scalable" method was used to predict the traffic congestion of vehicles in a grid framework. Anwar and co-workers applied a spectral clustering-based method to monitor traffic congestion [15]. Considering the traffic flow density and different types of roads, Liang and co-workers developed a novel prediction model capable of estimating the next-time-step traffic volume of a single road segment, using traffic flow variables such as the current inflow, outflow, and traffic volume to clarify traffic congestion [16].
However, the research carried out by Xiangjie and co-workers improved the model of [16] by using a support vector machine (SVM) to predict the next-time-step traffic speed and traffic volume, and used it to estimate the traffic congestion of road segments [17]. Researchers such as [18] proposed a specialized density-based spatial clustering of applications with noise (DBSCAN) algorithm. This was developed for the detection and analysis of consistently congested clusters of grids. They also investigated a deep-learning-based prediction model using a restricted Boltzmann machine and a recurrent neural network to predict the traffic flow on congested roads [18]. A practical traffic flow parameter prediction model was created for estimating traffic flow conditions, in which an autoregressive model was combined with other predictive models [17,19]. In their research, [20] developed a model combining artificial neural networks and root mean squared error. Both were used as a metric by applying singular point probabilities. Traffic congestion has become a global problem, and transportation researchers are racing against time to improve the effectiveness of intelligent transportation systems. Some researchers have achieved good results in traffic flow prediction. Traffic flow prediction techniques are categorized into:
• Traditional statistical techniques.
• Traditional machine learning techniques.
• Deep learning methods.
Traditional statistical techniques comprise the historical average (HA) method and the Autoregressive Integrated Moving Average (ARIMA) model [2]. Several variants combine ARIMA with other models, such as ARIMA time series models (KARIMA) [21] and the Seasonal Autoregressive Integrated Moving Average (SARIMA) [22]. However, the major disadvantage of this type of model is its limited capacity to process non-linear and challenging traffic flow data [23].
Sustainability 2021, 13, 10704
Compared with the above statistical models, traditional machine learning techniques can efficiently model complex non-linear traffic data. Typical examples are SVM [4,24,25] and SVR [5,26]. These models map low-dimensional non-linear data to a high-dimensional space using kernel functions to evaluate traffic data characteristics for prediction. However, the selection of the kernel function is a primary determinant of predictive performance. In addition to Bayesian models [7], K-nearest neighbours [8,27,28] and artificial neural networks (ANN) [29] have been applied to the prediction of traffic flow. The significant drawback of traditional machine learning is its reliance on engineering and the experience of experts [30]. Moreover, it is difficult to improve the efficiency of these traditional methods when processing and evaluating complex and highly non-linear data [3,31]. Currently, deep learning techniques in transportation have yielded good results, especially in image processing and natural language processing [32].
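To make the kernel idea concrete, the following minimal Python sketch shows that a degree-2 polynomial kernel evaluated in the original two-dimensional space gives the same value as a dot product in an explicit three-dimensional feature space. The kernel choice, the function names, and the two input vectors are illustrative assumptions and are not taken from the cited studies.

```python
import math

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel for 2-D inputs: k(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def poly2_features(x):
    """Explicit 3-D feature map whose dot product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Two made-up, normalised observations (for example, scaled speed and density).
a = (0.4, 1.2)
b = (0.9, 0.3)

kernel_value = poly2_kernel(a, b)                             # computed in 2-D
explicit_value = dot(poly2_features(a), poly2_features(b))    # computed in 3-D
```

Both values agree up to floating-point rounding, which is why a kernel-based model never has to construct the high-dimensional features explicitly.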
Nowadays, transportation researchers are applying deep learning methods to traffic data mining using temporal and spatial correlation. Previous research by [6,33] applied Deep Belief Networks (DBN) and Stacked Autoencoder Models (SAEs) to extend and deepen the network layers for learning the features in traffic flow data. Researchers such as [34] then combined traffic flow and weather information to enhance the predictive performance of the DBN model. Models such as Long Short-Term Memory (LSTM) [35–38], Gated Recurrent Unit Network (GRU) [39], and Nonlinear Autoregressive with External Input (NARX) [36] were applied to the temporal correlation of traffic flow data to improve traffic flow prediction. However, these predictive models failed to consider the spatial relationships in the structure of the traffic network. Convolutional Neural Networks (CNNs) [40–42] have made significant headway in the field of vision, and transportation researchers went further by applying CNNs to traffic flow prediction to capture local spatial characteristics. Hence, [43] suggested Deep Spatio-Temporal Residual Networks (STResNet) to predict the flow of people in a transportation system. Few recent surveys offer comprehensive literature reviews of traffic flow prediction in specific contexts of road transportation, especially the traffic flow of vehicles at road intersections. For example, [44] investigated the techniques and applications from the past decade and explained in detail the ten challenges and issues experienced by pedestrians and motorists in terms of traffic flow. The investigations carried out by [44] were aimed more at short-term traffic flow prediction, and the literature reviewed depended primarily on conventional methods of traffic flow prediction.
Another piece of research by [45] focused on the prediction of short-term traffic flow by summarizing the methods applied in the prediction of traffic flow. They also made some cogent suggestions for future research.
Furthermore, research carried out by [46] explained in detail how to acquire traffic data and focused on conventional machine learning techniques. In addition, [47] indicated the contributions and research frameworks of traffic flow prediction. The research carried out by [48–51] summarized the applicable models that depend on conventional techniques and some early deep learning techniques. Alexander et al. [52] presented a comprehensive survey of deep neural networks for predicting the traffic flow of vehicles. Their research discussed three well-known deep neural architectures: convolutional, recurrent, and feed-forward neural networks. However, some recent technological innovations involving graph-based deep learning were not discussed in their research [52]. Likewise, researchers such as [53] presented a detailed survey of graph-based deep learning architectures, including their applications in the field of traffic flow. Furthermore, [54] presented a survey on applying deep learning models to the evaluation and analysis of traffic flow data. However, their research considered only the prediction of traffic flow and did not address other areas of road transportation. In general, there is other research on the prediction of traffic flow in road transportation that possesses standard features. It is always advantageous to consider all of the areas of traffic flow. Therefore, there is still insufficient research that contributes to traffic flow prediction, especially when it comes to traffic flow prediction using heuristics and nature-inspired algorithms.
Comparing different model specifications shows that testing results are significant in supporting the usefulness of a proposed prediction model. For example, [3] investigated the usefulness and effectiveness of recent comparative research based on short-term traffic flow forecasting. They stated that not all model comparisons are efficient, especially when comparing a complex non-linear model and a simple linear model. In addition, there exists an almost non-existent difference between the accuracy, simplicity, and suitability of a model (Occam's razor). In their research, [55] recommend that, although model accuracy is very significant, it should not be the only yardstick for determining the appropriate methodology for the prediction of the traffic flow of vehicles. Other challenges, such as the time and effort required to develop the model, the techniques and expertise involved, and the transferability and suitability of the model to changes in the temporal behaviour of traffic flow, should also be considered [55–57].
Even though choosing the "best" model in a group of baseline models using testing and comparison is significant, there is a need for a practical option to select a heuristic or metaheuristic approach to combine traffic flow predictions. The combination of predictive models may not likely result in a single well-specified model. A well-known case is the forecasting of complex traffic datasets. Different researchers in traffic flow forecasting have carried out this approach of combining predictive models; [58] carried out research in which they offered statistical guidelines for traffic flow by dynamically shifting between different models. The only disadvantage of their research is that they did not provide combined forecasts of traffic flow. Furthermore, [59] researched the combination of traffic flow forecasts from two neural networks by applying the Bayesian rule. In their research, [60] investigated the combination of traffic flow predictions from various types of predictive models, while [61] applied a fuzzy logic model to combine traffic flow forecasts. The research of [62] was based on combining forecasts from three models by applying neural networks.

Traffic Flow Patterns at a Signalized Road Intersection
This subsection describes the use of a time-space diagram (Figure 1) to explain the traffic flow patterns at a four-way signalized road intersection.
When drivers arrive at a signalized road intersection, the driver's response to traffic lights is important in understanding the traffic flow patterns at a road intersection, i.e., the response of drivers when the traffic lights turn red, the beginning of the traffic signal interval when the traffic lights turn green, and the queue of the vehicles clearing from the road intersection without any traffic control delays. This process continues back and forth from traffic lights turning to red, then to green, and back to yellow, then to red again. These are the basic concepts behind the traffic flow of vehicles at signalized road intersections. To explain these concepts efficiently, we are going to use a time-space diagram. Some assumptions were made trying to explain these time-space diagrams. These assumption diagrams can be found in the book written by [63].

Assumption 1
Let us assume that three vehicles are traveling at a uniform speed and are approaching a signalized road intersection. The "space" between the vehicles and the road intersection is shown on the y-axis, while the time is on the x-axis. The three circles display the traffic lights. These traffic lights can be either green, yellow, or red, depending on real-time traffic flow.

Assumption 2
These three vehicles have been traveling at a uniform speed. These vehicles' trajectories are parallel and linear. The traffic lights turn red as these vehicles reach the road intersection.

Assumption 3
As the traffic lights turned red, the three vehicles approaching the intersections had to stop, and their speed dropped drastically. Two incoming vehicles meet the three vehicles at the road intersection, making it five vehicles in a queue at the intersection. Deceleration has occurred, and the vehicular speed is zero. In Assumption 3, as the speed of the vehicle drops due to the traffic lights turning red, the duration of time spent at the road intersection increases.

Assumption 4
As the traffic lights turn green, the vehicles already waiting in a queue at the road intersection start accelerating and driving into the intersection.

Assumption 5
The vehicles arriving at the road intersection after the queue has cleared will not be delayed, as the traffic lights are still green.

Assumption 6
This is when vehicles arrive at the road intersection as the traffic lights turn yellow. Their speed gradually reduces as they drive towards the road intersection, as the traffic lights can turn red at any time.

Assumption 7
Now that the traffic lights have turned red, the incoming vehicles must stop, adhere to this traffic control delay, and form a new queue.

Assumption 8
1. This is called the "traffic shockwave" of the queue of vehicles forming at a road intersection when the traffic lights turn red.
2. This is a traffic shockwave of vehicles when the traffic lights turn green.
3. This is the traffic control delay for each vehicle at the intersection, that is, the time between a vehicle's arrival at the road intersection and its departure from it.
4. This is the time between two successive vehicles departing from the road intersection. It is called the "saturation headway".
5. This is the speed of the vehicles as they arrive at and depart from the road intersection.
6. This is called the time gap. It usually occurs between the departing vehicle and the arriving vehicle.
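As a rough illustration of two of the quantities listed above, the Python sketch below computes the per-vehicle delay (departure time minus arrival time) and the headway between successive departures for a hypothetical five-vehicle queue. The arrival and departure times and the function names are invented for the example and are not measurements from this study.

```python
def control_delays(arrivals, departures):
    """Per-vehicle delay: departure time minus arrival time (seconds)."""
    return [dep - arr for arr, dep in zip(arrivals, departures)]

def departure_headways(departures):
    """Time between successive departures from the stop line (seconds)."""
    return [later - earlier for earlier, later in zip(departures, departures[1:])]

# Hypothetical queue: vehicles arrive during red and discharge after green starts at t = 30 s.
arrivals = [0.0, 3.0, 7.0, 12.0, 20.0]
departures = [32.0, 34.0, 36.0, 38.0, 40.0]

delays = control_delays(arrivals, departures)      # one value per vehicle
headways = departure_headways(departures)          # one value per consecutive pair
```

In this made-up queue the first vehicle waits longest, and the constant spacing between departures plays the role of the saturation headway.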

Assumption 9
The driver responses at signalized road intersections are shown in the Assumption 9 diagram.
1. The driver stops because the traffic light is red.
2. The driver drives through the intersection when the traffic light is green.
3. The driver drives through the intersection when the queue has cleared and no vehicles are waiting at the road intersection.
4. The driver reduces their speed because the traffic light has turned yellow.
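The signal cycle and the driver responses described in the assumptions can be summarised in a small Python sketch. It follows Assumption 6 in treating yellow as the phase in which drivers reduce their speed. The phase durations, function names, and response labels are assumptions chosen for illustration only.

```python
def signal_phase(t, green=30.0, yellow=4.0, red=26.0):
    """Return the phase shown at time t (seconds) for a repeating green-yellow-red cycle."""
    position = t % (green + yellow + red)
    if position < green:
        return "green"
    if position < green + yellow:
        return "yellow"
    return "red"

def driver_response(phase, queue_ahead):
    """Map a signal phase and the presence of a queue to a driver response."""
    if phase == "red":
        return "stop and join the queue"
    if phase == "yellow":
        return "reduce speed"
    if queue_ahead:
        return "follow the discharging queue through the intersection"
    return "drive through the intersection"

# A driver arriving 40 s into the assumed 60 s cycle meets a red light.
response = driver_response(signal_phase(40.0), queue_ahead=True)
```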

Methodology
The workflow of the methodologies used in this research is shown in Figure 2.

Research Design
This research was designed to determine the ways in which issues of traffic congestion can be addressed from the perspective of the traffic flow at four-way signalized road intersections. This was achieved by designing a classical and generic traffic flow model for signalized road intersections, considering the selected metropolitan section of Gauteng province in South Africa. This research focused mainly on the qualitative and the quantitative techniques of addressing traffic congestion issues.

Population of the Research
The population of the study is Mikros Traffic Monitoring (MTM) Company, one of the very few companies known for traffic monitoring solutions and traffic safety. MTM is a subsidiary of the Syntell group of companies and works in conjunction with the South Africa Ministry of Transportation and South Africa National Roads Agency Limited (SANRAL).

Size of Traffic Data
The size of the dataset considered in this study is limited to 259 traffic datasets obtained from MTM, focusing on a four-way signalized road intersection within the investigation period.

Method of Traffic Data Collection
The traffic data were collected using primary and secondary techniques. The primary technique comprised the collection of traffic flow data from South African four-way signalized road intersections using inductive loop detectors, video cameras, and road-wide stationed GPS-controlled equipment. The secondary technique involved direct visits to the Mikros Traffic Monitoring (MTM) Company and interaction with the strategic and operational staff of MTM to obtain information on traffic flow situations at various intersections.

Sample and Sampling Methods
Sampling is defined as the selection of a subset of samples from a statistical population to evaluate the traffic dataset. A sample of two hundred and fifty-nine (259) datasets was selected to represent the entirety of the traffic data for vehicles manoeuvring at a four-way signalized road intersection in the Gauteng province within the investigation period. The traffic engineers in the South Africa Ministry of Transportation carried out data cleaning on the traffic datasets to remove any duplicated or unwanted traffic data.
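The partition sizes reported in the abstract (160 training, 40 testing, and 59 validation records for the ANN, and 219 training and 40 testing records for the ANN-PSO) can be reproduced with a short Python sketch. Whether the original study shuffled the records at random is not stated, so the seeded shuffle, the function name, and the placeholder record identifiers below are assumptions.

```python
import random

def split_records(records, sizes, seed=0):
    """Shuffle a copy of the records and cut it into consecutive parts of the given sizes."""
    if sum(sizes) != len(records):
        raise ValueError("partition sizes must add up to the number of records")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)     # assumed random assignment
    parts, start = [], 0
    for size in sizes:
        parts.append(shuffled[start:start + size])
        start += size
    return parts

records = list(range(259))                    # placeholder identifiers for the 259 datasets

ann_train, ann_test, ann_val = split_records(records, [160, 40, 59])
pso_train, pso_test = split_records(records, [219, 40])
```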

Location of the Study
The input and output variables used for the ANN and ANN-PSO networks are shown in Table 1. This choice of input and output variables was based on the approach used by [64,65]. The preparation of the traffic dataset was followed by the structuring of the architecture of the algorithms. MATLAB user interface tools and command-line functionality were used for the development, training, and testing of the ANN and ANN-PSO models. The 259 traffic datasets used in this traffic prediction study were obtained from the N1 Allandale Interchange (Figure 3). This N1 interchange during the traffic flow peak period accommodates more than 90,000 automatic and manually driven vehicles traveling southbound and northbound and over 72,000 vehicles moving northbound every day. The N1 Allandale Interchange is part of a South African Government (national) road network that links Johannesburg through Pretoria, Bloemfontein, Polokwane, Cape Town, and Beit Bridge. The traffic flow variables used for the development of the ANN and ANN-PSO models are listed in Table 1 and Figure 4.

Traffic Control Delay at Four-Way Signalized Road Intersections
To determine the control delay of vehicles at a road intersection, queuing theory is used. Let us assume that we have an accumulation of vehicles at a road intersection waiting for the traffic lights to turn green. Plotting the accumulation of these vehicles against the time delay at the road intersection forms a triangle-like shape, so the mathematical expression for the area of a triangle can be used. Figure 5 shows a sketch of four-way signalized road intersections. The following notation is used:
$\theta_{ijkl}$ = the traffic signal offset at road intersections i, j, k, and l.
$T$ = the traffic flow analysis period duration (hours).
$EF_{ijkl}$ = the adjustment factor of the road intersections i, j, k, and l (hours).
$C_{ijkl}$ = the traffic flow capacity of road intersections i, j, k, and l (vehicle/hour).
$C$ = the traffic cycle length (seconds).
$g_{ijkl}$ = the green time length at road intersections i, j, k, and l (seconds).
$X_{ijkl}$ = the degree of saturation of road intersections i, j, k, and l (dimensionless).
$d_{1,ijkl}$ = the uniform delay of intersections i, j, k, and l (seconds/vehicle).
$d_{2,ijkl}$ = the incremental delay of intersections i, j, k, and l (seconds/vehicle).
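The triangle argument at the start of this subsection can be illustrated with a short Python sketch. It assumes a deterministic (D/D/1) queue with a constant arrival rate and a constant saturation (discharge) rate, which is a standard textbook simplification and not necessarily the exact formulation used in this study. The function name, parameter names, and numerical values are hypothetical.

```python
def uniform_delay(arrival_rate, saturation_rate, red_time, cycle_length):
    """Average delay per vehicle (s) from the queue triangle of a deterministic approach.

    Rates are in vehicles per second; times are in seconds.
    """
    if arrival_rate >= saturation_rate:
        raise ValueError("queue never clears: arrival rate must be below saturation rate")
    # Time after the start of green needed to discharge the queue built up during red.
    clearance_time = arrival_rate * red_time / (saturation_rate - arrival_rate)
    if red_time + clearance_time > cycle_length:
        raise ValueError("oversaturated cycle: queue does not clear before the next red")
    max_queue = arrival_rate * red_time          # height of the triangle (vehicles)
    base = red_time + clearance_time             # base of the triangle (seconds)
    total_delay = 0.5 * base * max_queue         # area of the triangle (vehicle-seconds)
    vehicles_per_cycle = arrival_rate * cycle_length
    return total_delay / vehicles_per_cycle

# Assumed values: 720 veh/h arriving, 1800 veh/h saturation flow, 30 s red in a 60 s cycle.
average_delay = uniform_delay(arrival_rate=0.2, saturation_rate=0.5,
                              red_time=30.0, cycle_length=60.0)
```

With these assumed values the queue peaks at six vehicles, clears about 20 s into the green, and the average delay works out to 12.5 s per vehicle.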
The control delay $d(C)_{ijkl}$ is calculated as:
where,
The average delay at each road intersection is $d(\theta)_{ijkl}$, which is given by Equation (4).
$C_{ijkl}$ = the cycle lengths of intersections i, j, k, and l.
$g^{f}_{ijkl}$ = the green time lengths for intersections i, j, k, and l.
$L_{ijkl}$ = the lost time of intersections i, j, k, and l.
$\theta_{ijkl}$ = the traffic flow conditions required for the calculation of the traffic signal offset of intersections i, j, k, and l.
The travel time $t_{ijkl}$ of each vehicle on the road intersections (i, j, k, and l) is calculated as:
The free-flow travel time $t_{ijkl}(0)$ of each vehicle on the intersections (i, j, k, and l) is calculated as:
Equation (6) is consistent with the basic relation for traffic flow at road intersections, i.e., speed = distance/time. Making time the subject of the formula gives Equation (7):
$$t_{ijkl} = \frac{I_{ijkl}}{V_{ijkl}} \quad (7)$$
In Equation (7), time is $t_{ijkl}$, distance is $I_{ijkl}$, and speed is $V_{ijkl}$, where $I_{ijkl}$ is the distance already covered by the vehicles before arriving at the road intersections (i, j, k, and l). The total delay $d_{ijkl}$ of the vehicles at the road intersections is the sum of the control delay $d(C)_{ijkl}$ at each intersection, the average delay $d(\theta)_{ijkl}$, and the travel time $t_{ijkl}$ of vehicles at each road intersection.
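As a small worked example of the two relations just described (time = distance/speed, and total delay as the sum of control delay, average delay, and travel time), consider the Python sketch below. The function names, the units, and all numerical values are assumptions made for illustration.

```python
def travel_time(distance_m, speed_m_per_s):
    """Travel time (s) from speed = distance / time, rearranged for time."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    return distance_m / speed_m_per_s

def total_delay(control_delay_s, average_delay_s, travel_time_s):
    """Total delay (s) as the sum of the three components named in the text."""
    return control_delay_s + average_delay_s + travel_time_s

# Assumed values: a 300 m approach driven at 15 m/s, with 12.5 s control and 4 s average delay.
t = travel_time(distance_m=300.0, speed_m_per_s=15.0)
d = total_delay(control_delay_s=12.5, average_delay_s=4.0, travel_time_s=t)
```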
The objective of controlling real-time traffic flow at each road intersection in this study is to minimize the total travel time (TTT) at each road intersection.
To minimize TTT:
Through the combination of Equations (8) and (9), we get Equation (10):
The relationship between the cycle length $C_{ijkl}$, the green time length $g^{f}_{ijkl}$, and the lost time $L_{ijkl}$ at each road intersection is explained by the constraints in (10) above. The condition required for the calculation of the signal offset $\theta_{ijkl}$ on the network is described by the constraint below (where $n_h$ is the loop multiplication):
The constraint in Equation (13) defines the interval of the feasible cycle length values at each road intersection.
$$C_{\min} \le C_{ijkl} \le C_{\max} \quad (13)$$
The constraint in Equation (14) defines the interval of the feasible green time length values at each road intersection i, j, k, and l.
The constraints in Equation (15) define the interval of the feasible traffic signal offset values at each road intersection.
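The role of these constraints can be sketched as a simple feasibility check in Python. Only the cycle length bound of Equation (13) is taken directly from the text. The green time bounds, the offset range (between zero and the cycle length), and the balance "sum of green times plus lost time equals cycle length" are assumed forms, and the class, field, and parameter names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SignalPlan:
    cycle: float        # cycle length (s)
    greens: tuple       # green time of each phase (s)
    lost_time: float    # total lost time per cycle (s)
    offset: float       # signal offset relative to a reference intersection (s)

def is_feasible(plan, c_min, c_max, g_min, g_max):
    """Check a signal plan against assumed cycle, green, balance, and offset constraints."""
    cycle_ok = c_min <= plan.cycle <= c_max                       # Equation (13)
    green_ok = all(g_min <= g <= g_max for g in plan.greens)      # assumed bound form
    balance_ok = abs(sum(plan.greens) + plan.lost_time - plan.cycle) < 1e-9  # assumed relation
    offset_ok = 0.0 <= plan.offset < plan.cycle                   # assumed offset range
    return cycle_ok and green_ok and balance_ok and offset_ok

plan = SignalPlan(cycle=90.0, greens=(40.0, 38.0), lost_time=12.0, offset=15.0)
feasible = is_feasible(plan, c_min=60.0, c_max=120.0, g_min=10.0, g_max=60.0)
```

An optimizer that minimizes TTT would only evaluate plans for which such a check returns true.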

Development of the ANN-PSO Model
Kennedy and Eberhart created the PSO method in the mid-1990s [66]. This evolutionary algorithm benefits from a rapid rate of convergence when compared to other evolutionary algorithms, and it is a continuous process [67]. Therefore, it has been applied successfully in many engineering applications [68–70]. In this technique, a cost function to be minimized or maximized is defined. Then, a swarm of particles is created and distributed in the D-dimensional space of the problem. Each particle comprises the problem variables, making it easier to calculate the cost (fitness) function. Finally, the velocity and position of each particle are updated according to Equation (17) until the PSO algorithm converges.
Furthermore, $C_1$ is a cognitive variable representing the degree of the local search, while $C_2$ is a social variable for the global search. In addition, $r_1^a$ and $r_2^a$ are two independent random variables uniformly distributed between zero and one, and $w$ is the inertia weight, which preserves part of the previous velocity of the particles during optimization. $\Delta t$ represents the time interval over which the position and velocity are updated; it is usually set to 1. Training an artificial neural network is a minimization problem, which can be solved by applying a conventional or metaheuristic algorithm. In a hybrid artificial neural network trained by particle swarm optimization, the PSO minimizes the error during ANN training by finding the optimum weights and biases of the ANN-PSO model [71]. Therefore, in this research, the problem variables are the weights and biases, and the feasible space of the problem depends on the interval within which these variables change. The cost (fitness) function of the ith particle can be expressed in terms of the root mean squared error [72]:
where:
$E$ = the cost (fitness) value.
$T_{kl}$ = the target value.
$P_{kl}$ = the predicted output, which depends on the weights $w_i$ and biases $b_i$.
$S$ = the number of training data.
$O$ = the number of neurons.
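For orientation, the canonical inertia-weight form of the PSO update and a root mean squared error cost can be written as follows. This is the common textbook form and may differ in detail from Equations (17) and (18) of this study. The symbols $p_i^{a}$ (best position found so far by particle $i$), $g^{a}$ (best position found so far by the swarm), and the normalisation by $S \cdot O$ are assumptions.

```latex
\begin{align}
v_i^{a+1} &= w\,v_i^{a}
           + C_1\,r_1^{a}\,\frac{p_i^{a} - x_i^{a}}{\Delta t}
           + C_2\,r_2^{a}\,\frac{g^{a} - x_i^{a}}{\Delta t} \\
x_i^{a+1} &= x_i^{a} + v_i^{a+1}\,\Delta t \\
E &= \sqrt{\frac{1}{S\,O}\sum_{k=1}^{S}\sum_{l=1}^{O}\left(T_{kl} - P_{kl}\right)^{2}}
\end{align}
```

With $\Delta t = 1$, the first two lines reduce to the familiar update in which the new velocity is simply added to the current position.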
To develop an ANN-PSO model, the following steps must be followed:

1. Choose the number of hidden neurons in the hidden layers and develop a neural network model using the initial weights and biases.
2. Rearrange the weights and biases so that they represent the location of a particle in the D-dimensional space of the problem, where D is the number of weights and biases.
3. At each iteration, for each particle, predict the output values and compute the value of the cost function in Equation (18).
4. Update the location of the particles in the PSO algorithm for the chosen population size and number of iterations until the target output is reached. In summary, the cost function is minimized.
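The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names `make_cost` and `pso_minimize`, the inertia weight of 0.7, the initial position range [-1, 1], the fixed iteration budget used as the stopping rule, and a time step of 1 are all choices made for this sketch. The default acceleration factors are the best values reported in this study.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_cost(inputs, targets, n_hidden):
    """Steps 1-2: fix the architecture and map it to a D-dimensional particle.

    Each hidden neuron owns n_in input weights, one bias, and one output
    weight; the last entry of the particle is the output bias.
    """
    n_in = len(inputs[0])
    dim = n_hidden * (n_in + 2) + 1

    def cost(p):
        # Step 3: predict every sample and return the RMSE for this particle.
        total = 0.0
        for x, t in zip(inputs, targets):
            out = p[-1]
            for h in range(n_hidden):
                base = h * (n_in + 2)
                z = p[base + n_in] + sum(p[base + i] * x[i] for i in range(n_in))
                out += p[base + n_in + 1] * sigmoid(z)
            err = t - out
            total += err * err
        return math.sqrt(total / len(inputs))

    return cost, dim


def pso_minimize(cost, dim, swarm_size=20, iterations=100,
                 c1=1.5, c2=2.0, w=0.7, seed=0):
    """Step 4: move the swarm until the iteration budget is used up."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [cost(p) for p in pos]
    g = min(range(swarm_size), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

A call such as `cost, dim = make_cost(inputs, targets, 5)` followed by `pso_minimize(cost, dim, swarm_size=400)` would mirror the best configuration reported in this study, where `inputs` is a list of 13-value rows and `targets` is the matching list of traffic volumes.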
The ANN-PSO used in this research study was developed in the MATLAB environment, with different artificial neural network architectures. The number of input neurons is 13, equal to the number of independent variables. The number of output neurons is one, matching the number of dependent variables. The number of hidden neurons varied between five (5) and ten (10). In the parametric study, we took into consideration the acceleration factors ($C_1$ and $C_2$), the swarm population size, and the number of hidden neurons. The acceleration factors were selected randomly between 1 and 3, and the swarm population size was chosen from the options of 10, 20, 50, 100, 200, and 400. Networks with 5, 6, 7, 8, 9, and 10 hidden neurons were considered in order to find the best-performing ANN-PSO model. The ANN-PSO model training is terminated only when the stopping condition on the objective function has been met. The following benchmark was adhered to:

2. The training run will be terminated if the objective function does not reach a specified fixed value.
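The parametric study described above can be laid out as a list of candidate configurations. The Python sketch below only enumerates the options named in the text. How the options were actually combined is not stated, so the full cross-product, the single random draw of $C_1$ and $C_2$ per configuration, and the name `build_configs` are assumptions.

```python
import itertools
import random

SWARM_SIZES = [10, 20, 50, 100, 200, 400]   # options listed in the text
HIDDEN_NEURONS = [5, 6, 7, 8, 9, 10]        # options listed in the text


def build_configs(seed=0):
    """Return one candidate configuration per (neurons, swarm size) pair."""
    rng = random.Random(seed)
    configs = []
    for n_hidden, swarm in itertools.product(HIDDEN_NEURONS, SWARM_SIZES):
        configs.append({
            "hidden_neurons": n_hidden,
            "swarm_size": swarm,
            # Acceleration factors drawn at random between 1 and 3.
            "c1": rng.uniform(1.0, 3.0),
            "c2": rng.uniform(1.0, 3.0),
        })
    return configs
```

Each configuration would then be trained, and the training time and the training and testing performance recorded for comparison.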
The number of neurons, the swarm population size, the acceleration factors $C_1$ and $C_2$, and the time taken to train the network for each number of neurons were taken into consideration. The MATLAB codes used to develop the ANN-PSO model have been deposited in a GitHub repository: https://github.com/Olayode1989/ANN-PSO-codes.git (accessed on 1 September 2021). The ANN-PSO training and testing were carried out in the MATLAB environment by following the steps in Figure 6.


Development of the ANN Model
ANN models are intelligent models inspired by the biological neural networks of human beings and animals. They learn patterns and deliver highly accurate predictions for problems in high-dimensional spaces [73][74][75][76]. An artificial neural network model can map the association between the inputs and outputs even if the datasets are complex or noisy. A Multilayer Perceptron (MLP) is a relatively simple yet effective feed-forward neural network model. An MLP neural network comprises an input layer, hidden layers (depending on the neural network model), and an output layer [77,78]. The input layer receives the input parameters and transfers them to the neurons in the hidden layer. The weighted values of these inputs, combined with a bias value, are transformed by an activation function, as illustrated in Figure 7. Thereafter, the output signal is passed to the neurons in the next layer. The mathematical formulation of Figure 7 is:

$$y_j = f\left(\sum_{i=1}^{n} w_{ij}\,x_i + b_j\right)$$

where $x_i$ and $y_j$ are the nodal values in the preceding layer, $i$, and the present layer, $j$; $n$ is the total number of nodal values from the preceding layer; $w_{ij}$ and $b_j$ are the weights and biases of the ANN; and $f$ is the activation function.

The artificial neural network needs to be trained before it can deliver good regression performance. ANN model training means that the weights and biases of the network are adjusted to minimize the error between the actual values and the network outputs. The training process is therefore a gradual minimization problem. Backpropagation (BP) algorithms are primarily applied for neural network training [79,80]. The Levenberg-Marquardt Algorithm (LMA) is usually identified as the fastest and most reliable training algorithm [81,82]; thus, it was the backpropagation algorithm applied in this research study.
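The node computation described above (a weighted sum of the preceding layer's values plus a bias, passed through an activation function) can be written in a few lines of Python. This is an illustrative sketch; the names `logsig` and `layer_forward` and the row-per-node weight layout are assumptions, and the logistic sigmoid is used because it is the transfer function reported for the best ANN model.

```python
import math


def logsig(z):
    """Logistic sigmoid transfer function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def layer_forward(x, weights, biases, activation):
    """Compute y_j = f(sum_i w_ij * x_i + b_j) for every node j of a layer.

    x        : nodal values of the preceding layer (length n)
    weights  : one row of n weights per node j of the present layer
    biases   : one bias per node j
    """
    return [
        activation(sum(w_ij * x_i for w_ij, x_i in zip(w_j, x)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]


# One node with two inputs: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
y = layer_forward([1.0, 2.0], [[0.5, -0.25]], [0.0], logsig)
```

Since the weighted sum in this small example is zero, the sigmoid returns 0.5 for the single node.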
When an artificial neural network is adequately trained in the MATLAB environment, it functions as a black-box model that captures the relationship between the inputs and outputs of a complex dataset (irrespective of the number of variables). An ANN comprises mathematical processing units known as neurons, which sit inside this black box and are connected to one another through weights and biases. The network is organized into three types of layers, namely the input layer, the hidden layers, and the output layer. The neurons are placed in the hidden and output layers, while the input layer does not contain neurons. In recent years, Artificial Neural Networks (ANN) have become a significant option in modelling due to their reduced computational time, high accuracy, and capability to capture the relationships between inputs and outputs from the data. The application of ANNs is not limited to approximation; it also includes classification, clustering, forecasting, pattern recognition, and image processing. There are different types of ANN depending on their architecture and model variables. The most widely used ANN is the backpropagation feedforward neural network, also known as a Multi-layer Perceptron (MLP), as shown in Figure 8.

After the neural network toolbox is opened in MATLAB, the training is conducted using the input traffic flow datasets and the output traffic datasets (traffic volume). The inputs are arranged in thirteen columns, and the output traffic datasets (traffic volume) are in the same Microsoft Excel sheet as the input datasets (Table 1 shows the traffic data inputs and output). The training on the traffic datasets was carried out to determine the optimal weights and biases of the ANN model for the traffic flow variables.
When the ANN model training has been carried out on the datasets, the neural network's performance is validated using independent data. The ANN model training and testing are regarded as optimal when the fitness measures, such as the $R^2$, have values close to one.
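The performance indices used in this study can be computed as in the Python sketch below. It is given for illustration only. The text reports both R and $R^2$ values without stating the exact formulas, so the definitions used here (the coefficient of determination for `r_squared` and the Pearson correlation for `pearson_r`) are assumptions, as are the function names.

```python
import math


def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

A perfect prediction gives an MSE of zero and an $R^2$ of one, which is why values close to one are read as a sign of a well-trained model.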

Artificial Neural Network Model
In this research, an artificially intelligent model called an Artificial Neural Network (ANN) model was used to model the traffic flow at a four-way signalized road intersection, using the road transportation systems in South Africa as a case study. The traffic flow parameters, measured over a certain period of time, include the number of vehicles on the road, traffic density, vehicular speed, traffic volume, and time. The road intersections were located in densely populated areas, and the data collection covered all seven days of the week, which encompasses the different conditions that could hinder or interrupt the free-flowing movement of vehicles and thereby cause traffic congestion. Two hundred and fifty-nine (259) datasets were collected from these roadsites. The traffic datasets from the four road intersections were used for the ANN and ANN-PSO training, testing, and validation. The validation and testing were carried out to verify the efficiency of the ANN and ANN-PSO models. The ANN and ANN-PSO training, validation, and testing were executed in the MATLAB 2015a environment. The following traffic flow variables from the South Africa Road Transportation Network were used for the model training, validation, and testing:

1. Network inputs: The traffic density, number of light vehicles, average speed of light vehicles, time of day of light vehicles, average speed of long trucks, time of day of long trucks, number of long trucks, average speed of medium trucks, time of day of medium trucks, number of medium trucks, number of short trucks, average speed of short trucks, and time of day of short trucks.
2. Network output: Traffic volume.

The breakdown of how the traffic datasets were divided and used for the ANN training, validation, and testing is shown in Table 4. Figure 9 shows the neural network architecture of the ANN model used to model the traffic flow at the roadsites. The best artificial neural network model has 13 inputs, six hidden neurons, and one output (13-6-1).
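The division of the 259 datasets into training (160), testing (40), and validation (59) subsets for the ANN can be illustrated with the Python sketch below. The random shuffle, the seed, and the name `split_indices` are assumptions, since the text does not state how the subsets were selected.

```python
import random


def split_indices(n_samples, sizes, seed=0):
    """Shuffle sample indices and cut them into named, disjoint subsets."""
    if sum(sizes.values()) != n_samples:
        raise ValueError("subset sizes must add up to the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    subsets = {}
    start = 0
    for name, size in sizes.items():
        subsets[name] = indices[start:start + size]
        start += size
    return subsets


# ANN split used in this study: 160 training, 40 testing, 59 validation.
ann_split = split_indices(259, {"training": 160, "testing": 40, "validation": 59})
```

The same helper would cover the ANN-PSO case by passing `{"training": 219, "testing": 40}`.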
The best validation performance of the network for the traffic datasets is shown in Figure 10. Figure 11 shows training, testing, and validation regression values of 0.96086, 0.99169, and 0.97258, respectively, and an overall regression value of 0.96722 for the traffic datasets at the four roadsites. These results show that the traffic data's inputs and outputs are well correlated.

Figure 10. The validation performance for the ANN model of the traffic datasets at the roadsites.
Figure 11. The ANN model results for the best prediction of the traffic datasets (13-6-1).

Figure 10 shows the variation of the mean square error (MSE) for the training, testing, and validation data during training. The performance plot for the roadsite traffic datasets indicates an epoch of 9 for the ANN model training and testing. The validation regression value of the ANN model was 0.97258, which indicates the efficiency of the input and output variables. Figure 11 gives an overview of the corresponding performance evaluation indices (MSE and $R^2$) for the ANN model training and testing of the traffic data. It is apparent from Figure 11 that the best training performance was achieved when the number of hidden neurons was 6, i.e., 13-6-1. The most striking observation to emerge from this ANN model is that the ANN parameters, the number of hidden neurons, and the number of epochs significantly affect the prediction performance on the traffic flow dataset. The optimum network obtained for the traffic dataset shows a training performance of $R^2$ = 0.96086 and a testing performance of $R^2$ = 0.99169, with six hidden neurons.

Artificial Neural Network-Particle Swarm Optimization Model
The breakdown of how the traffic datasets were divided and used for the ANN-PSO training and testing is shown in Table 5. The 259 traffic datasets from the four-way signalized road intersections were divided into 219 for training and 40 for testing. To achieve the best output, a trial-and-error approach was used to find the best values for the number of hidden nodes, iterations, and acceleration factors. Sigmoid and linear functions were used as the activation functions of the hidden and output nodes of the ANN-PSO model, respectively. The optimal parameters for both the training and the testing performance of the ANN-PSO model of the traffic flow at the four intersections are shown in Table 6.

To evaluate the accuracy of the ANN-PSO model, the observed and predicted traffic volumes of the vehicles at the four roadsites are compared in Figure 12b; the testing performance of the model was 0.9971. Table 6 presents the corresponding performance evaluation indices (MSE and $R^2$) for the training and testing traffic datasets. The parametric study of the ANN-PSO hybrid model shows that the best training performance was obtained when the number of hidden neurons was 5, the swarm population size was 400, and the acceleration factors $C_1$ and $C_2$ were 1.5 and 2, respectively. Table 6 also shows that the ANN-PSO's parameters affect the prediction performance on the traffic congestion dataset. In addition, the best training and testing results did not coincide. The optimum network in terms of training performance has a training R = 0.99952, and the best testing performance is $R^2$ = 0.9971, obtained with five hidden neurons.
Figure 12a shows that training on the traffic data yielded R = 0.99952 and MSE = 23.161, while the testing and validation of the results yielded $R^2$ = 0.9971 (Figure 12b).

Conclusions and Future work
The current study aimed to compare the traffic flow prediction performance of a heuristic ANN model and a hybrid ANN-PSO model in modelling vehicular traffic flow at four-way signalized road intersections. A database of 259 traffic datasets was collected from four signalized road intersections in South Africa's road transportation system. Thirteen input parameters and one output parameter were taken into consideration. Based on the artificially intelligent approaches (ANN and ANN-PSO) applied to the traffic flow data, the following conclusions can be drawn from the present study: