Using Machine Learning to Predict Pedestrian Compliance at Crosswalks in Jordan

Abstract: This study employs machine learning (ML) techniques to predict pedestrian compliance at crosswalks in urban settings in Jordan, aiming to enhance pedestrian safety and traffic management. Using data from 2437 pedestrians at signalized intersections in Amman, Irbid, and Zarqa, four models based on different ML algorithms were developed: an artificial neural network (ANN), a support vector machine (SVM), a decision tree (ID3), and a random forest (RF).


Introduction
Considering urban expansion and the continuous increase in population density in urban areas, pedestrians are emerging as an essential element of traffic movement within cities. Pedestrian behavior and its interaction with signalized intersections form an integral part of public safety and of the efficiency of the urban transportation system. Previous studies conducted in several cities worldwide suggest that a better understanding of pedestrian behavior can significantly reduce accidents and enhance traffic safety [1]. The statistics make the impact of traffic accidents on pedestrians in urban areas evident. The World Health Organization (WHO) reports that pedestrians account for 23% of road crash fatalities, compared to 30% for four-wheel vehicle occupants, 21% for users of two- and three-wheelers, 6% for cyclists, and 20% for other users [2]. This high share calls for effective preventive measures to protect pedestrians. In Jordan, especially in the three largest cities, Amman (the capital), Irbid, and Zarqa, pedestrians face similar challenges, as infrastructure and traffic behaviors contribute to increased road risks [3][4][5].
Research in this field matters not only at the local level in Jordan but also at the regional and global levels, as researchers and planners seek innovative strategies to address diverse traffic challenges and enhance safety for all road users. Using ML tools to predict pedestrian behavior can provide valuable insights and contribute to designing more effective preventive measures. By analyzing detailed data collected at the studied intersections in Amman, Irbid, and Zarqa, this research aims to fill a gap in the current literature and to improve the understanding of traffic dynamics. The growing need to understand the dynamics and variables that frame human interaction with the surrounding environment has led to a remarkable development in pedestrian behavior research in urban areas [6]. Accordingly, closely studying pedestrian behavior at traffic lights in urban areas with high pedestrian density sheds light on multiple aspects of the interaction between pedestrians and urban infrastructure and explores the extent to which pedestrians comply with the traffic laws and regulations designed to protect them. Specifically, many of these studies focus on observing and analyzing pedestrian behavior at traffic signals, as these points are among the most sensitive and dangerous areas for pedestrians in the urban environment [7]. Interest in how closely pedestrians adhere to the signals assigned to them stems from the fact that this adherence directly reflects the level of traffic safety in cities.
The remainder of this paper is organized as follows. Section 2, "Literature Review", surveys previous studies and key findings relevant to the use of ML in predicting pedestrian behaviors, focusing on crosswalk compliance within various contexts. In Section 3, "Methodology", the ML techniques and data collection methods used in this study are described. In Section 4, "Results and Discussion", the results of the predictive models are presented and discussed. Finally, in Section 5, "Conclusions", the findings are summarized, their implications for pedestrian safety and urban planning in Jordan are discussed, and directions for future research are suggested.

Literature Review
A review of previous research worldwide shows that many tools, software packages, statistical methods, and ML algorithms are available to analyze traffic-related data sets in urban areas [8][9][10][11][12][13]. These technologies have facilitated collecting and analyzing data and drawing conclusions that help in understanding the variables and factors that affect pedestrian safety and the way pedestrians think and act when crossing urban intersections. For example, among the most important programs for traffic modeling, SUMO (Simulation of Urban Mobility; https://eclipse.dev/sumo/) [14] and VISSIM (https://www.ptvgroup.com/en/products/ptv-vissim) [15] are used to simulate the urban environment and its elements, including pedestrians, infrastructure, vehicles, and traffic control devices such as traffic lights. In addition, recording videos of these urban elements and identifying their characteristics, such as the number of pedestrians and vehicles, their speeds, and their categories, can now be handled using systems such as Video Management Systems (VMS). Furthermore, spatial elements can be analyzed and visualized through geographic information systems (GIS) such as ArcGIS (https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview) [16].
Statistical software packages like R (https://www.r-project.org/) [17] and SPSS (https://www.ibm.com/products/spss-statistics) [18] are used to analyze quantitative data and find trends and patterns through statistical methods like hypothesis testing, analysis of variance (ANOVA), and linear regression. Additionally, spatial analysis tools, which are integrated within GIS or available as packages in R, are used to examine spatial and spatiotemporal data. These resources aid in a more thorough comprehension of the interactions between pedestrians and traffic signals in urban areas. Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs) have been used to analyze photos and videos in order to track pedestrian movement and classify behaviors at traffic crossings. In addition, pedestrian activity patterns have been analyzed and categorized using traditional ML methods like Principal Component Analysis (PCA), Random Forests, and Support Vector Machines (SVM). Research has also been performed on using reinforcement learning to create interactive models that can adapt to changes in the infrastructure or traffic environment while capturing how pedestrian behavior changes in response. These tools and techniques provide the basis for a complex and multidimensional study of pedestrian behavior in urban areas. They allow precise analysis of interactions between pedestrians and traffic lights and the development of effective strategies to improve traffic safety.
Many previous studies in various geographical locations have demonstrated the ability of ML models to analyze pedestrian safety conditions at urban signalized intersections. For example, Berriel et al. (2018) [19] used deep learning tools to accurately classify pedestrians at crossings using field images taken from OpenStreetMap and Google Street View in multiple regions of Brazil. Their results demonstrated an accuracy of approximately 94.12% using a convolutional neural network (CNN) with the VGG architecture, which improved further to 96.3% after refining their database. This study confirmed the great benefit of automated data collection in enhancing training and maintenance resilience in diverse urban environments. In addition, Marisamynathan and Vedagiri (2018) [20] investigated pedestrian behavior while crossing signalized intersections in India. By analyzing videos of pedestrians crossing at intersections, the researchers developed a model capable of predicting pedestrian behavior, which showed that 46% of pedestrians do not adhere to crossing rules in order to avoid waiting delays and save time. These results are essential for urban planners and traffic engineers who aim to raise the level of safety and increase pedestrian compliance with traffic rules when crossing the road near urban intersections.
In the United States, Kutela and Teng (2019; 2020) [21,22] used Bayesian network analysis to investigate the relationship between pedestrian and car behaviors at signalized crosswalks in Nevada. Their research provided insight into the crucial aspects that affect pedestrians' compliance with crossing signals and cars' willingness to yield, delivering a comprehensive understanding that can inform pedestrian infrastructure design to enhance compliance rates. In addition, Zhang et al. (2020) [23] investigated the forecasting of pedestrian crossing during red lights in Florida using a long short-term memory (LSTM) neural network. Their model exhibited a notable accuracy rate of 91.6% in forecasting pedestrian intentions, highlighting the potential of sophisticated computational models to actively improve pedestrian safety. Integrating this method into communication networks between automobiles and infrastructure has the potential to help avoid crashes by notifying drivers of possible unforeseen pedestrian crossings. These studies collectively show that ML approaches can be used to understand and enhance pedestrian compliance and safety at crosswalks. They offer a significant framework for our research in Jordan, as they may be adapted to tackle specific issues in managing pedestrian traffic.
In 2021, Noh et al. [24] investigated the relationship between vehicle and pedestrian conflicts in unregulated crossing areas in South Korea using an analytical methodology. This study focused on understanding the ability of pedestrian behavioral analysis, both qualitatively and causally, to predict the factors that may lead to conflict between pedestrians and vehicles, which paves the way for improving traffic safety in urban areas. Also, Dominguez et al. (2021) [25] and subsequent studies published by Losada et al. (2022) [26] and Fastkiotis et al. (2022) [27] aimed to raise the level of understanding of pedestrian environment and safety through ML and artificial intelligence (AI) methodologies. These studies have enhanced the ability to identify vehicles in smart transit areas and increased the predictive ability to determine the probability of crashes between vehicles and pedestrians in urban environments through virtual reality (VR) contexts. The significance of these studies lies in their increasing reliance on AI technology for the development of road and intersection safety models and their recognition of the field's crucial role in enhancing traffic safety in urban areas.
Paul and Moridpour (2023) [28] and Hossain et al. (2023) [29] made a distinctive contribution by evaluating the pedestrian level of service and showing how to identify and predict behavior patterns in pedestrian crashes. They offered essential recommendations on using ML models to improve pedestrian behavior and interaction with the surrounding environment at intersections, and on reducing nighttime pedestrian crashes, in order to make urban intersections more suitable and safer for pedestrians. In addition, Cai et al. (2024) [30] made a valuable contribution from China: their results showed high accuracy in predicting the probability of pedestrian crossings and their speed, which helps improve pedestrian movement modeling in traffic simulation software (PTV VISSIM 2020). These practical applications emphasize the importance of such tools in applying intelligent transportation systems (ITS) to enhance pedestrian safety in urban areas.
This literature review demonstrates that numerous previous studies have employed diverse methods to investigate pedestrian behavior and safety. By examining and analyzing the contributions of these studies, researchers attempted to (1) collect all relevant data and factors that could have an impact on pedestrian behavior, (2) focus the results on the complex dynamic relationships between pedestrian behavior and these factors in urban environments, and (3) suggest future research directions that illustrate the importance of these technological methods in examining these relationships and their applications to the practices of specialists in traffic management, urban planning, and traffic safety. Research has shown that multiple factors play a role in motivating pedestrians to follow traffic laws and directions, especially those related to infrastructure, whether by walking on sidewalks or crossing the road through pedestrian crossings or bridges and tunnels designated for that purpose. This study will examine the relationship between infrastructure and its design in urban areas and compliance with traffic laws and pedestrian crossing parameters. In this context, it is essential to conduct a literature review that collects and discusses the diverse results of these studies. Finally, the literature review has a key role in establishing a comprehensive understanding of the various variables that may impact pedestrian behavior in urban areas and the extent of pedestrian compliance with traffic regulations. This review can enhance current knowledge about pedestrian behaviors and provide valuable guidance to researchers, urban planners, and decision-makers.

Materials and Methods
Data were collected from nine signalized intersections located in three major governorates of Jordan: Amman (the capital), Irbid, and Zarqa. These cities were selected for their dense population and high vehicle registration rates compared to other cities in Jordan. The locations are situated in areas with significant pedestrian activity and traffic flow. Each selected intersection is a typical four-leg configuration with fixed cycle lengths and two-way traffic. These intersections have three lanes per direction and a speed limit of 60 km/h, allowing for bidirectional pedestrian flow. At each intersection, the major crosswalk with the highest pedestrian traffic was selected for detailed analysis of pedestrian crossing behavior. Table 1 shows the characteristics of the selected sites.
The video cameras were positioned to detect pedestrians as they entered and exited the crosswalk, precisely capturing their movements (Figure 1). Data on pedestrian crossing behaviors were collected during working days, from Sunday to Thursday, to maintain consistency and avoid the varied conditions typical of weekends and peak pedestrian flows. The minimum sample size was analyzed to ensure adequate population representation. A total of 2437 pedestrians were identified and recorded from the video footage. The video recordings provided comprehensive data on various aspects of pedestrian crossing behavior. Table 2 presents the variables extracted from the videos, along with their respective descriptions.

| Variables | Description |
|---|---|
| City | Amman, Irbid, and Zarqa |
| Pedestrian Gender | 1 = male, 2 = female |
| Pedestrian Age | 1 = child, 2 = adult, 3 = elderly |
| Carrying Bags | 0 = pedestrian not holding bags, 1 = pedestrian carrying bags |
| Opposing Pedestrian | Number of pedestrians crossing in the opposite direction |
| Pedestrian Direction | 0 = downstream to upstream, 1 = upstream to downstream |
| Crossing Conditions | 0 = single-stage crossing, 1 = two-stage crossing |
| Using Mobile Phone | 0 = pedestrian not using mobile phone while crossing, 1 = pedestrian using mobile phone |
| Pedestrian Compliance | 0 = pedestrian who does not cross within marked crosswalk and does not adhere to signal, 1 = pedestrian who complies with traffic signal and crosswalk together |
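One way to hold a single coded observation from Table 2 in software is sketched below. This is an illustrative assumption rather than the data format used in the study: the class name `PedestrianObservation`, the field names, and the `validate` helper are hypothetical, and only the value codes are taken from the table.

```python
from dataclasses import dataclass

CITIES = ("Amman", "Irbid", "Zarqa")

# Allowed codes per categorical field, following the codes listed in Table 2.
ALLOWED_CODES = {
    "gender": (1, 2),              # 1 = male, 2 = female
    "age": (1, 2, 3),              # 1 = child, 2 = adult, 3 = elderly
    "carrying_bags": (0, 1),       # 0 = no bags, 1 = carrying bags
    "direction": (0, 1),           # 0 = downstream to upstream, 1 = reverse
    "crossing_condition": (0, 1),  # 0 = single-stage, 1 = two-stage
    "using_mobile_phone": (0, 1),  # 0 = not using phone, 1 = using phone
    "compliance": (0, 1),          # target: 0 = non-compliant, 1 = compliant
}


@dataclass(frozen=True)
class PedestrianObservation:
    """One pedestrian record (hypothetical field names)."""
    city: str
    gender: int
    age: int
    carrying_bags: int
    opposing_pedestrians: int      # count of pedestrians crossing the other way
    direction: int
    crossing_condition: int
    using_mobile_phone: int
    compliance: int

    def validate(self) -> None:
        """Raise ValueError if any field falls outside the coded values."""
        if self.city not in CITIES:
            raise ValueError(f"unknown city: {self.city!r}")
        if self.opposing_pedestrians < 0:
            raise ValueError("opposing_pedestrians must be non-negative")
        for name, codes in ALLOWED_CODES.items():
            value = getattr(self, name)
            if value not in codes:
                raise ValueError(f"{name}={value!r} not in {codes}")


example = PedestrianObservation("Amman", 1, 2, 0, 1, 0, 0, 0, 1)
example.validate()
```

Validating the codes at load time is a design choice that keeps coding mistakes in the video annotation from silently reaching the classifiers.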
The dataset comprising observations of 2437 pedestrians was utilized to develop predictive models employing four distinct ML techniques: artificial neural network (ANN), support vector machine (SVM), decision tree (ID3), and random forest (RF). The following sections offer a concise overview of each technique.

Decision Tree (ID3)
Decision tree learning is an ML technique used for building classification models capable of predicting the value of a specific attribute (i.e., the target attribute) based on the values of other attributes (i.e., the input attributes). Such classification models are trained on historical data in which the value of the target attribute is known. The decision tree is represented as a directed graph where each interior node is an input attribute with a number of branches equal to the number of possible values of that attribute. Each leaf node contains the value of the target attribute given the values of the input attributes along the path that leads to that leaf node from the root node. The input attributes that reside in the higher levels of the tree have a greater impact on deciding the value of the target attribute.
There are several types of decision tree algorithms that can be used to build classification models. In this work, the Iterative Dichotomiser 3 (ID3) algorithm is used. ID3 performs a top-down, greedy search over a given data set to test each attribute at every tree node. The attribute chosen at each node is determined using Information Gain (IG). IG represents the reduction of disorder produced by splitting the data at an internal node and thus measures how well a given attribute separates the target classes. The disorder in the target attribute is measured using entropy, which is computed as follows:

$$\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$$

where $S$ is the data set used for building the decision tree, $c$ is the number of classes in the target attribute, and $p_i$ is the ratio of instances with class $i$ to the total number of instances in the data set. The IG for feature $A$ is calculated as follows:

$$\mathrm{IG}(S, A) = \mathrm{Entropy}(S) - \sum_{v=1}^{n} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$

where $n$ is the number of values feature $A$ can hold, $S_v$ is the subset of instances in which $A$ has value $v$, $|S_v|$ is the total number of instances in $S_v$, and $|S|$ is the total number of instances in $S$.
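The entropy and information gain computations described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names and the four-row toy data set (with hypothetical attributes `phone` and `bags`) are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Greedy ID3 step: pick the attribute with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy data (assumption): 'phone' separates the classes, 'bags' does not.
toy_rows = [
    {"phone": 0, "bags": 0},
    {"phone": 0, "bags": 1},
    {"phone": 1, "bags": 0},
    {"phone": 1, "bags": 1},
]
toy_labels = [1, 1, 0, 0]
root = best_attribute(toy_rows, toy_labels, ["phone", "bags"])
```

In a full ID3 tree, the same greedy step is applied recursively to each subset until every leaf is pure or no attributes remain.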

Artificial Neural Network (ANN)
The artificial neural network (ANN) is one of the most effective ML techniques for building prediction models. The basic structure of an ANN is a set of layers, each with multiple nodes. Nodes in each layer are connected to nodes in adjacent layers by links. The input layer has a number of nodes equal to the number of input attributes. The output layer provides the final result produced by the ANN. The number of nodes in the output layer depends on the number of values the target attribute can hold. Zero or more hidden layers can be added between the input and output layers. Each node in the network receives data from the previous layer, performs some operations, and passes the result to the next layer. Each link between two nodes is assigned a weight that controls the signal between the nodes. ANNs learn by searching for the best solutions, and the knowledge gained is expressed as adjustments to the weight values.
Several types of ANNs have proven effective in building prediction models, such as the Multi-Layer Perceptron (MLP), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN). In this study, an MLP is employed for predicting pedestrian compliance at crosswalks in Jordan. An MLP is a fully connected network consisting of three or more layers.
Each node is associated with a nonlinear activation function that determines the output given a set of inputs. In this work, a sigmoid function is used to construct the classifiers:

$$f(x) = \frac{1}{1 + e^{-x}}$$

The backpropagation technique is used to train the MLP. In this technique, the weight values are modified after processing each instance in the data set. The MLP learns by minimizing the least mean squares error function:

$$E(n) = \frac{1}{2} \sum_{j} \left(d_j(n) - a_j(n)\right)^2$$

where $d_j$ is the targeted output and $a_j$ is the value generated by the $j$th output neuron. The changes to the weights are determined by calculating the local gradient ($\delta$) and moving backward starting from the output layer:

$$\Delta w_{ji}(n) = \eta\,\delta_j(n)\,y_i(n) + \alpha\,\Delta w_{ji}(n-1)$$

where $w_{ji}$ is the weight of the link from node $i$ to node $j$, $y_i$ is the output of node $i$, $n$ is the training iteration, $\eta$ is the learning rate parameter, and $\alpha$ is the momentum constant.
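To make the weight update concrete, the sketch below trains a single sigmoid output neuron with per-instance updates and a momentum term. It is a deliberately reduced illustration, not the MLP used in the study: there are no hidden layers, and the function names, the toy OR data, and the values chosen for the learning rate (`eta`), momentum (`alpha`), and number of epochs are all assumptions.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, inputs):
    """Output of the neuron; the last weight acts as the bias."""
    x = list(inputs) + [1.0]
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_neuron(samples, eta=0.5, alpha=0.5, epochs=2000):
    """Per-instance gradient descent on squared error, with momentum."""
    n_weights = len(samples[0][0]) + 1
    weights = [0.0] * n_weights
    previous_change = [0.0] * n_weights
    for _ in range(epochs):
        for inputs, target in samples:
            x = list(inputs) + [1.0]
            output = predict(weights, inputs)
            # Local gradient of an output neuron with sigmoid activation.
            delta = (target - output) * output * (1.0 - output)
            for i, xi in enumerate(x):
                change = eta * delta * xi + alpha * previous_change[i]
                weights[i] += change
                previous_change[i] = change
    return weights


# Toy data (assumption): the logical OR of two binary inputs.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
trained = train_neuron(or_samples)
```

The momentum term reuses a fraction of the previous weight change, which smooths the updates when consecutive instances pull the weights in different directions.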

Random Forest (RF)
Random forest is a supervised learning technique that incorporates the outcomes of multiple decision trees to reach a single outcome. This helps prevent the problem of overfitting caused by relying on a single decision tree. The RF applies the bagging technique to create multiple datasets to train individual decision trees. The training data are randomly sampled with replacement. At the building stage of each tree, randomly selected features are used for splitting at each split. This can significantly improve the performance of RF. For classification tasks, the prediction of the RF is decided by a majority vote among the individual trees.
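The three ingredients named above (bootstrap sampling with replacement, random feature selection, and majority voting) can be sketched as follows. This is a simplified illustration, not the RF configuration used in the study: each "tree" is reduced to a one-split stump, so features are drawn once per stump rather than at every split, and all function names, parameters, and the toy data are assumptions.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) instances with replacement (bagging)."""
    picks = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in picks], [labels[i] for i in picks]


def fit_stump(rows, labels, features):
    """One-split tree: keep the candidate feature with fewest errors."""
    best = None
    for feature in features:
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[feature], []).append(label)
        mapping = {value: majority(ys) for value, ys in by_value.items()}
        errors = sum(mapping[r[feature]] != y for r, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, feature, mapping)
    return best[1], best[2], majority(labels)


def predict_stump(stump, row):
    feature, mapping, default = stump
    return mapping.get(row[feature], default)


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train each stump on its own bootstrap sample and feature subset."""
    rng = random.Random(seed)
    all_features = sorted(rows[0])
    forest = []
    for _ in range(n_trees):
        sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)
        candidates = rng.sample(all_features, n_features)
        forest.append(fit_stump(sample_rows, sample_labels, candidates))
    return forest


def predict_forest(forest, row):
    """Majority vote among the individual trees."""
    return majority([predict_stump(stump, row) for stump in forest])


# Toy data (assumption): both features agree with the label.
toy_rows = [{"a": 0, "b": 0}, {"a": 1, "b": 1}] * 10
toy_labels = [0, 1] * 10
toy_forest = fit_forest(toy_rows, toy_labels)
```

Because every tree sees a different resample and a different feature subset, the trees make partly independent errors, and the vote averages those errors out.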

Support Vector Machine (SVM)
The support vector machine (SVM) is a supervised ML algorithm used for classification and regression. SVM aims to identify the hyperplane in an n-dimensional space that best separates the instances into different classes.
An essential aspect of SVM is its capability to manage both linearly separable and non-linearly separable data through a method known as the kernel trick. This technique maps the original input space into a higher-dimensional space, potentially rendering the data points linearly separable. This transformation allows SVM to learn complex decision boundaries effectively within the original feature space.
In the training phase, SVM aims to minimize a cost function, usually composed of a margin term and the classification error. To solve this optimization problem, various methods, such as quadratic programming or convex optimization, are employed, depending on the specific configuration of the SVM algorithm.
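The two ideas in this section, a kernel function and a cost that balances margin against classification error, can be illustrated with the sketch below. It assumes the standard soft-margin formulation with hinge loss and a radial basis function (RBF) kernel; the source does not state which kernel or cost was used, so the function names and the parameters `gamma` and `C` are assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: similarity that decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def soft_margin_cost(w, b, samples, C=1.0):
    """Margin term plus C times the total hinge loss.

    Labels must be +1 or -1. A sample contributes zero loss only when
    it lies on the correct side of the hyperplane with margin >= 1.
    """
    margin_term = 0.5 * dot(w, w)
    hinge = sum(max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in samples)
    return margin_term + C * hinge


# Toy data (assumption): two well-separated points, then one inside the margin.
separated = [((2.0, 0.0), 1), ((-2.0, 0.0), -1)]
with_violation = separated + [((0.5, 0.0), 1)]
cost_clean = soft_margin_cost([1.0, 0.0], 0.0, separated)
cost_violation = soft_margin_cost([1.0, 0.0], 0.0, with_violation)
```

A larger `C` penalizes margin violations more heavily, while a smaller `C` favors a wider margin at the price of more training errors.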

Results and Discussion
The analysis of pedestrian behavior at various intersections revealed a predominance of male pedestrians, making up 61.7% of the total, with females accounting for 38.3%. Adults were the majority, especially near Yarmouk University and shopping centers, while children and the elderly were fewer, with distinct spatial variations in their distribution. Most pedestrians crossed streets directly without pauses, and a significant number carried personal items, particularly in areas near schools and business districts. A majority of 72.5% crossed without any interaction with vehicles, whereas 19.7% crossed with moderate risk and 7.8% with high risk of vehicle interaction. The preference for walking over running was evident, with 95.8% walking and only 4.2% running across the street.
The classification models were built using a 10-fold cross-validation method. This method divides the data set into ten subsets, and a process with ten iterations is followed. At each iteration, a different subset is used for testing the model, while the remaining subsets are used for training.
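The splitting scheme described above can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the function name `k_fold_indices` is hypothetical, and the shuffling with a fixed seed is an assumption, since the source does not say whether the data were shuffled or stratified.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With the 2437 observations of this study, each test fold holds 243 or 244 instances.
splits = list(k_fold_indices(2437, k=10))
```

Every instance is used for testing exactly once and for training nine times, so the reported metrics are averaged over predictions for the whole data set.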
The prediction models are evaluated using the confusion matrix. This matrix reveals, for each class, the number of correctly and incorrectly classified instances. A 2 × 2 confusion matrix for two classes is presented in Table 3. The TP, TN, FP, and FN can be described as follows:
• True Positive (TP): positive instances classified as positive.
• True Negative (TN): negative instances classified as negative.
• False Positive (FP): negative instances classified as positive.
• False Negative (FN): positive instances classified as negative.
The accuracy rate is one of the most widely adopted measures to evaluate the classifier's overall performance. The higher the accuracy is, the better the model performs. The accuracy represents the ratio of correctly classified instances to all instances and is calculated as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

True Positive Rate (TPR) and False Positive Rate (FPR) are also important measures. TPR, or recall, represents the ratio of instances that were correctly classified as positive to all positive instances, while FPR is the ratio of instances that were incorrectly classified as positive to all instances that are not positive:

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$

Note that the overall accuracy of a classifier can also be calculated by taking the weighted average of the recall values of all classes, with each class weighted by its share of the instances.
Precision is another measure, representing the ratio of instances that were correctly classified as positive to all instances that were classified as positive. The model performance increases as the TPR and precision values increase and the FPR value decreases.
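The measures defined above can be computed directly from the four confusion matrix counts, as in the sketch below. The function names and the example counts are hypothetical and are not taken from the study's results; the last lines also check the remark that the class-weighted average of recall equals the overall accuracy.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, TPR (recall), FPR, and precision from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }


def weighted_recall(tp, tn, fp, fn):
    """Average of per-class recall, weighted by each class's share."""
    total = tp + tn + fp + fn
    positives = tp + fn
    negatives = tn + fp
    recall_positive = tp / positives
    recall_negative = tn / negatives
    return (positives / total) * recall_positive + (negatives / total) * recall_negative


# Hypothetical counts, chosen only to illustrate the formulas.
example_counts = dict(tp=80, tn=70, fp=30, fn=20)
metrics = classification_metrics(**example_counts)
same_as_accuracy = weighted_recall(**example_counts)
```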
Finally, the ROC curve plots the TPR against the FPR at several threshold settings, which represents the trade-off between true positives (benefits) and false positives (costs). The area under the ROC curve ranges from 0.0 to 1.0, where 1.0 is the optimal case. Table 4 presents the performance of ANN, ID3, RF, and SVM. The overall accuracy of each model was computed, along with the weighted precision and weighted FPR for each method. Figure 2 provides a graphical representation of the performance achieved by all classification models. As noted, the RF technique achieved the best accuracy (81.2%) and precision (80.9%) in determining pedestrian compliance compared to the other methods. Its FPR value was also low (25.2%), indicating that the model is effective and accurate. The ANN and ID3 techniques also performed well, with ID3 performing the worst in terms of FPR value. The SVM technique was the least accurate of the techniques used in this study.
Table 4 and Figure 2 show that the highest ROC area, 90.4%, was reached using the RF technique. This relatively high value indicates the model's ability to distinguish compliant from non-compliant pedestrians at urban intersections and to handle the many characteristics of pedestrian behavior that fluctuate with the study site and the surrounding infrastructure. This suggests that the model can give reliable predictions in real-time traffic management applications. The SVM model was less effective, as its ROC area did not exceed 50.4%, which is close to the 50% expected from random guessing. This result indicates that the model has a limited ability to deal with changes in pedestrian behavior at urban intersections. Finally, the average ROC area values for the ANN and ID3 models, 87.8% and 85.1%, respectively, indicate that these models can, to some extent, predict pedestrian behavior and compliance with the regulations at urban intersections. However, the variables in the models need to be improved to achieve better predictions and accuracy in the future.
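The ROC area discussed above can be computed without drawing the curve, because it equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one (ties count as one half). The sketch below uses this pairwise definition; the function name and the toy scores are assumptions and are unrelated to the values in Table 4.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels must be 1 (positive) or 0 (negative); scores are the model's
    confidence that an instance is positive.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy scores (assumption): one positive is ranked below one negative.
toy_area = roc_area([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
# A model that gives every instance the same score cannot rank at all.
chance_area = roc_area([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

Under this reading, an area near 0.5 means the scores carry almost no ranking information, while an area near 1.0 means positives are almost always ranked above negatives.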
Appl. Sci. 2024, 14, x FOR PEER REVIEW 9 of 14
These variations in performance show that it is important to understand pedestrian conditions, the features of the site, and the prediction task at hand, and to select the right variables and factors when using ML models to study pedestrian behavior and compliance with the rules when crossing urban intersections. For example, the high performance of RF can be attributed to its ensemble approach, which combines many individual decision trees and thereby reduces overfitting and error. Furthermore, these findings are consistent with broader trends observed in pedestrian behavior research, where ensemble methods such as RF often outperform other models due to their ability to capture complex nonlinear relationships within large data sets.
In addition to the classification model, ID3 provides a set of rules that can be very useful to transportation agencies and relevant stakeholders. These include a variety of organizations and individuals who have an interest in or responsibility for ensuring the safety, efficiency, and effectiveness of transportation systems, such as the Ministry of Transport, municipal and local government agencies, law enforcement agencies, public transportation authorities, road safety organizations, non-governmental organizations (NGOs), transportation research and academic institutions, and engineering and consulting firms. These rules can help agencies by guiding the development of policies that improve pedestrian safety. Urban planners can use the rules to create intersections that encourage compliance and decrease accidents. Additionally, advocacy groups can use the insights from the rules to focus educational campaigns on pedestrian safety. Moreover, law enforcement agencies can use the rules to focus their efforts on the intersections and behaviors with the highest risk of non-compliance, thereby optimizing resource allocation. Table 5 provides some of these rules.
Table 5 gives a detailed account of the rule-based outputs of the decision tree algorithm (ID3) used in this study. These rules are essential to understanding the multifaceted nature of pedestrian compliance at signalized intersections in urban areas in Jordan, specifically in the study areas of Amman, Irbid, and Zarqa. The rules cover several scenarios that combine pedestrian behavior and environmental variables to predict pedestrian crossing compliance.
The decision tree model identified essential variables that influence pedestrian compliance, including crossing type, pedestrian demographics (age and gender), use of mobile phones, carrying bags, and the presence of opposing pedestrians. For example, rules such as "If Type of Crossing = 0 AND city = Amman AND carrying bags = 0 AND Opposing Pedestrian = 1 AND Pedestrian Direction = 0, then Output = 0" illustrate the compliance behavior under specific circumstances. This rule shows that non-compliance may be more likely when pedestrians in Amman use a single-stage crossing without carrying any bag, with one opposing pedestrian present, and while moving from downstream to upstream.
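A rule of this form can be applied to a new observation with a simple matcher, as sketched below. Only the quoted rule itself comes from the text; the dictionary keys, the function name `apply_rules`, and the idea of returning `None` when no rule matches are assumptions made for illustration.

```python
# The rule quoted in the text, written as (conditions, output).
# Key names are hypothetical; the coded values follow the quoted rule.
AMMAN_RULE = (
    {
        "crossing_type": 0,
        "city": "Amman",
        "carrying_bags": 0,
        "opposing_pedestrian": 1,
        "pedestrian_direction": 0,
    },
    0,  # Output = 0: predicted non-compliance
)


def apply_rules(rules, observation, default=None):
    """Return the output of the first rule whose conditions all hold."""
    for conditions, output in rules:
        if all(observation.get(key) == value for key, value in conditions.items()):
            return output
    return default


matching = {
    "crossing_type": 0,
    "city": "Amman",
    "carrying_bags": 0,
    "opposing_pedestrian": 1,
    "pedestrian_direction": 0,
    "using_mobile_phone": 1,  # extra attributes are ignored by this rule
}
prediction = apply_rules([AMMAN_RULE], matching)
```

Keeping rules as plain data rather than nested conditionals makes it straightforward for an agency to review, extend, or audit the rule set.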
The rules also draw attention to the fact that pedestrian behavior at crossroads is affected by urban intersection features.Pedestrians of all ages in Amman are more likely to cross the street safely when not carrying heavy bags or speaking on their phones.However, according to the rules in Irbid, the impact of opposition to the presence of pedestrians on compliance varied with age and direction of travel.This conversation can teach us a lot about pedestrian behavior patterns and possible actions to make urban crossings safer for pedestrians.The rules can provide an excellent reference for urban planners and traffic management authorities, especially in the process of designing targeted traffic safety campaigns and modifications to existing infrastructure that may address specific behavioral trends and environmental interactions.
Applying the above rules while considering the characteristics of the surrounding urban environment can effectively improve pedestrian safety, especially given the ongoing efforts of the relevant authorities to provide pedestrian protection, enforce laws, and deploy artificial intelligence tools to increase the efficiency and safety of complex urban environments.

Conclusions
In this study, pedestrian behavior at crosswalks was investigated using four models based on different ML algorithms: an artificial neural network (ANN), a support vector machine (SVM), a decision tree (ID3), and a random forest (RF). The data extracted from field videos of 2437 pedestrians at the studied urban intersections provided a solid base for developing easily validated analytical and predictive models to raise the level of pedestrian safety in urban areas and compliance with traffic signals. Adults were the most frequent users of intersections in urban areas with high economic activity, such as those near universities and shopping centers. In addition, many pedestrians adhered to traffic laws and walked rather than ran when crossing the road, indicating that they did not ignore traffic signals. However, a noticeable percentage of pedestrians, especially children and the elderly, showed varying compliance, which can be attributed to deficiencies in the physical planning of intersections as well as the nature of the urban environment.
Based on the results of this study, the performance of the ML models was very promising, with the random forest outperforming the other models in terms of accuracy, precision, and false-positive rate. This makes the method reliable for predicting pedestrian compliance at urban crosswalks. These models not only help to explain pedestrian behavior but also have considerable potential as a tool for urban planners, traffic management authorities, and road and traffic engineers in designing safer and more efficient pedestrian infrastructure. Moreover, the spatial analysis in this study shows significant differences in pedestrian behavior at different intersections, which confirms the critical role of local variables such as proximity to educational institutions and commercial centers. These factors significantly affect pedestrian flow patterns and compliance with road regulations. This finding emphasizes the importance of knowledge of the local environment when developing targeted interventions that suit specific urban environments and demographic profiles.
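For reference, the three criteria named above follow from the confusion matrix of a binary classifier in the standard way: accuracy is (TP + TN) / (TP + FP + TN + FN), precision is TP / (TP + FP), and the false-positive rate is FP / (FP + TN). The sketch below computes them; the function name `classification_metrics` is hypothetical, and the counts in the usage line are invented for illustration and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, and false-positive rate from
    confusion-matrix counts of a binary classifier."""
    total = tp + fp + tn + fn
    predicted_positive = tp + fp
    actual_negative = fp + tn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are truly positive.
        "precision": tp / predicted_positive if predicted_positive else 0.0,
        # Share of actual negatives wrongly predicted as positive.
        "false_positive_rate": fp / actual_negative if actual_negative else 0.0,
    }


# Illustrative counts only (assumed, not taken from the study's Table 4).
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

A model can score well on accuracy while still producing many false positives when the classes are imbalanced, which is one reason to report the three measures together.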
Future research should consider integrating predictive models into real-time traffic management systems so that pedestrian safety measures can respond dynamically to field conditions. Data collection should also be expanded to include more environmental and temporal variables, which can increase the accuracy of predictions and the applicability of ML models. Finally, ML tools have proven effective in analyzing and predicting pedestrian behavior at crossings, contributing to ongoing efforts to enhance pedestrian safety and providing methodologies that can be applied to other urban spaces facing similar challenges.

Figure 2. The performance achieved by all classification models.

of Crossing = 0 AND city = Irbid AND Using Mobile Phone = 1 AND Opposing Pedestrian = 0 AND Pedestrian Direction = 0), then Output = 0
22. If (Type of Crossing = 0 AND city = Irbid AND Using Mobile Phone = 1 AND Opposing Pedestrian = 0 AND Pedestrian Direction = 1), then Output = 1
23. If (Type of Crossing = 0 AND city = Irbid AND Using Mobile Phone = 1 AND Opposing Pedestrian = 1 AND Pedestrian Gender = 1), then Output = 0
24. If (Type of Crossing = 0 AND city = Irbid AND Using Mobile Phone = 1 AND Opposing Pedestrian = 1 AND Pedestrian Gender = 2), then Output = 1
25. If (Type of Crossing = 1 AND Carrying Bags = 0 AND Opposing Pedestrian = 3 AND City = Irbid), then Output = 0
26. If (Type of Crossing = 1 AND Carrying Bags = 0 AND Opposing Pedestrian = 3 AND City = Amman), then Output = 1
27. If (Type of Crossing = 1 AND Carrying Bags = 0 AND Opposing Pedestrian = 3 AND City = Zarqa), then Output = 1
28. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Irbid AND Opposing Pedestrian = 0 AND Pedestrian Age = 1), then Output = 1
29. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Irbid AND Opposing Pedestrian = 0 AND Pedestrian Age = 2 AND Pedestrian Direction = 0), then Output = 0
30. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Irbid AND Opposing Pedestrian = 0 AND Pedestrian Age = 2 AND Pedestrian Direction = 1), then Output = 1
31. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Irbid AND Opposing Pedestrian = 0 AND Pedestrian Age = 3), then Output = 1
32. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Amman), then Output = 0
33. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Zarqa AND Pedestrian Age = 1 or 3), then Output = 1
34. If (Type of Crossing = 1 AND Carrying Bags = 1 AND City = Zarqa AND Pedestrian Age = 2), then Output = 2

Table 1. Characteristics of selected intersections.


Table 2. The selected features.


Table 4. The performance of ANN, ID3, RF, and SVM.
