Pedestrian Attribute Analysis Using Agent-Based Modeling

Abstract: Crossing a road outside of a crosswalk is a major cause of pedestrian fatalities. The aim of this study was to investigate this type of behavior for different pedestrian attributes in terms of risk and gap acceptance using agent-based modeling techniques. An agent-based model was developed and tested to represent pedestrian behavior in different situations. Different pedestrian attributes were analyzed, including gender, age, type of clothing, carrying bags, using mobile phones, and crossing in a group. The results showed that pedestrians add a positive risk factor to the speed of approaching vehicles before evaluating a gap, then proceed with the crossing decision. The factor for the female pedestrians was smaller in comparison to their male counterparts, which may imply that they are more prone to taking risks during crossing compared to male pedestrians. Another interpretation can be that they have a better ability to discern vehicle speeds and thus a better assessment of the critical gap. Compared to pedestrians crossing individually, the factor was smaller for pedestrians crossing in a group, which can be an indication that pedestrians have a higher sense of safety when crossing as a group. Moreover, the analysis suggested that there is no difference in perception between old and middle-age pedestrians, between pedestrians carrying bags and those who are not, or between pedestrians using a mobile phone while crossing and those who are not. These results can be useful in evaluating pedestrian safety at midblock crossings and providing a framework for modeling this type of behavior in simulation models.


Introduction
When crossing a road at an uncontrolled multilane midblock section, pedestrians perceive the vehicles in different lanes, analyze the situation, and then select a gap before crossing. During this process, the main objective of every pedestrian is to identify an acceptable gap to cross the road safely. The gap accepted by each pedestrian depends on his or her speed, perception of the speed of the upcoming vehicles, and the distance between the pedestrian and the vehicles. Studying pedestrians' attributes and accepted gaps are key aspects in improving pedestrian modeling and simulation. A better understanding of these attributes results in a more accurate prediction of their behavior in different conditions. Furthermore, it is needed for conducting pedestrian safety analysis and estimating the safety level of different urban streets.
There are different methods used to study pedestrian movements. Initially, models of pedestrian movement have been focused on the macroscopic scale. This approach was adequate to study traffic from a planning perspective and to investigate the overall crowd density and capacity of a specific area [1,2]. As congestion became a more serious problem in many countries, there was more interest in active modes such as walking and cycling. Consequently, there was a need for methods that can be used to analyze pedestrian movements. Statistical regression analysis was one of the first modeling methods used for studying pedestrian movements. This method is based on using statistical regression techniques to identify the most critical factors influencing walking movements [3,4]. Another approach was the fluid-flow analysis method. In this method, pedestrian movements were treated as a fluid moving in the vicinity of different obstacles [5,6]. These two methods were used to study pedestrian movements from a macro-level perspective.
However, there is a need for methods that can incorporate the movement behavior of individual pedestrians. Therefore, microsimulation models representing individual vehicles and used for different applications were introduced [7]. Furthermore, pedestrian simulation models, such as cellular automata, social force, and agent-based models, were created. For these models to be used for predicting traffic conditions and to replicate observed traffic conditions and the behavior of individual pedestrians, it was important to have a calibration process. In fact, the credibility of these models depends on how well they are calibrated.
In the case of cellular automata models, each space is divided into discrete cells that can be empty or occupied by pedestrians or obstacles. In the simulation, each pedestrian moves through space by occupying cells and getting around other occupied cells [8,9]. The social force models depend on social interactions among pedestrians. Instead of treating pedestrians as isolated individual entities who walk alone with separate speed and direction of motion, pedestrians are treated as individuals who walk in groups. These types of models are popular in evacuation problems [10,11].
Another technique is agent-based modeling. In this technique, each pedestrian is studied as an agent and assigned specific attributes in order to respond to complex settings in the environment. All agents behave independently, which makes agent-based modeling one of the best techniques for modeling pedestrian movements [12,13]. These types of models have already attracted great attention in the transportation field because they offer a more detailed analysis of motorist and pedestrian behavior, can successfully model the complicated behavior of pedestrians, and can overcome some of the limitations of the other methods [14][15][16][17][18].
Different analysis methods have been used to study illegal pedestrian behavior at unmarked midblock sections. In Qatar, a study determined the crossing time for pedestrians who jaywalk as a function of several pedestrian and vehicle characteristics in addition to crossing behavioral attributes using multiple linear regression. The study showed that age, gender, phone use, crossing with a group, and the presence of a vehicle affected the crossing time [19]. In Greece, the gap acceptance for crossing at an uncontrolled section was studied, and a lognormal regression model was presented for the gaps accepted. The model contained several independent variables, including gender, vehicle size, distance, and illegal parking. The study indicated that the crossing decision was affected by the distance from the incoming vehicles and the waiting times of pedestrians [20]. In Egypt, the accepted gap size was studied at nine uncontrolled midblock sections, and a lognormal regression model was used to predict the minimum accepted gap. The model included different variables, including age, number of attempts, vehicle speed, road width, and rolling gap. The study suggested that the gap size, the speed of the vehicles, the rolling gap, and the number of attempts affect the decision to cross the road [21].
In India, a regression model was developed to explain the gap acceptance behavior for pedestrians at a midblock section. The model included similar variables such as age, gender, pedestrian and vehicle speeds, crossing attempts, rolling gap, and crossing direction. The study revealed that the age, crossing speed, vehicle speed, and crossing direction affect the gap acceptance for pedestrians [22]. Another lognormal regression model was developed by the same authors to find the minimum gap. The model contained several variables such as yielding behavior of the driver, number of attempts, rolling gap, accepted lag or gap, and vehicle speed as independent variables. The study indicated that the rolling gap, yielding behavior of drivers, and number of attempts affect the crossing behavior for pedestrians [23]. In the United Kingdom, a fuzzy logic system was utilized to investigate the crossing behavior of pedestrians at a midblock crossing location. The study revealed that using fuzzy logic to explain the pedestrians' behavior is reliable and produced reasonable results when compared to the literature [24].
Moreover, some studies explored the probability of accepting a gap using logit or probit models according to different traffic and pedestrian characteristics [25,26]. However, there is generally limited use of microscopic simulation models in many of the existing pedestrian behavior studies. Agent-based modeling is one of the advanced microsimulation methods in terms of ability and flexibility [27]. This method utilizes agents as pedestrians to replicate different behaviors. These agents are capable of gathering information about the surrounding environment and then making decisions. The main advantage of agent-based modeling is the ability to allow the agents to develop their own decisions, which provides more realistic outcomes.
In summary, there is an increased interest in the development of simulation models to study pedestrian behavior in different schemes. However, there are limited simulation studies investigating the illegal crossing scenarios of pedestrians. For practical applications such as modeling pedestrian movements at unmarked midblock crossings, simulation techniques are well suited for this type of analysis and can provide realistic outcomes. The purpose of this study is to address this shortcoming by developing an agent-based microscopic model to investigate the gap acceptance behavior of crossing pedestrians considering various interaction behaviors among road users at unmarked midblock crossings. Different pedestrian attributes were investigated, including gender, age, type of clothing, carrying bags, using mobile phones, and crossing in a group.

Data Collection
The data for this study were collected at a six-lane divided urban section in Doha, Qatar. The site details are shown in Figure 1. The data were collected using four video cameras for two days. A total of 12 consecutive hours were recorded each day from 6:00 a.m. to 6:00 p.m. (total of 24 h). This time period was selected to ensure daylight condition and clear view when analyzing the videos. The pedestrian data were extracted to obtain 1094 observations for pedestrians crossing without a conflict (no vehicles were present when the pedestrian crossed the road) and 972 observations for pedestrians crossing with a conflict (some vehicles were present when the pedestrian crossed). Of the 972 observations, 602 pedestrians crossed the road using a perpendicular path, and 370 pedestrians crossed on an oblique path. This study focused on accepted gaps for pedestrians crossing in a perpendicular path. Different variables were extracted for each pedestrian crossing case. More details regarding the data collection process are available in the preliminary work of the authors [28].
The gap was computed as the time for a critical vehicle to arrive at the point of crossing. The critical vehicle is described as the closest vehicle to the pedestrian. The location of the critical vehicle was recorded once a pedestrian started to cross the road. The critical vehicle can be positioned in any lane according to the crossing situation. The critical distance is defined as the distance from the crossing point to the critical vehicle. The speed of the critical vehicle was calculated using distance and time. The video data were played frame by frame to obtain the time taken by each vehicle to traverse a known trap length. The speed was then calculated using the difference in time between the vehicle passing the first line and the second line of the trap.
The pedestrian waiting time was measured from the time a pedestrian arrives at the curb or median until the crossing begins. The pedestrian speed was calculated based on crossing time and the traveled distance. Due to the presence of a median, separate observations were recorded for the case of crossing from the median to curb and vice versa.
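The measurements described above reduce to simple distance and time ratios. The following Python sketch illustrates them under stated assumptions: the function names and the example numbers are hypothetical and are not taken from the study data.

```python
def trap_speed(trap_length_m, t_first_line_s, t_second_line_s):
    """Vehicle speed (m/s) from the time taken to traverse a known trap length."""
    dt = t_second_line_s - t_first_line_s
    if dt <= 0:
        raise ValueError("the second trap line must be passed after the first")
    return trap_length_m / dt


def time_gap(critical_distance_m, vehicle_speed_mps):
    """Gap (s): time for the critical vehicle to reach the crossing point."""
    return critical_distance_m / vehicle_speed_mps


def walking_speed(traveled_distance_m, crossing_time_s):
    """Pedestrian speed (m/s) from the traveled distance and the crossing time."""
    return traveled_distance_m / crossing_time_s


# Illustrative values only (hypothetical, not field data).
speed = trap_speed(trap_length_m=20.0, t_first_line_s=3.0, t_second_line_s=4.5)
gap = time_gap(critical_distance_m=40.0, vehicle_speed_mps=speed)
ped_speed = walking_speed(traveled_distance_m=10.5, crossing_time_s=6.0)
```

With these illustrative inputs, the vehicle travels at roughly 13.3 m/s and reaches the crossing point in about 3 s.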
In addition to the previous variables, variables related to pedestrians such as age, gender, type of clothing, crossing in a group, carrying bags, and use of mobile phones were collected as shown in Table 1. Pedestrian age was extracted from the videos based on visual appearance. To minimize the errors, the videos were processed by two different investigators. In case of any disagreement, both investigators discussed each case until reaching an agreement.

Agent-Based Pedestrian Model
An agent-based model (ABM) was created in the AnyLogic simulation software to replicate the crossing behavior of the pedestrians. AnyLogic supports different common simulation methodologies, such as system dynamics, discrete event, and agent-based modeling. In AnyLogic, the building blocks of the simulation are created in a virtual environment. Agents have a certain level of intelligence and autonomy in their decisions, which results in behavior and outcomes that are more realistic in systems dependent on individual actions. In this study, an object-oriented and modular model was used. The model consisted of four modules, as indicated below. The modules follow the logic presented in Figure 2.

Environment Module
The environment module is a 2D continuous environment. It contains a three-lane road segment that has similar dimensions to the existing road.

Agent Module
In the agent module, some of the agents are used to replicate vehicles on the road. The main parameter for these agents is speed. The speed data of the vehicles were obtained from the video data. It was assumed that the vehicles change neither speed nor lane once placed in the environment. The arrival rate of the vehicles was set based on the traffic volumes collected from the video data. Vehicles were set to arrive at random times during the hour. A standard dimension of 3.5 m length and 2 m width was used for all vehicles. Several distributions were utilized to simulate the vehicle speeds.
On the other hand, the pedestrian agents were set to appear randomly based on a rate of arrival obtained from the field data. Once the crossing movement is complete, the pedestrian disappears from the model, and another pedestrian can emerge. The pedestrian crossing speeds used in the model were based on the speeds obtained from the video data. For the 602 observations included in the study, the minimum and maximum speeds were 0.96 and 4.73 m/s, respectively. The overall average speed was 1.73 m/s, with a standard deviation of 0.47 m/s.
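As a rough illustration of agents arriving at random times during the hour, the sketch below draws each arrival time uniformly within a one-hour horizon. This is an assumption made for illustration: the study used AnyLogic, the exact arrival process is not specified in the text, and the volume of 600 vehicles per hour is a hypothetical value.

```python
import random


def arrival_times(hourly_volume, rng, horizon_s=3600.0):
    """Return sorted arrival times (s), each drawn uniformly within the horizon."""
    return sorted(rng.uniform(0.0, horizon_s) for _ in range(hourly_volume))


rng = random.Random(42)  # fixed seed so that a run can be repeated
vehicle_arrivals = arrival_times(hourly_volume=600, rng=rng)  # hypothetical volume
```

The same helper could be reused for pedestrian agents with a different hourly rate.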

Critical Gap Computation Module
The pedestrians are capable of determining the number of vehicles in the lanes in addition to the speed of each vehicle. Based on this information, the pedestrians decide whether to cross the road according to Equation (1), where Dm is the minimum critical gap distance necessary for crossing safely and Dl is the width that needs to be crossed. If the vehicle is located farther than Dm in all lanes, the crossing movement will be completed. If not, the decision will be to wait. This method did not accurately replicate the experimental results. Therefore, an error term was developed such that the pedestrian will 'think' the vehicle is moving at a higher speed than it actually is.
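One plausible reading of this decision rule can be sketched in Python. The form used here is an assumption, not the published Equation (1): Dm is taken as the distance a vehicle covers, at its perceived speed, during the time the pedestrian needs to walk the width Dl. The function names, the `perception_factor` parameter, and the 10.5 m crossing width are hypothetical. The walking speed of 1.73 m/s and the vehicle speed of 48 km/h are the averages reported in the text.

```python
def minimum_gap_distance(crossing_width_m, pedestrian_speed_mps,
                         vehicle_speed_mps, perception_factor=1.0):
    """Assumed form of Dm: perceived vehicle speed times the crossing time."""
    crossing_time_s = crossing_width_m / pedestrian_speed_mps
    perceived_speed_mps = perception_factor * vehicle_speed_mps
    return perceived_speed_mps * crossing_time_s


def decide_to_cross(vehicle_distances_m, vehicle_speeds_mps, crossing_width_m,
                    pedestrian_speed_mps, perception_factor=1.0):
    """Cross only if the nearest vehicle in every lane is farther than Dm."""
    for distance_m, speed_mps in zip(vehicle_distances_m, vehicle_speeds_mps):
        d_m = minimum_gap_distance(crossing_width_m, pedestrian_speed_mps,
                                   speed_mps, perception_factor)
        if distance_m <= d_m:
            return False  # at least one lane is too tight, so the pedestrian waits
    return True


# Three lanes with the nearest vehicles at 100, 150, and 200 m, all at 48 km/h.
distances = [100.0, 150.0, 200.0]
speeds = [48.0 / 3.6] * 3
crosses_unfactored = decide_to_cross(distances, speeds, 10.5, 1.73)
crosses_factored = decide_to_cross(distances, speeds, 10.5, 1.73,
                                   perception_factor=1.5)
```

In this example, the unfactored rule accepts the gap (Dm is about 81 m), while a factor of 1.5 raises Dm to about 121 m and the pedestrian waits. This shows how an inflated perceived speed makes the simulated pedestrian more conservative.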

Graphical Interface
The simulation outputs for all interactions, including vehicle and pedestrian movements, are presented in a visual interface.

Model Calibration
Appl. Sci. 2020, 10, 4882
To calibrate the simulation model, four types of distributions were investigated. The first distribution is the average speed. In this case, vehicles have a fixed speed of 48 km/h. This speed was decided based on the actual average speed obtained from the field data. The second distribution is the incremental speed. In this case, speeds started at 30 km/h and then increased in steps of 1 km/h until reaching 72 km/h. These are the limits collected from the actual data. The third distribution is the uniform speed distribution. In this case, speeds vary based on a rectangular distribution, so that all values between the minimum and maximum vehicle speeds have the same probability. The fourth distribution is the normal distribution. In this case, vehicle speeds are set according to a truncated normal distribution. The truncation sets the minimum and maximum speeds to match the actual data and is used as a safeguard against unreasonable values. The average and variance of the distribution are 48 and 8.8 km/h, respectively.
Table 2 provides a comparison between the actual data and simulation results. It is worth noting that the results of each category (average, incremental, uniform, and normal) are the average of three simulation runs. It was found that the mean gap distance for all runs is less than that of the actual data. This result suggested that the model is underestimating the gap. Furthermore, the standard deviation of the simulation runs is larger than that of the actual data, suggesting that the simulation run data are more spread out than the actual data. This outcome suggests that the fit between the simulation data and actual data is not accurate. The data range, however, is similar to the actual data for most simulation runs. To determine which simulation runs closely fit the actual data, the gap data obtained from the simulation runs were ranked from highest to lowest and plotted against the actual data.
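The four vehicle speed settings can be sketched in Python as follows. This is a minimal illustration rather than the AnyLogic implementation used in the study. Two assumptions are made: the value of 8.8 km/h is treated as the spread (standard deviation) passed to the normal generator, and the incremental sequence restarts at 30 km/h after reaching 72 km/h. The function names are hypothetical.

```python
import random
from itertools import cycle

V_MIN, V_MAX = 30.0, 72.0      # km/h, limits reported from the field data
V_MEAN, V_SPREAD = 48.0, 8.8   # km/h, spread treated as a standard deviation (assumption)


def constant_speeds(n):
    """Every vehicle travels at the field average speed."""
    return [V_MEAN] * n


def incremental_speeds(n):
    """Speeds step from 30 to 72 km/h in 1 km/h increments, then wrap (assumption)."""
    steps = cycle(range(int(V_MIN), int(V_MAX) + 1))
    return [float(next(steps)) for _ in range(n)]


def uniform_speeds(n, rng):
    """Rectangular distribution between the minimum and maximum speeds."""
    return [rng.uniform(V_MIN, V_MAX) for _ in range(n)]


def truncated_normal_speeds(n, rng):
    """Normal draws, rejecting any value outside the observed speed limits."""
    speeds = []
    while len(speeds) < n:
        v = rng.gauss(V_MEAN, V_SPREAD)
        if V_MIN <= v <= V_MAX:
            speeds.append(v)
    return speeds


rng = random.Random(7)
normal_sample = truncated_normal_speeds(1000, rng)
uniform_sample = uniform_speeds(1000, rng)
```

Rejection sampling is used for the truncation because it needs only the standard library. It is efficient here since the limits lie about two or more standard deviations from the mean.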
Figure 3 illustrates that most of the runs underestimate the actual data. That is, the gap determined by the simulation is much narrower than that determined by pedestrians in reality. Visual inspection suggests that the normal distribution is the closest fit. The figure also illustrates the average error across all points, absolute error (modulus of average errors), and the confidence interval of the average error.
To enhance the fit, the speed in the normal distribution runs was multiplied by a positive factor. This factor accounts for the underestimation of the simulation. Figure 4 illustrates the results of introducing different factors of 1.25, 1.5, and 1.75. In effect, the factored runs simulate a pedestrian who overestimates the speed of the approaching vehicle. The figure also illustrates the error of the factored simulations. Based on the absolute error shown in the figure, the normal distribution runs with a factor of 1.5 gave the best results, followed by a factor of 1.25. The results indicate that pedestrians apply a risk factor ranging from 1.25 to 1.5 to the vehicle speed before deciding whether to cross.
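The comparison described above (ranking the gaps from highest to lowest, then computing the average error, the absolute error as the modulus of the average error, and a confidence interval) can be sketched in Python. The 95% normal-approximation interval, the requirement of equal sample sizes, and all numbers in the example are assumptions made for illustration and are not results from the study.

```python
import math
import statistics


def fit_errors(simulated_gaps, actual_gaps, z=1.96):
    """Compare rank-ordered simulated and actual gaps point by point."""
    if len(simulated_gaps) != len(actual_gaps):
        raise ValueError("this sketch assumes samples of equal size")
    sim = sorted(simulated_gaps, reverse=True)  # highest to lowest
    act = sorted(actual_gaps, reverse=True)
    errors = [s - a for s, a in zip(sim, act)]
    mean_error = statistics.fmean(errors)
    half_width = z * statistics.stdev(errors) / math.sqrt(len(errors))
    return {
        "average_error": mean_error,
        "absolute_error": abs(mean_error),  # modulus of the average error
        "confidence_interval": (mean_error - half_width, mean_error + half_width),
    }


def best_factor(runs_by_factor, actual_gaps):
    """Return the speed factor whose run has the smallest absolute error."""
    return min(
        runs_by_factor,
        key=lambda f: fit_errors(runs_by_factor[f], actual_gaps)["absolute_error"],
    )


# Toy gap values (hypothetical), keyed by the speed factor of each run.
actual = [50.0, 40.0, 30.0, 20.0]
runs = {
    1.0: [38.0, 30.0, 22.0, 14.0],
    1.25: [46.0, 37.0, 28.0, 19.0],
    1.5: [51.0, 41.0, 29.0, 21.0],
    1.75: [60.0, 48.0, 36.0, 24.0],
}
unfactored = fit_errors(runs[1.0], actual)
chosen = best_factor(runs, actual)
```

A negative average error corresponds to a run that underestimates the observed gaps, which is the pattern the unfactored runs showed.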

Gender
The data were investigated based on gender to identify any difference in perception between male and female pedestrians. For the male pedestrians, the normal distribution runs with a factor of 1.5 resulted in the lowest absolute error, followed by a factor of 1.25, as shown in Figure 5. For the female pedestrians, the normal distribution runs with a factor of 1 gave the lowest absolute error, followed by a factor of 1.25. Comparing the male and female results may imply that the female pedestrians in this study are more prone to taking risks than the male pedestrians. Another interpretation is that females have a better ability to discern the vehicle speed and thus the critical gap. A third hypothesis is that females expect drivers to slow down for them and consider it the responsibility of the driver to respond to the pedestrian when crossing. It is worth noting that the female sample in the study is small, as presented in Table 1, which may affect the reliability of these conclusions. This small female sample is normal and expected in this region, as indicated by previous research [29][30][31].

Age
The pedestrian age group was estimated according to observer judgment. To minimize errors, three broad groups were used: old-age, middle-age, and children. For both old and middle-age pedestrians, the normal distribution runs with a factor of 1.5 gave the lowest absolute error, followed by a factor of 1.25, as shown in Figure 6. It appears that there is no significant difference in the perception of old and middle-age pedestrians. It should be noted that the sample size of old-age pedestrians is small when compared to the middle-age pedestrians, as shown in Table 1, which is expected in the region [29][30][31].

Type of Clothing
To compare the behavior of the pedestrians based on their nationality, the type of clothing was used as a proxy for the nationality of the pedestrian. The local Qataris wear a traditional dress that is easy to identify, while expatriates dress in regular clothes [32]. For pedestrians with traditional clothing, the normal distribution runs with a factor of 1.25 gave the lowest absolute error, followed by a factor of 1.5, as shown in Figure 7. For pedestrians with regular clothing, the normal distribution runs with a factor of 1.5 gave the lowest absolute error, followed by a factor of 1.25. It appears that pedestrians with traditional clothing are more likely to add a factor less than 1.25, while pedestrians with regular clothing are more likely to add a factor greater than 1.25. In other words, Qataris are taking higher risks. This can be attributed to the social status gained by most Qatari pedestrians and, thus, the expectation that the drivers will take the appropriate measures to avoid a crash. The difference, however, is very slight, and thus a confident interpretation is difficult to assert.

Carrying Bags
Nearly 20% of pedestrians were seen carrying bags or luggage while crossing the road. Based on the absolute error, the normal distribution runs with a factor of 1.5 gave the best results, followed by a factor of 1.25, for pedestrians both with and without bags, as shown in Figure 8. Both groups therefore appear to apply a similar factor before evaluating the gap and deciding whether to complete the crossing. The results suggest that there is no difference in perception between pedestrians carrying bags and those who are not.
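The decision rule behind these factors can be illustrated with a minimal sketch. It assumes that the factor multiplies the actual vehicle speed and that a gap is accepted when the perceived time gap is at least the time needed to cross; the names `Vehicle`, `perceived_gap_s`, and `accepts_gap`, and all numeric values, are hypothetical and are not taken from the model used in the study.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    distance_m: float  # distance from the vehicle to the crossing point
    speed_mps: float   # actual vehicle speed


def perceived_gap_s(vehicle: Vehicle, risk_factor: float) -> float:
    """Time gap as perceived by the pedestrian.

    Assumption: the risk factor inflates the vehicle speed multiplicatively.
    """
    perceived_speed = vehicle.speed_mps * risk_factor
    return vehicle.distance_m / perceived_speed


def accepts_gap(vehicle: Vehicle, risk_factor: float,
                crossing_distance_m: float, walking_speed_mps: float) -> bool:
    """Accept the gap if the perceived gap covers the crossing time.

    Assumption: the pedestrian walks at a constant speed.
    """
    crossing_time_s = crossing_distance_m / walking_speed_mps
    return perceived_gap_s(vehicle, risk_factor) >= crossing_time_s


# Hypothetical scenario: a vehicle 100 m away travelling at 15 m/s,
# and a 7 m crossing walked at 1.4 m/s (about 5 s to cross).
car = Vehicle(distance_m=100.0, speed_mps=15.0)
for factor in (1.25, 1.5):
    print(factor, accepts_gap(car, factor, 7.0, 1.4))
```

In this hypothetical scenario the same physical gap is accepted with a factor of 1.25 and rejected with a factor of 1.5, which is why a smaller factor corresponds to accepting shorter gaps.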

Using Mobile Phones
Using mobile phones for texting or talking while crossing is a major concern for traffic safety. The data indicated that only a few pedestrians (2%) were using a mobile phone while crossing the road. Based on the absolute error, the normal distribution runs with a factor of 1.5 gave the best results, followed by a factor of 1.25, both for pedestrians using a mobile phone while crossing and for those not using one, as shown in Figure 9. The average error suggests that there is no difference in perception between pedestrians using a phone while crossing and those who are not.

Crossing in a Group
Each pedestrian was classified as crossing individually or accompanied by other pedestrians in order to investigate whether perception changes when crossing in a group. As shown in Figure 10, the average error suggests that pedestrians in a group are likely to add a factor less than 1.25 to the critical gap, as opposed to a factor greater than 1.25 if the pedestrian is alone. This can be attributed to a higher sense of safety when crossing as a group.
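The comparison used throughout this section, in which candidate factors are ranked by the absolute error between simulation and observation, can be sketched as follows. This is only an illustration: the function names and all numbers are hypothetical, and the quantity being compared (written here as a generic list of values) is an assumption, since the exact error definition is given elsewhere in the paper.

```python
def mean_absolute_error(observed, simulated):
    """Average absolute difference between paired observed and simulated values."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def rank_factors(observed, simulated_by_factor):
    """Return (factor, error) pairs sorted from best (lowest error) to worst."""
    errors = {
        factor: mean_absolute_error(observed, simulated)
        for factor, simulated in simulated_by_factor.items()
    }
    return sorted(errors.items(), key=lambda item: item[1])


# Hypothetical values for one pedestrian subgroup (not data from the study).
observed = [4.0, 5.0, 6.0]
simulated_by_factor = {
    1.0: [3.0, 4.0, 5.0],
    1.25: [4.5, 5.5, 6.5],
    1.5: [4.0, 5.25, 6.0],
}

for factor, error in rank_factors(observed, simulated_by_factor):
    print(f"factor {factor}: mean absolute error {error:.3f}")
```

With these made-up values the factor of 1.5 ranks first and 1.25 second, mirroring the "best, followed by" wording used for the subgroups above.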

Kolmogorov-Smirnov Test
To statistically test the results, a Kolmogorov-Smirnov test was used. This is a nonparametric goodness-of-fit test of the hypothesis that two populations have the same distribution. Let $x_1, \ldots, x_m$ be the observations from the simulation of the first group with cumulative distribution function (CDF) $F_1$, and let $y_1, \ldots, y_n$ be the observations from the simulation of the second group with CDF $F_2$. The null hypothesis is presented in Equation (2).
The Kolmogorov-Smirnov test statistic is defined in Equation (3).
The hypothesis is rejected if the test statistic, D, is greater than the Kolmogorov-Smirnov critical value calculated using the functions presented in Equations (4)-(6) [33,34].
where $S(D)$ = level of significance, $N_e$ = effective number of data points, $N_1$, $N_2$ = number of data points in the two distributions, and $Q_{KS}$ = monotonic function for computing the significance level.
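For reference, the standard two-sample formulation that is consistent with the symbols defined above can be written as follows. This is the commonly used textbook form, including the approximation constants 0.12 and 0.11; it is assumed, not confirmed, to correspond to Equations (2)-(6), and $\hat{F}_1$, $\hat{F}_2$ denote the empirical CDFs of the two samples.

```latex
\begin{align*}
H_0 &: F_1(x) = F_2(x) \quad \text{for all } x \\
D &= \sup_x \left| \hat{F}_1(x) - \hat{F}_2(x) \right| \\
S(D) &= Q_{KS}\!\left( \left[ \sqrt{N_e} + 0.12 + \frac{0.11}{\sqrt{N_e}} \right] D \right) \\
Q_{KS}(\lambda) &= 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 \lambda^2} \\
N_e &= \frac{N_1 N_2}{N_1 + N_2}
\end{align*}
```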
The results are presented in Table 3. Because the computed p-values are greater than the significance level of 0.05, the null hypothesis $H_0$ cannot be rejected in the case of age, type of clothing, carrying bags, and using mobile phones at the 95% confidence level.
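A two-sample Kolmogorov-Smirnov test of this kind can be sketched in a few lines of standard-library Python. The sketch assumes the widely used asymptotic approximation for the significance level, with an effective sample size $N_e = N_1 N_2 / (N_1 + N_2)$ and the series $Q_{KS}(\lambda) = 2 \sum_j (-1)^{j-1} e^{-2 j^2 \lambda^2}$; whether this matches the exact procedure behind Table 3 is an assumption, and the sample values are hypothetical.

```python
import math


def ks_statistic(xs, ys):
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < m and j < n:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every observation equal to v (handles ties).
        while i < m and xs[i] <= v:
            i += 1
        while j < n and ys[j] <= v:
            j += 1
        d = max(d, abs(i / m - j / n))
    return d


def q_ks(lam, max_terms=100):
    """Series for the significance level; decreases monotonically from 1 to 0."""
    if lam < 0.2:
        return 1.0  # the series is effectively 1 here and converges slowly
    total = 0.0
    for j in range(1, max_terms + 1):
        term = 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)


def ks_two_sample(xs, ys):
    """Return (D, approximate p-value) for the two-sample test."""
    d = ks_statistic(xs, ys)
    n_e = len(xs) * len(ys) / (len(xs) + len(ys))  # effective number of points
    root = math.sqrt(n_e)
    return d, q_ks((root + 0.12 + 0.11 / root) * d)


# Hypothetical samples: two clearly separated groups.
d, p = ks_two_sample([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(f"D = {d:.2f}, p = {p:.4f}, reject H0 at 0.05: {p < 0.05}")
```

The approximation is intended for reasonably large samples; for very small groups an exact method would be preferable.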

Conclusions
An ABM was developed to model different pedestrian attributes at an unmarked midblock crossing. The study confirmed that the ABM is capable of generating pedestrian movement profiles that match reality to an extent. The model developed provided an accurate and robust duplication of the actual pedestrian motion. Different pedestrian attributes were investigated, including gender, age, type of clothing, carrying bags, using mobile phones, and crossing in a group.
The results showed that pedestrians add a positive risk factor to the speed of approaching vehicles before evaluating a gap, then proceed with the crossing decision. This factor was smaller in the case of female pedestrians, which may imply that they are more prone to taking risks during crossing when compared with men. Another explanation can be that they have a better ability to discern vehicle speed and thus assess the critical gap. The factor was also smaller for pedestrians crossing in a group, which can be an indication that pedestrians have a higher sense of safety when crossing as a group. The analysis also suggested that there is no difference in perception between old and middle-age pedestrians, pedestrians carrying bags or not, and pedestrians using a mobile phone while crossing or not. It should be noted that the risk factors identified in this study are applicable only to the conditions studied. However, the methodology can be replicated to identify the risk factors for any other conditions.
Identifying these factors for different pedestrian attributes can play a major role in several applications, such as testing and validating autonomous driving systems and advanced driver assistance systems. It can also help researchers, public agencies, and decision-makers better understand the behavior of pedestrians and compare scenarios related to pedestrian treatments in urban areas. Taking the planning and design of uncontrolled intersections as an example, the risk factor can be used to study and compare different alternatives in order to select the type of uncontrolled midblock crossing that pedestrians perceive as safest. The outcome can also be useful in developing more reliable simulation models. Furthermore, the study provides guidance for educational programs and driver training. The results indicate that some pedestrians take more risks than others. It is, therefore, necessary to develop specific training programs aimed at pedestrians facing higher risks to increase their awareness and promote safer road crossing practices.
Despite the study's merits, some limitations should be noted. The amount of data collected for several demographic subsets was low. However, the low percentage of children, locals, and female pedestrians in the sample is common in the region due to its unique demographics, social characteristics, and infrastructure conditions [35,36]. In addition, pedestrians were assumed to move at constant speeds, an assumption that does not affect the average time required to cross. This assumption was based on the low percentage of pedestrians changing speeds in this dataset. However, future work should attempt to consider more realistic pedestrian movements by capturing the time dynamics of occupancy and spatial density for pedestrians, which may vary in other conditions [37,38]. Furthermore, in this study, the pedestrian attributes were extracted from the videos based on visual appearance. To minimize errors, a combination of video recording and intercept interviews in the field should be considered. In such interviews, pedestrians who cross illegally would be stopped briefly to verify personal attributes such as age.
Future studies could focus on increasing the accuracy of the risk factors. For example, in this study, three simulation runs were conducted, and the average was used for each scenario. Increasing this number to 10 or more runs may improve the accuracy of the results. In addition, the risk factors investigated in this study were determined by testing specific values for the factors. Other statistical methods, such as computing the residual error with the least-squares method from the observed and simulated results, may yield more accurate factors. Additionally, some variables were not considered in the study, such as vehicle types and weather conditions. These variables could not be assessed because the data were collected in good weather conditions and on a road segment where heavy trucks are not allowed. However, their effect should be investigated in the future.
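The expected benefit of additional runs can be illustrated with the standard error of the mean. Assuming the runs of a scenario are independent and share the same run-to-run standard deviation $s$ (an assumption, not something measured in this study), the uncertainty of the averaged error over $n$ runs and the effect of moving from 3 to 10 runs are:

```latex
\begin{align*}
\mathrm{SE}(\bar{e}_n) &= \frac{s}{\sqrt{n}} \\
\frac{\mathrm{SE}(\bar{e}_{10})}{\mathrm{SE}(\bar{e}_{3})} &= \sqrt{\frac{3}{10}} \approx 0.55
\end{align*}
```

Under these assumptions, 10 runs would reduce the sampling uncertainty of each scenario average by roughly 45% relative to three runs.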
Funding: This publication was made possible by an NPRP award [NPRP 4-1170-2-456] from the Qatar Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the author.