Methodology for Identifying Optimal Pedestrian Paths in an Urban Environment: A Case Study of a School Environment in A Coruña, Spain

Abstract: Improving urban mobility, especially pedestrian mobility, is a current challenge in virtually every city worldwide. To calculate the least-cost paths and safer, more efficient routes, it is necessary to understand the geometry of streets and their various elements accurately. In this study, we propose a semi-automatic methodology to assess the capacity of urban spaces to enable adequate pedestrian mobility. We employ various data sources, but primarily point clouds obtained through a mobile laser scanner (MLS), which provide a wealth of highly detailed information about the geometry of street elements. Our method allows us to characterize preferred pedestrian-traffic zones by segmenting crosswalks, delineating sidewalks, and identifying obstacles and impediments to walking in urban routes. Subsequently, we generate different displacement cost surfaces and identify the least-cost origin–destination paths. All these factors enable a detailed pedestrian mobility analysis, yielding results on a raster with a ground sampling distance (GSD) of 10 cm/pix. The method is validated through its application in a case study analyzing pedestrian mobility around an educational center in a purely urban area of A Coruña (Galicia, Spain). The segmentation model successfully identified all pedestrian crossings in the study area without false positives. Additionally, obstacle segmentation effectively identified urban elements and parked vehicles, providing crucial information to generate precise friction surfaces reflecting real environmental conditions. Furthermore, the generation of cumulative displacement cost surfaces allowed for identifying optimal routes for pedestrian movement, considering the presence of obstacles and the availability of traversable spaces. These surfaces provided a detailed representation of pedestrian mobility, highlighting significant variations in travel times, especially in areas with high obstacle density, where differences of up to 15% were observed.
These results underscore the importance of considering obstacles’ existence and location when planning pedestrian routes, which can significantly influence travel times and route selection. We consider the capability to generate accurate cumulative cost surfaces to be a significant advantage, as it enables urban planners and local authorities to make informed decisions regarding the improvement of pedestrian infrastructure.


Introduction
The enhancement of urban pedestrian mobility stands as one of the Sustainable Development Goals (SDGs) outlined in the United Nations' 2030 Agenda [1], specifically goal 11: "Make cities inclusive, safe, resilient and sustainable". Urban mobility adaptation is directed towards enhancing the efficiency and safety of travel for pedestrians, a particularly vulnerable group in urban settings. Both children and pedestrians over the age of 65 are considered vulnerable people, as indicated in studies such as Agarwala and Vasudevan (2020) [2], Lord et al. (2018) [3], and Campisi et al. (2022) [4]. In this context, the daily commute of students to educational institutions represents one of the most critical scenarios in terms of road safety because its main participants are children, often accompanied by elderly people such as grandparents, as shown in Fernández-Arango et al. (2022) [5]. School entry and exit times are particularly critical moments as they often result in significant traffic congestion around educational centers, corroborating the results of studies like Deluka-Tibljaš et al. (2021) [6], which demonstrate that school environments are among the areas in cities where the most children are injured. In this regard, studying road infrastructure for people with reduced mobility is crucial because users' perceptions and experiences vary with the street environment. Streets with wider or pedestrian-specific sidewalks offer a safer and more pleasant walking environment than narrow sidewalks, reducing the likelihood of accidents and facilitating traffic flow. Obstacles such as street furniture and vehicles parked around the sidewalks significantly complicate mobility.
Therefore, possessing a detailed understanding of the routes students utilize is paramount, as it enables the formulation of appropriate mobility policies tailored to the urban environment of each school. This entails comprehending the characteristics of roads, their geometry, urban furnishings, and equipment, and the mobility potential of each sidewalk and public space that may be utilized in pedestrian routes.
Having data on student concentration in specific areas or on particular routes provides fundamental information for planning efficient pedestrian pathways. School administrators and educational authorities possess information regarding students' residences, and the calculation of optimal travel routes is one of the most valuable tools of a GIS (Collischon et al., 2000) [7]. However, analyzing this information alongside other data sources can entail complex georeferencing processes and require specialized personnel for its acquisition and analysis. In this regard, the decision-making process for the design of school pedestrian routes in Spain is generally carried out considering students' residences based on manually recorded information in their enrollment forms, typically prioritizing proximity to the school, as evidenced in initiatives such as Camino Escolar Seguro, Stop Accidentes, Caminos Escolares, Teachers for Future Spain, and International Walk to School [8][9][10][11][12].
Most related studies we have reviewed consider factors such as slope (Hosseini et al., 2023) [13] to identify optimal pedestrian routes. Some, like Massin et al. (2022) [14], analyze the influence of obstacles present in routes, while González-Collado et al. (2024) [15] employ a combination of data acquired through an MLS and HMLS (handheld mobile laser scanner) to identify multiple elements existing along two kilometers of urban roadway. Incorporating obstacles in urban spaces seems necessary in studies of pedestrian mobility in cities.
This work highlights that traditional approaches based solely on origin-destination proximity and terrain inclination are insufficient to guarantee efficient pedestrian routes. Incorporating detailed data on obstacles' locations and free walking space availability is crucial for optimizing pedestrian mobility. In this regard, our work combines data from mobile laser scanning (MLS) with AI and GIS techniques that have demonstrated great capacity to understand pedestrian mobility in urban environments. The methodology developed has allowed the identification of obstacles to characterize in detail the most walkable pedestrian crossing areas on sidewalks and crosswalks, significantly improving the accuracy of studies based solely on origin-destination distance. Our method can generate cumulative travel cost surfaces and identify the least-cost routes in any urban area using a set of raster layers with a 10 cm/pixel spatial resolution. The results demonstrate that pedestrian-specific zones and streets with wider and more open spaces significantly improve city pedestrian mobility.
To validate the method, a case study has been conducted in the urban environment of A Coruña, analyzing a total of 3960.66 m and calculating different accumulated displacement cost surfaces and least-cost paths from 30 student households to the educational center itself.
This paper is organized as follows: Section 2 describes the study area, the materials used, and the proposed methodology. Section 3 presents the results obtained. Section 4 discusses the results, and Section 5 describes the study's main findings.

Test Site and Method Overview
Although the proposed method is general enough to be applied in any urban environment, it specifically focuses on pedestrian mobility analysis in urban areas. For this reason, ten streets in the city of A Coruña (Galicia, Spain) were selected as a case study, characterized by a purely urban structure consisting of one or multiple lanes, curbs, sidewalks, and buildings. Additionally, nearly all streets had parked vehicles and a significant amount of urban furniture, adding complexity to the study and making LiDAR point cloud segmentation and accessibility analysis more challenging. Table 1 describes some metrics of the analyzed streets, while Figure 1 depicts the study area.
To generate the pedestrian accessibility model, four general phases were followed, as outlined in Figure 2. Phase 1 involved creating initial information surfaces with parameters influencing urban pedestrian mobility, derived from data acquired from various sources (LiDAR, OSM, Cadastre, etc.). Phase 2 consisted of generating different friction surfaces for displacement, where each cell represents a unit cost of movement. Phase 3 involved generating an accumulated displacement cost surface, where each cell represents the travel time to a destination point. Finally, phase 4 encompassed model validation tests.

Pedestrian Mobility Zone: Sidewalks and Crosswalks
With the aim of promoting safe pedestrian movement, we adjusted the space to be analyzed for pedestrian transit, restricting walking to specific areas designated for pedestrians. To achieve this, a region composed of sidewalks and crosswalks was generated, forming the basis for the rest of the friction surfaces.
The delineation of sidewalks was achieved by rasterizing vector cartography from the A Coruña City Council with a GSD of 10 cm/pix and contrasting its accuracy with information obtained from OpenStreetMap and the Spanish Cadastre Website.
The segmentation and geolocation of crosswalks were performed automatically by training the computer vision model YOLOv8m-seg. For this purpose, images generated from intensity values of LiDAR point clouds were employed, previously processed through the MLS point cloud processing pipeline prepared for this study and available at github-dfarango.

Crosswalk Dataset and Model Training
The first step in training the model was to label the crosswalks, distinguishing between two types, crosswalk A and crosswalk B, as their behavior in computer vision varies significantly. Figure 3 illustrates an example of each type of crosswalk. For labeling, the web application Roboflow was utilized [16]. The generated dataset can be viewed and downloaded at Roboflow_Universe/dfarango, and its metrics are presented in Table 2. With this dataset, the YOLOv8m-seg model from Ultralytics was trained on the Kaggle platform, which allows the use of customized Jupyter Notebooks and limited free access to GPUs. Table 3 presents the hyper-parameter values used to train the YOLOv8 model. The complete code for this training and an explanation of the selected parameters can be viewed and downloaded at Kaggle/davidfarango.

Validation of the Segmenter
To validate the system's performance, some of the most common metrics in object detection and segmentation were employed: precision, recall, F1-score, and mean average precision (mAP). These metrics are related to the concept of IoU (intersection over union), which is utilized to quantify the degree of overlap between the predicted boundary and the ground truth (Figure 4). In our dataset, as in many others, a pre-defined IoU threshold of 0.5 is set to classify whether a prediction is deemed a true positive or a false positive. Precision (1) measures how accurate the predictions are and represents the ratio of true positives to all predicted positives, while recall (2) measures how well all positives are found. The F1-score (3) is calculated as the harmonic mean of the precision and recall scores, and mean average precision (mAP) provides a joint analysis of precision and recall, indicating the average precision values for all recall values between 0 and 1, where AP (average precision) is the area under the precision-recall curve (4); see, e.g., SciKit Learn, mAP for Object Detection, and Sánchez-Alor (2020) [17][18][19][20][21][22].
$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
$$F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$
$$AP = \int_0^1 p(r)\, dr \tag{4}$$
where TP = true positive, FP = false positive, FN = false negative, and p(r) is the precision expressed as a function of the recall r.
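The metrics above can be illustrated with a minimal Python sketch. It assumes segmentation masks are represented as sets of (row, col) pixel coordinates; the function names are hypothetical and are not taken from the study's code.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mask_iou(pred: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection over union of two masks given as sets of pixels."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 0.0


def is_true_positive(pred: Set[Pixel], truth: Set[Pixel], threshold: float = 0.5) -> bool:
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return mask_iou(pred, truth) >= threshold
```

With an IoU threshold of 0.5, a predicted mask that shares two of four combined pixels with the ground truth sits exactly on the boundary and is accepted in this sketch; whether the boundary case is inclusive is an assumption.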

Inferences and Results
Once the model was trained, inference was performed on the 61 images using the neural network, with the weights adjusted during training, to obtain the segmentation of the crosswalks. Subsequently, to transfer these inferences to a GIS, pixel values were reclassified, assigning a value of '1' to pixels belonging to crosswalks (both type A and type B) and '0' to the rest of the pixels in the image. The script that performs this process is indexReclass.py, which generates a normalized index using the Red and Green bands (I = (R − G)/(R + G)) and reclassifies all values other than 0 to a value of '1'.
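The reclassification step can be sketched as follows. This is not the indexReclass.py script itself, only an illustration of the index it is described as computing; the function name and the handling of pixels where R + G = 0 (treated as background) are assumptions.

```python
def reclass_index(red, green):
    """Build a binary mask from two bands given as equally sized 2D lists.

    A pixel gets 1 when the normalized index I = (R - G) / (R + G) is non-zero
    and 0 otherwise.
    """
    mask = []
    for red_row, green_row in zip(red, green):
        out_row = []
        for r, g in zip(red_row, green_row):
            total = r + g
            # Assumption: pixels with R + G = 0 are treated as background.
            index = (r - g) / total if total else 0.0
            out_row.append(1 if index != 0 else 0)
        mask.append(out_row)
    return mask
```

Pixels with equal Red and Green values yield I = 0 and stay in the background class, so the approach relies on the segmentation overlay being drawn in a color whose Red and Green components differ.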

DTM and Obstacles
To obtain a digital terrain model (DTM) of the study area and the location of all obstacles hindering pedestrian movement, the algorithm proposed in a previous study (Fernández-Arango et al., 2022) [23] was employed. This algorithm allowed the generation of a DTM raster from LiDAR point clouds acquired through MLS. Initially, the algorithm segmented the LiDAR point cloud into ground points and non-ground points. Subsequently, a DTM with a GSD of 10 cm/pix was generated, with each pixel value representing the mean elevation of the points within each cell of the cloud. Additionally, interpolation was employed for occluded areas to ensure a continuous DTM along the entire street.
A digital surface model (DSM) of the entire street was generated for the detection and geolocation of permanent obstacles, and height-above-ground (HAG) values were calculated for the entire point cloud.Any group of points with HAG values ranging between 25 and 220 cm was considered a pedestrian impedance obstacle.
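The height-above-ground rule can be sketched in a few lines of Python. The sketch assumes that the ground elevation beneath each point is already known (for example, sampled from the DTM) and that the 25 to 220 cm bounds are inclusive; neither detail is specified in the text, and the function name is hypothetical.

```python
def flag_obstacle_points(points, ground_z, min_hag=0.25, max_hag=2.20):
    """Flag points whose height above ground (HAG) lies in [min_hag, max_hag] metres.

    points   : iterable of (x, y, z) coordinates in metres
    ground_z : ground elevation under each point, in the same order
    """
    flags = []
    for (_x, _y, z), gz in zip(points, ground_z):
        hag = z - gz  # height above ground for this point
        flags.append(min_hag <= hag <= max_hag)
    return flags
```

Points below 25 cm (e.g., curbs or low vegetation) and above 220 cm (e.g., overhanging signs or tree canopies) are left unflagged because they would not block a walking pedestrian under this rule.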

Pedestrian Segmentation
While some segmentation tasks can be reduced to a 2D problem (e.g., segmentation of crossing zones), differentiating pedestrians from other objects in urban contexts demands an accurate 3D characterization due to the extensive variety of objects (e.g., street lights, urban trees, or garbage containers). Thus, we formulated pedestrian segmentation as a binary classification problem where each point must be classified as pedestrian or non-pedestrian. More concretely, we computed multiscale features from the 3D point cloud to train a random forest model for binary point-wise classification (Thomas et al., 2018; Weinmann et al., 2017) [24,25]. Finally, we generated a GeoTIFF raster with a cell size of 10 cm from the point cloud that is well suited for our later path-finding algorithm. Each cell in the raster can be seen as a binary mask that specifies whether there are pedestrians or not. All the previous computations were carried out using the open-source VirtuaLearn3D framework [26] for artificial intelligence applied to point clouds.
For the computation of the multiscale geometric features, we considered spherical neighborhoods with the following radii: 12.5 cm, 25 cm, 50 cm, 75 cm, 1 m, 2 m, 3 m, and 5 m. For each spherical neighborhood, we computed the linearity, planarity, sphericity, surface variation, omnivariance, verticality, anisotropy, eigenentropy, and the sum of eigenvalues.
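The nine geometric features listed above are typically derived from the eigenvalues of the covariance matrix of each neighborhood. The sketch below uses definitions that are common in the point cloud literature; the source does not spell out its exact formulas, so the expressions, the normalization of eigenvalues, and the function name are assumptions.

```python
import math


def eigen_features(l1, l2, l3, normal_z):
    """Eigenvalue-based features for one neighborhood.

    l1 >= l2 >= l3 >= 0 are the eigenvalues of the neighborhood covariance matrix
    (with l1 > 0); normal_z is the vertical component of the unit normal vector.
    """
    total = l1 + l2 + l3
    # Normalize eigenvalues so that they sum to one (assumed convention).
    e1, e2, e3 = l1 / total, l2 / total, l3 / total
    return {
        "linearity": (e1 - e2) / e1,
        "planarity": (e2 - e3) / e1,
        "sphericity": e3 / e1,
        "surface_variation": e3 / (e1 + e2 + e3),
        "omnivariance": (e1 * e2 * e3) ** (1.0 / 3.0),
        "verticality": 1.0 - abs(normal_z),
        "anisotropy": (e1 - e3) / e1,
        "eigenentropy": -sum(e * math.log(e) for e in (e1, e2, e3) if e > 0),
        "sum_of_eigenvalues": total,
    }
```

Under these definitions linearity, planarity, and sphericity sum to one, which makes them easy to interpret as the degree to which a neighborhood resembles a line, a plane, or a volume.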
We also computed the distance to the lowest point (floor distance) considering 2D rectangular neighborhoods on the horizontal plane centered on support points uniformly distributed inside the smallest axis-aligned bounding box containing the point cloud. In this way, we considered cell sizes of 10 m and 50 m. As we used MLS point clouds in our study, we also considered the available spectral information (intensity) and the scan angle associated with each point. These two features, together with the two point-wise floor distances, were transformed by mean-based smooth filters in spherical neighborhoods with radii of 25 cm, 1 m, and 3 m.
Due to the large number of features (88), we conducted ablation studies to determine to what extent the features are necessary. First, we evaluated the model trained without smooth features. Then, we evaluated the model trained without smooth features and raw height features, i.e., considering geometric features only. Finally, we considered all the features but iteratively discarded the geometric features of the maximum radius. As the biggest neighborhoods lead to the most significant computational burden, this last ablation study effectively analyzes the upper bound of data mining regarding computational cost. All the evaluations for the ablation studies were carried out through stratified k-folding (with k = 5).
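The total of 88 features can be reproduced with a short count. The breakdown below is one reading of the feature description that matches the stated total; the variable names are illustrative only.

```python
# Geometric features: 9 descriptors for each of the 8 spherical radii.
n_radii = 8
n_geometric_per_radius = 9
geometric = n_radii * n_geometric_per_radius

# Raw point-wise features: two floor distances (10 m and 50 m cells),
# intensity, and scan angle.
raw = 2 + 1 + 1

# Smoothed features: each raw feature filtered at three radii (25 cm, 1 m, 3 m).
smoothed = raw * 3

total = geometric + raw + smoothed

# Reduced set reported later in the paper: the two raw floor distances
# plus all smoothed features.
reduced = 2 + smoothed
```

Under this reading, the geometric descriptors account for 72 of the 88 features, which is consistent with the reduced 14-feature configuration reported in the results.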
We opted for the active learning paradigm to train our random forest model because our point clouds are not labeled (Yang and Loog, 2016) [27]. With this paradigm, the first iteration requires a manually segmented initial budget. After this first iteration, we measured the point-wise entropy as defined in Equation (5).
$$e_i = -\sum_{j=1}^{n_c} z_{ij} \log z_{ij} \tag{5}$$
In this equation, $e_i$ refers to the entropy of the i-th point, $n_c$ is the number of classes (in our case $n_c = 2$), and $z_{ij}$ is the probability that the i-th point belongs to the j-th class. Subsequent iterations were focused on extending the training dataset with information from high-entropy regions, enabling the training of the model with just a few hours of work from the oracle, i.e., the human that reviews high-entropy points to decide whether they must be labeled as a pedestrian or not.
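The entropy-driven selection can be sketched as below, assuming Equation (5) is the Shannon entropy of the per-point class probabilities. The logarithm base (2 here) and the helper names are assumptions, not details confirmed by the text.

```python
import math


def pointwise_entropy(probs):
    """Shannon entropy (base 2, assumed) of one point's class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(prob_rows, budget):
    """Return the indices of the `budget` points with the highest entropy."""
    entropies = [pointwise_entropy(row) for row in prob_rows]
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return order[:budget]
```

For two classes the entropy peaks when both probabilities equal 0.5, so the points handed to the oracle are those where the classifier is least able to decide between pedestrian and non-pedestrian.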
We applied hyperparameter tuning with stratified k-folding (using k = 5) to train our models for two reasons: (1) to avoid the typical active learning pitfall of neglecting the classifier's configuration (Lüth et al., 2023) [28]; and (2) to analyze the model generalization in terms of its mean accuracy and standard deviation on different data splits. More concretely, we applied hyperparameter tuning twice: once with the initial budget and once with the final training dataset. Our experiments considered the number of binary decision trees, the maximum depth for each decision tree, and the class weight strategy as hyperparameters for grid search optimization.
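Stratified k-folding keeps the class proportions roughly equal in every fold, which matters when pedestrians are a small minority of the points. The following is a minimal standard-library sketch of the idea; it is not the implementation used by the authors, and the function name and seeding are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds
```

Each fold then serves once as the validation split while the remaining folds are used for training, and the mean and standard deviation of the scores across the k runs summarize how stable a given hyperparameter configuration is.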
The number of decision trees is an essential hyperparameter because it significantly impacts the compromise between the robustness against overfitting and the execution time (especially the time for model training). The maximum depth of each tree is another relevant hyperparameter. On the one hand, too much depth means the model might become prone to overfitting. On the other hand, insufficient depth means the model cannot exploit interactions between features to achieve a successful classification. For the class weight strategy, we considered uniform weights (all classes have weight one) and a balanced alternative, where each class is weighted such that $w_i = n_s (n_c \cdot n_i)^{-1}$. In this strategy, $w_i$ is the weight for the i-th class, $n_c$ is the number of classes, $n_s$ is the number of samples used to train a given decision tree, and $n_i$ is the number of samples belonging to the i-th class on the split of data used to train a particular decision tree. At most, a randomly selected 30% of the training dataset is considered for each decision tree.
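The balanced weighting formula can be checked with a small example. The sketch below applies w_i = n_s / (n_c · n_i) to a list of labels; the function name is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_s / (n_c * n_i)} for the given sample labels."""
    counts = Counter(labels)
    n_s = len(labels)   # number of samples
    n_c = len(counts)   # number of classes present
    return {cls: n_s / (n_c * n_i) for cls, n_i in counts.items()}
```

With 90 non-pedestrian and 10 pedestrian samples, the minority class receives a weight nine times larger than the majority class, so both classes contribute equally to the training criterion overall.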

Better-Walkability Areas
For a pedestrian, walking on a wide sidewalk free of obstacles is more pleasant, safe, and efficient in terms of cost-time of displacement than walking on a sidewalk full of obstacles that must be avoided. For this reason, we added to the study a surface of better-walkability areas, prioritizing the widest and obstacle-free passage zones. The approach was based on the notion that regions farthest from obstacles should have a lower movement cost than those closest to them. Additionally, higher cost values were assigned to areas near building walls and at the edges of sidewalks with roadways, as pedestrians tend to stay away from these regions to avoid potential obstacles on building facades or parked vehicles.
A raster was calculated to generate this friction surface for movement, with pixels having different penalty time values based on their proximity to one or more obstacles. Table 4 presents these penalty times. Through the joint analysis of all previously generated friction surfaces, several accumulated displacement cost surfaces were constructed. In these surfaces, each cell represents the total cost value to travel from that cell to the destination cell, moving along the optimal or least-cost displacement route. To generate these surfaces, Equation (6), proposed by Langmuir (1984) [29], was utilized, allowing for the estimation of displacement values based on the terrain slope, and subsequently, the rest of the friction surfaces were added.
$$T = a \cdot \Delta S + b \cdot \Delta H_1 + c \cdot \Delta H_2 + d \cdot \Delta H_3 \tag{6}$$
T represents the time spent on the journey, $\Delta S$ the distance traveled, $\Delta H_1$ the vertical distance traveled uphill, $\Delta H_2$ the vertical distance traveled downhill with moderate slope (5-12°), and $\Delta H_3$ the vertical distance traveled downhill on a steep slope (≥12°). The constants used are a = time (s) needed to walk 1 m on a horizontal surface, b = additional time (s) per meter of uphill elevation gain, c = additional time (s) per meter of elevation loss on moderate downhill slopes, and d = additional time (s) per meter of elevation loss on steep downhill slopes. The values assigned to the constants b, c, and d were those proposed by Langmuir (1984) [29]: b = 6.0, c = 1.9998, and d = −1.9998. However, a was assigned as 0.75 based on more recent studies such as Bosina [30][31][32][33], which better represent the walking speeds of all types of pedestrians.
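A single-segment version of Equation (6) can be sketched as follows, assuming it has the linear form T = a·ΔS + b·ΔH1 + c·ΔH2 + d·ΔH3. Two further assumptions are made and flagged in the comments: downhill elevation changes are treated as signed (negative) values, so that a positive c shortens the time on moderate descents and a negative d lengthens it on steep ones, and ΔS is treated as the horizontal distance when computing the slope angle. The text does not state either convention.

```python
import math


def walking_time(delta_s, delta_h, a=0.75, b=6.0, c=1.9998, d=-1.9998):
    """Estimated walking time in seconds for one segment.

    delta_s : distance travelled in metres (assumed horizontal, must be > 0)
    delta_h : signed elevation change in metres (assumed negative when descending)
    """
    if delta_s <= 0:
        raise ValueError("delta_s must be positive")
    slope_deg = math.degrees(math.atan2(abs(delta_h), delta_s))
    time = a * delta_s
    if delta_h > 0:
        time += b * delta_h            # uphill gain
    elif delta_h < 0 and slope_deg >= 12.0:
        time += d * delta_h            # steep downhill (>= 12 degrees)
    elif delta_h < 0 and slope_deg >= 5.0:
        time += c * delta_h            # moderate downhill (5 to 12 degrees)
    return time
```

Under these assumptions, 100 m on flat ground takes 75 s, the same distance with a 10 m climb takes 135 s, and a gentle descent below 5° is treated like flat ground.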
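To implement the remaining friction surfaces in the analysis and calculate the accumulated displacement cost for each cell, Equation (7) was utilized and implemented in the algorithm Rwalk [34].
$$\text{Total cost} = T + \lambda \cdot \text{friction cost} \cdot \Delta S \tag{7}$$
where T is the movement time given by Equation (6), $\Delta S$ is the distance traveled, and λ = 1 is a dimensionless scaling factor of the friction surface.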
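How a cumulative cost surface arises from per-cell costs can be illustrated with a small Dijkstra search over a raster. This is a simplified sketch and not the Rwalk implementation: it assumes flat terrain (so the movement time is a·ΔS), 8-connected moves, friction values expressed in seconds per metre, impassable cells marked with None, and the friction of the neighbouring cell applied to each step.

```python
import heapq
import math


def cumulative_cost(friction, dest, cell=0.1, a=0.75, lam=1.0):
    """Travel time in seconds from every cell to `dest` on a flat raster.

    friction : 2D list of penalties in s/m, with None for impassable cells
    dest     : (row, col) of the destination cell
    cell     : cell size in metres (10 cm, as in the study)
    """
    rows, cols = len(friction), len(friction[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    cost[dest[0]][dest[1]] = 0.0
    heap = [(0.0, dest)]
    while heap:
        current, (r, c) = heapq.heappop(heap)
        if current > cost[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if friction[nr][nc] is None:
                    continue  # obstacle cell
                step = cell * math.hypot(dr, dc)
                # Movement time plus scaled friction over the step length.
                new_cost = current + a * step + lam * friction[nr][nc] * step
                if new_cost < cost[nr][nc]:
                    cost[nr][nc] = new_cost
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    return cost
```

The least-cost path from any origin is then obtained by repeatedly stepping to the neighbouring cell with the lowest cumulative value until the destination is reached. Note that this sketch allows diagonal moves between two obstacle cells, which a production implementation may want to forbid.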
To validate the model, pedestrian accessibility tests were conducted in five different scenarios, varying the pedestrian's mobility flexibility in each case study. Thirty student residences were randomly chosen as origin points, with the educational center Fogar de Sta. Margarida in A Coruña selected as the destination. The following tests were conducted: The friction surfaces from tests 2 and 3 were used, and another one of better-walkability areas concerning obstacles and pedestrians was added, using the temporal penalties described in Table 4.
In Table 5, the friction surfaces used in each of the conducted tests are summarized.

Pedestrian Mobility Zone: Sidewalks and Crosswalks
All pedestrian crossings needed to determine pedestrian routes were successfully identified in the streets studied (Figure 1). For validating the model inferences, the obtained segmentations were compared with a ground truth of the study region, which had previously been utilized in other works such as Fernández-Arango et al. (2022) and Esmorís et al. (2023) [23,35]. A total of 40 pedestrian crossings were identified, corresponding to 10 streets in the vicinity of the Fogar de Sta. Margarida school. As depicted in Figure 5, the crossings were manually digitized, and the number of crossings segmented by YOLOv8m-seg was compared against this ground truth. The aim of this comparison was to validate the model, as there is an interest in extending the study to other areas lacking pedestrian crossing data. The algorithm successfully detected all pedestrian crossings in the study area without yielding any false positives. Additionally, the total surface area of all segmented pedestrian crossings was compared with that of the digitized ones. In this case, the total surface area of the digitized crossings was 1941.81 m² compared to 2477.89 m² of the crossings segmented by the model, resulting in a 27.61% larger segmented area. This ensured, for our study, that no potential pedestrian route space was overlooked. Figure 6 provides a detailed comparison of some segmented pedestrian crossings and their corresponding ground truths.
To validate the effectiveness of the pedestrian crossing segmenter, precision-recall (P-R) and F1-confidence values were analyzed. Figure 7 illustrates the model's performance based on the P-R and F1-confidence graphs. As observed, for crosswalk class A, an identification value of 0.904 was obtained, meaning that when the model predicted the presence of crosswalk A, 90.4% of those predictions were correct relative to all predictions of crosswalk A made by the model. For crosswalk class B, the identification value was even higher, at 99.5%. This indicates exceptionally high performance by the model in predicting the presence of crosswalks in class B, with a very high precision rate. For all classes combined, the identification value was 94.9% for an IoU threshold of 0.5, thus indicating that, overall, the model achieved high precision in detecting objects in all classes considered together. These results demonstrate that the model can accurately identify both types of crosswalks, although there may be variations in performance between individual classes. The high precision in detecting crosswalks in class B is particularly notable, suggesting that the model is especially effective in identifying this specific class of objects.
Moreover, the YOLOv8 segmentation model has been pre-trained on a wide variety of 80 different objects like bicycles, cars, dogs, backpacks, potted plants, fire hydrants, or traffic lights. Thus, the model weights should be well suited to differentiate between many types of objects, which is expected to lead to successful transfer learning when specializing the model to another task. The features that are automatically extracted by the neural network should be unbiased due to the significant diversity of classes involved in the pre-training process.
The F1-confidence curve provides insights into the model's ability to balance precision and recall across different confidence thresholds. In this case, an F1-score of 0.95 was achieved when the confidence threshold was set at 0.568. This score indicates a suitable balance between the model's precision and recall, suggesting significant capability in accurately identifying positive instances while keeping a low number of false detections, even under moderate confidence conditions. Table 6 displays the results of training validation through a confusion matrix. As evidenced, 87% of true instances of crosswalk A were correctly classified. Similarly, 100% of true instances of crosswalk B were classified accurately.

                         True crosswalk A    True crosswalk B
Predicted crosswalk A    0.87
Predicted crosswalk B                        1.00

DTM and Obstacles
With the employed methodology, a total area of 30,388.93 m² was obtained as a DTM for pedestrian traffic, including sidewalks and pedestrian crossings, representing 37.81% of the analyzed urban area. Additionally, from the point cloud, an area of 7655.79 m² was classified as obstacles for pedestrians, accounting for 25.19% of the traversable surface. Among these obstacles, 7472.49 m² was identified as fixed obstacles, and 183.30 m² corresponded to pedestrians, who were treated as obstacles because their HAG falls between 25 and 220 cm. Table 7 shows the number of obstacles detected on each analyzed street. Figure 8 provides a detailed view of the detected and geolocated obstacles. The grid cells comprising these obstacles have an impedance value of T = 1000 s, as walking over them is impossible. In this image, the boundaries between sidewalk-roadway and areas near buildings and walls, also classified as obstacles, can be distinguished. Moreover, multiple permanent obstacles and pedestrians are identified.

Pedestrian Segmentation
The results of our ablation studies on pedestrian segmentation can be seen in Table 8. Interestingly, the combination of raw floor distances, smoothed intensity, scan angle, and floor distances with no geometric features yields the best results, even better than considering all the mined features. The reason might be that the extra information encoded through the geometric features interferes with the information available through height, spectral, and scanning features, making the model fit worse to the data. Therefore, models dealing with pedestrian segmentation in 3D point clouds might save time and money by computing only height and smoothed features, provided that intensity and scan angle are available.
Concerning the hyperparameter tuning experiments detailed in Table 9, we hold that using 90 decision trees with a maximum depth of 25 each and balanced class weights leads to the best model. This model configuration is powerful enough to maintain high accuracy when considering the final training dataset. Moreover, it has a low standard deviation, so it is expected to yield similar results despite any randomness involved in the model training (e.g., random data splits). On top of that, it takes less than 10 min to train on the final dataset, which means it is well suited to run more experiments in less time than models using more decision trees (those take between 900 and 1900 s). While it is close in scores and standard deviation to its uniform counterpart, having a model that works with balanced class weights should be preferred because it accounts for the data imbalance inherent to the problem. Note that pedestrian segmentation implies having many objects represented in the point cloud, where only pedestrians must be labeled as positives, which unavoidably leads to a non-uniform class distribution.
The ablation experiments in Table 8 and the hyperparameter tuning in Table 9 were carried out independently. Thus, the hyperparameter tuning experiments were conducted considering all the features. Combining the results from both experiments, we have a random forest model with 90 decision trees of maximum depth 25 and balanced class weights trained considering 14 features per point. Under these circumstances, the training time is around 60.72 s, and the evaluation metrics on the final training dataset are 99.94% accuracy, 99.76% precision, 99.93% recall, 99.85% F1-score, and 99.70% MCC. Finally, a visual inspection of the results is available in Figure 9, which shows the model's classification directly in the 3D point clouds.

Better-Walkability Areas
It was possible to differentiate 16,443.88 m² of less suitable surface for pedestrian traffic due to its proximity to obstacles compared to 13,945.05 m² of surface completely free of obstacles, resulting in a 54.11% reduction in the area available for pedestrian traffic. Figure 10 shows two details of the better-walkability areas. Image 'A' displays a detail of the surface generated solely from permanent obstacles, while in image 'B', a detail of the surface incorporating both permanent obstacles and pedestrians is shown. These images illustrate the different temporal penalties for proximity to obstacles. Additionally, in image 'B', a trend is observed where pedestrians tend to walk in the center of available passage areas, indicating that the most common behavior is to walk while trying to keep as far away as possible from existing obstacles.

Pedestrian Accessibility Analysis: The Creation of Cumulative Pedestrian Travel Cost Surfaces and Model Validation Testing
Pedestrian routes were calculated from 30 student residences near the analyzed school. The routes were analyzed for each residence based on the four situations indicated in Table 5. Figure 11 displays images and details of each of the four cumulative cost surfaces for the same area. Each test generated a different cumulative cost surface and different least-cost path trajectories. This confirms that the least-cost paths follow the cells with the lowest unit displacement cost and adapt to the available space between obstacles, using the areas furthest from them, as can be seen in images G, H, I, and J.
Table 10 compares the accumulated displacement time from each residence to the educational center. As expected, the lowest times in all cases were found in test 1, where there were no pedestrian movement restrictions. Conversely, the longest travel times occurred in tests 4 and 5, which prioritized paths farthest from both obstacles and pedestrians and penalized cells closer to them with higher travel time values. The average maximum time difference between tests 1 and 4 for all studied residences was 3.88%, and between tests 1 and 5 it was 5.40%. These time differences are significant, indicating the importance of filtering pedestrians prior to pedestrian mobility studies, as performed in this work. The maximum differences between tests 4 and 1 were found at residences 25, 30, and 24, with delay percentages of 15.03%, 8.39%, and 7.92%, respectively, confirming that an excessive presence of obstacles and the need to seek pedestrian paths away from them greatly influence pedestrian travel efficiency.
Finally, Figure 12 displays the least-cost paths obtained in each test, using the previously generated cumulative cost surfaces. The trajectories varied significantly to avoid existing obstacles. Some paths, such as 2, 14, or 25, drastically altered their trajectory, even switching sidewalks, to gain more walkable space, despite having to travel a section in the direction opposite to the destination and cover a longer distance, which is consistent with the temporal results shown in Table 10. However, paths in the NE area (19, 7, and 20) barely vary their trajectory between tests because of the greater pedestrian space available, as they traverse areas designated for pedestrians for much of their route.
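The delay percentages reported above are relative increases in travel time with respect to test 1. The following minimal sketch shows that arithmetic. The function name and the travel times are hypothetical placeholders chosen for illustration; they are not values from Table 10.

```python
def delay_percent(baseline_s: float, test_s: float) -> float:
    """Relative increase of test_s over baseline_s, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline travel time must be positive")
    return (test_s - baseline_s) / baseline_s * 100.0


# Hypothetical travel times in seconds (NOT the values of Table 10).
times = {
    "residence_a": {"test1": 200.0, "test4": 230.0},
    "residence_b": {"test1": 400.0, "test4": 410.0},
}

# Delay of test 4 with respect to the unrestricted baseline (test 1).
delays = {rid: delay_percent(t["test1"], t["test4"]) for rid, t in times.items()}

# Average delay over all residences, analogous to the averages quoted in the text.
mean_delay = sum(delays.values()) / len(delays)
```

Averaging the per-residence percentages, rather than dividing total times, gives every residence the same weight regardless of route length. This is one possible reading of the "Average" row, assumed here for illustration.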

Discussion
The work carried out in this research enabled the creation of an effective method for understanding pedestrian mobility in urban areas. It accurately calculated cumulative displacement cost surfaces and identified least-cost paths from any location within the study area to a destination point.
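To make the notions of cumulative displacement cost surface and least-cost path concrete, the sketch below accumulates travel cost outward from a destination cell with Dijkstra's algorithm on a small friction raster. It illustrates the general technique under stated assumptions: the text does not name the algorithm or software used, and the 8-connected neighborhood, the averaging of friction between adjacent cells, and the function names are assumptions of this example.

```python
import heapq
import math


def cumulative_cost(friction, dest):
    """Dijkstra from dest over an 8-connected grid.

    friction[r][c] is the unit displacement cost of a cell (seconds per cell).
    Returns the cumulative cost surface and a parent map pointing toward dest.
    """
    rows, cols = len(friction), len(friction[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    parent = {}
    cost[dest[0]][dest[1]] = 0.0
    heap = [(0.0, dest)]
    while heap:
        c, (r, col) = heapq.heappop(heap)
        if c > cost[r][col]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, col + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Step length is 1 for orthogonal moves and sqrt(2) for diagonal ones;
                # the move cost averages the friction of both cells (an assumption).
                step = math.hypot(dr, dc)
                new_c = c + step * (friction[r][col] + friction[nr][nc]) / 2.0
                if new_c < cost[nr][nc]:
                    cost[nr][nc] = new_c
                    parent[(nr, nc)] = (r, col)
                    heapq.heappush(heap, (new_c, (nr, nc)))
    return cost, parent


def least_cost_path(parent, origin, dest):
    """Follow parent links from origin back to dest."""
    path = [origin]
    while path[-1] != dest:
        path.append(parent[path[-1]])
    return path


# Toy 3 x 3 raster: the central cell is an obstacle with a unit cost of 1000 s.
friction = [
    [1.0, 1.0, 1.0],
    [1.0, 1000.0, 1.0],
    [1.0, 1.0, 1.0],
]
dest, origin = (0, 0), (2, 2)
cost, parent = cumulative_cost(friction, dest)
path = least_cost_path(parent, origin, dest)
```

In this toy raster the path from the opposite corner goes around the central obstacle instead of crossing it, because the 1000 s cell makes the direct diagonal far more expensive than the detour.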
The proposed method differs from the others mentioned in Section 1 because it uses MLS point clouds as one of the initial data sources, which allows a very extensive analysis area to be covered in a short time while obtaining high-quality information, both geometrically and radiometrically. Acquiring this type of data is costly, and processing these point clouds to extract relevant information for accessibility analysis requires complex operations. However, the result was a study on raster surfaces of very high spatial resolution, which provided street geometry with a level of detail impossible to achieve with any other starting data. This information allowed for the generation of displacement cost surfaces with a GSD of 10 cm/pixel, producing street information practically at the level of individual tiles and enabling clear differentiation of potential obstacles or areas with greater difficulty for walking.
Similarly, the pedestrian crossing segmenter, based on computer vision methods, performed effectively: it detected all existing pedestrian crossings in the study area without any false positives and achieved precision-recall and F1-confidence values similar to those of recent studies specifically dedicated to pedestrian crossing detection using computer vision, such as Kaya et al. (2023) [36]. All pedestrian crossings in the study area were adequately segmented, accurately delineating the available passage areas without losing space for potential pedestrian routes. This technique significantly improves results compared to other methods, such as Luaces et al. (2020) [37], because the segmentation of each pedestrian crossing is much more precise, being based on pixel identification rather than on geometric patterns. As the segmenter is fed with new images of pedestrian crossings with similar characteristics, it should gain precision, resulting in an increasingly accurate model over time.
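For reference, the precision, recall, and F1 metrics mentioned above follow from counts of true positives, false positives, and false negatives. The sketch below applies the standard definitions. The counts are hypothetical and are not the evaluation results of this study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=19, fp=1, fn=1)
```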
Concerning pedestrian segmentation in the 3D point clouds, we found that a simple combination of height features and point-wise smoothing filters on the intensity, scan angle, and height features of a neighborhood can be enough to detect pedestrians successfully. By including the execution time in our hyperparameter tuning experiments, we were able to design a more sustainable machine learning model that trains on a big dataset in 74% less time than four-times-bigger ensembles yet provides similar accuracy. Despite the validation paradox inherent to the active learning paradigm, we think that our model has shown good generalization capabilities due to the high mean accuracy (99.89%) and low accuracy variance (0.003%) measured in our experiments. However, even if we needed to train the model on two or three times more data to improve its generalization, we could do it with a few hours of human work thanks to the entropy-guided labeling method.
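The idea behind entropy-guided labeling can be sketched as follows: the points whose predicted class probabilities have the highest Shannon entropy are the ones the model is least sure about, so they are the most useful to label next. This is a generic illustration of the principle. The selection rule, the function names, and the probabilities below are assumptions, not the exact procedure used in this work.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def most_uncertain(predictions, k):
    """Indices of the k predictions with the highest entropy, most uncertain first."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: shannon_entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical (not pedestrian, pedestrian) probabilities for four points.
predictions = [
    (0.99, 0.01),
    (0.55, 0.45),
    (0.80, 0.20),
    (0.50, 0.50),
]
to_label = most_uncertain(predictions, 2)
```

Here the points at indices 3 and 1 are selected, since their probabilities are closest to an even split between the two classes.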
It is evident that obstacles impede pedestrian movement, but it is also important to consider the free space available for walking between them. This is another difference between our research and the current state of the art: importance has been given not only to the existence and location of obstacles on sidewalks but also to the available passage spaces between them. In this regard, we assigned a unit displacement cost of 1000 s to cells belonging to obstacles. This decision was based on the fact that walking over these cells is unfeasible and, therefore, least-cost path trajectories must avoid them. Similarly, we assigned the same penalty to passage spaces narrower than 50 cm relative to obstacles, building facades, and curb edges, as these are uncomfortable and even hazardous pedestrian-traffic areas. The penalty values gradually decreased from spaces larger than 80 cm, where pedestrians can walk more easily between obstacles.
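The penalty scheme described above can be sketched as a two-step computation: estimate the clearance from each cell to the nearest obstacle, then map that clearance to a time penalty. In the sketch below, the 1000 s cost, the 50 cm limit, and the 10 cm cell size come from the text. The intermediate penalty bands are hypothetical stand-ins for Table 4, and using distance to the nearest obstacle as a proxy for passage width is a simplification made for this example.

```python
from collections import deque

CELL_CM = 10        # GSD stated in the text: 10 cm per pixel
BLOCKED_S = 1000.0  # unit cost for obstacle cells and clearances under 50 cm

# Hypothetical bands: (upper clearance bound in cm, penalty in seconds per cell).
# The actual values are those of Table 4, which are not reproduced here.
PENALTY_BANDS = [(80, 5.0), (120, 2.0), (160, 1.0)]


def clearance_cm(obstacles):
    """Multi-source BFS giving the 4-connected distance (cm) to the nearest obstacle.

    obstacles[r][c] is True for obstacle cells. Cells stay None if the raster
    contains no obstacle at all.
    """
    rows, cols = len(obstacles), len(obstacles[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if obstacles[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + CELL_CM
                queue.append((nr, nc))
    return dist


def penalty_s(distance_cm):
    """Map a clearance in cm to a penalty in seconds per cell."""
    if distance_cm is None:
        return 0.0
    if distance_cm < 50:
        return BLOCKED_S
    for upper_cm, penalty in PENALTY_BANDS:
        if distance_cm < upper_cm:
            return penalty
    return 0.0


# One raster row with an obstacle in the first cell.
row = [[True] + [False] * 11]
dist = clearance_cm(row)
penalties = [penalty_s(d) for d in dist[0]]
```

A Euclidean distance transform would give a more faithful clearance than the 4-connected BFS used here; the BFS is chosen only to keep the example free of external dependencies.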
As demonstrated in the conducted tests, both the accumulated displacement cost values and the least-cost paths varied significantly when these parameters were considered (Table 10), increasing travel time by up to 15.03% in some cases. It is important to highlight the influence of the lower parts of tree canopies, which, at heights below 220 cm, also represent significant pedestrian obstacles and cause pedestrians to avoid walking in these areas, as observed in images A and B of Figure 10. Similarly, narrow streets also pose an impediment to pedestrians, who seek to avoid obstacles and even choose longer routes to circumvent narrow passage areas, as seen in least-cost paths 2, 14, and 25 of Figure 12, which alter their trajectories to avoid streets with narrow passage areas. Conversely, the least-cost paths originating from residences 19, 7, and 20 hardly vary their trajectories across the tests. This is because most of their routes pass through pedestrian streets and parks, with areas specifically designated for pedestrians, confirming that such areas significantly improve urban pedestrian mobility. Accordingly, a growing number of research efforts and initiatives focus on redistributing urban public space to promote pedestrianization (Mendzina et al., 2020; Urban Design or The City of Children by Francesco Tonucci) [38][39][40].

Conclusions
Once again, MLS data have proven to be a valuable source for conducting geomatic studies, in this case for pedestrian mobility analysis in a purely urban area of the city of A Coruña. Leveraging high-density and highly accurate point clouds has enabled us to conduct a mobility study with very high precision, using raster surfaces with a GSD of 10 cm/pixel. This allowed obstacles and other impediments to free pedestrian movement to be clearly identified. Similarly, AI-based computer vision techniques have enabled us to obtain a surface composed of sidewalk areas and pedestrian crossings, precisely delineating the areas of free pedestrian movement and the better-walkability areas between existing obstacles. The combined analysis of all these factors resulted in the generation of multiple cumulative displacement cost surfaces, which made it possible to quantify the time required to walk from any point in the study area to a destination point and to identify the optimal path for pedestrian movement.
The results obtained once again demonstrate that pedestrian-specific areas and wider streets with open spaces significantly improve pedestrian mobility in cities. For the study area in this work, it has been quantified that travel times vary on average by 3.88% and are delayed by up to 15.03% in areas with a high presence of obstacles or narrow passage spaces.
The main limitation of this study lies in the high cost associated with acquiring MLS point cloud data and processing these point clouds to extract relevant information for accessibility analysis. Additionally, the time and complexity required to perform these operations are significant. This could limit the replicability of the study in other cities that may not have the financial or technical resources required to carry out a similar data acquisition and processing process.
However, regarding replicability in other cities, the approach and methodology used in this study could be adapted and applied in different urban contexts. The use of MLS point clouds as an initial data source for pedestrian mobility analysis, along with AI-based computer vision techniques, provides a robust framework for conducting similar studies in other urban areas. Although specific resources may vary from city to city, the general approach of using high-quality data and advanced analysis techniques can be successfully applied in different urban environments to understand and improve pedestrian mobility. In addition, our proposed methodology allows cities that already have LiDAR data of their streets, usually acquired to make inventories of urban elements, to get more out of those data.
As a future line of research, we intend to extend this work to broader urban environments within the city. This work also opens up another associated line of research: a specific mobility study for wheelchair users or people with reduced mobility due to age, physical impairments, or specific circumstances (e.g., walking with small children or pushing a shopping cart). In this case, obstacle restrictions should be extended to a height above ground (HAG) of 5 cm or even lower, considering the difficulty in overcoming these height differences. Similarly, other new studies are possible, such as creating an application capable of identifying and managing school routes based on the location of students' residences for each school and the application of the model described in this study.
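The effect of lowering the obstacle threshold can be illustrated with a simple HAG filter. The 25 to 220 cm range is the one used for obstacles in this study, and 5 cm is the lower bound suggested above for reduced-mobility scenarios. The function name, the inclusive bounds, and the sample heights are assumptions of this sketch.

```python
def is_obstacle(hag_cm: float, min_hag_cm: float = 25.0, max_hag_cm: float = 220.0) -> bool:
    """True if a point's height above ground falls in the obstacle range (bounds inclusive)."""
    return min_hag_cm <= hag_cm <= max_hag_cm


# Hypothetical HAG values in cm for five points.
points_hag_cm = [2.0, 8.0, 30.0, 150.0, 300.0]

# Range used for obstacles in this study (25 to 220 cm).
default_obstacles = [h for h in points_hag_cm if is_obstacle(h)]

# Lower bound reduced to 5 cm for the reduced-mobility scenario.
reduced_mobility_obstacles = [h for h in points_hag_cm if is_obstacle(h, min_hag_cm=5.0)]
```

With the lower bound at 5 cm, the 8 cm point (for example, a small step or curb) becomes an obstacle, while it is ignored under the default range.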

Figure 1. Study area and distribution of the 10 streets analyzed. The green polygon shows the total study area (173,153.40 m²). The yellow polygon shows the streets studied (80,363.06 m²), and the black lines are the road axes of the streets studied. Source: self-made.

Figure 2. Phases used to generate the pedestrian accessibility model in urban areas. Source: self-made.

Figure 3. Examples of the two types of labeled pedestrian crosswalks. The violet polygon indicates crosswalk type A and the red polygon indicates crosswalk type B. Source: self-made.
Further information of these parameters is provided in Quintana et al. (2016), Arya et al. (2020), Pham et al. (

Figure 6. Segmented crosswalks, detail. Red pixels show segmented crosswalks and green bounding boxes show crosswalk ground truth. Source: self-made.

Figure 7. Left image: mask precision-recall curve for object segmentation. Identification value for class A: 0.904, for class B: 0.995, and for all classes: 0.949 (mean average precision, mAP, at a threshold of 0.5). Right image: F1-confidence curve, showing an identification value for all classes of 0.95 at a confidence threshold of 0.568. Source: self-made.

Figure 8. Detail of obstacle surface. Examples of detected obstacles include streetlights and trees. Red pixels show permanent obstacles and blue pixels show pedestrians. Source: self-made.

Figure 9. The segmented pedestrians visualized in the 3D point cloud. Purple means the point is not classified as a pedestrian; yellow means it is. Source: self-made.

Figure 10. Detail of obstacle friction surface. Image (A) shows the better-walkability area surface including only obstacles. Image (B) shows the better-walkability area surface including obstacles and pedestrians. Source: self-made.

Figure 11. Figures (A,C,E,G,I) show the least-cost paths over cumulative cost surfaces. Figures (B,D,F,H,J) show the least-cost paths over friction surfaces and obstacles. Figures (A,B): result of test 1, considering only the slope surface. Figures (C,D): result of test 2, considering permanent obstacles. Figures (E,F): result of test 3, considering permanent obstacles and pedestrians. Figures (G,H): result of test 4, considering permanent obstacles and better walking areas around them. Figures (I,J): result of test 5, considering permanent obstacles, pedestrians, and better walking areas around them. Source: self-made.

Figure 12. Least-cost paths resulting from tests 1 to 4. The numbers indicate the IDs of the 30 residences analyzed as origin points. In image (d), the areas with the most significant changes in the routes are indicated by arrows and polygons. Source: self-made.

Table 1. Characterization of the streets that make up the study area. Avg St Width is the average width of the street; Avg SW is the average combined width of the two sidewalks of the street (the average width of a single sidewalk is half of this value); Avg SW (%) indicates the sidewalk vs. street ratio as a percentage; Zmin and Zmax indicate the minimum and maximum altitude values of the street; Avg slope is the average slope of the street; Average indicates the average values for the set of all streets.

Table 2. Metrics of the dataset generated through Roboflow.

Table 3. Values of hyperparameters used to train the YOLOv8 model.

Table 4. Penalty times based on distance to obstacles. In this analysis, the areas closest to obstacles are considered part of them, so that least-cost paths move away from obstacles and follow more comfortable pedestrian routes.

• Test 1: A study applying only a friction surface based on terrain slope values.
• Test 2: In addition to a friction surface based on slopes, another surface based on obstacles derived from the point cloud analysis with HAG values between 25 and 220 cm was used. Spaces with a passage width of less than 50 cm from building walls and sidewalk curbs were considered obstacles. Cells containing these elements were assigned a unit displacement cost of 1000 s.
• Test 3: To the two previously mentioned friction surfaces, another one representing pedestrians was added, considering them as obstacles as well. A unit displacement cost of 1000 s per cell was assigned to them.
• Test 4: To the friction surface of test 2, another one of better-walkability areas concerning obstacles was added, with the temporal penalties described in Table 4.
• Test 5:

Table 5. Summary of the friction surfaces used in each of the pedestrian accessibility tests conducted.

Table 7. Measurement of obstacle surfaces identified by street. St A, SW A, Obst A, and Ped A show the areas of total streets, sidewalks, obstacles, and pedestrians. Obst/SW represents the ratio of obstacles to sidewalk, and Ped/Obst shows the ratio of pedestrians to obstacles and pedestrians (Ped A/(Ped A + Obst A) × 100). Average and Total show the average and total surface values for all analyzed streets.

Table 8. The precision, recall, F1-score, and Matthews correlation coefficient (MCC) for the different configurations defining the ablation studies. In this table, r ≥ x means spherical neighborhoods with radii greater than or equal to x. The best result is represented in bold text.

Table 9. The results from the grid-search-based hyperparameter tuning. The initial accuracies were measured on the initial budget (i.e., the training dataset at the first iteration) and the final accuracies on the final budget (i.e., the training dataset at the final iteration). The mean execution time of training the model is measured considering the dataset at the final iteration because it contains the most samples. The selected model after the hyperparameter analysis is represented in bold.

Table 10. Travel time walking from each home to the educational center in tests 1 to 5. MD (maximum difference) shows the time difference, in seconds and as a percentage, between test 4 and test 1 and between test 5 and test 1. Average shows the average values obtained in all tests for the 30 cases analyzed.