A Geometric Classification of World Urban Road Networks

Abstract: This article presents a method to uncover universal patterns and similarities in the urban road networks of the 80 most populated cities in the world. To that end, we used degree distribution, link length distribution, and intersection angle distribution as topological and geometric properties of road networks. Moreover, we used ISOMAP, a nonlinear dimension reduction technique, to better express variations across cities, and we used K-means to cluster cities. Overall, we uncovered one universal pattern between the number of nodes and links across all cities and identified five classes of cities. Gridiron Cities tend to have many 90° angles. Long Link Cities have a disproportionately high number of long links and include mostly Chinese cities that developed towards the end of the 20th century. Organic Cities tend to have short links and more angles other than 90° and 180°; they also include relatively more historical cities. Hybrid Cities tend to have both short and long links; they include cities that evolved both historically and recently. Finally, Mixed Cities exhibit features from all other classes. These findings can help transport planners and policymakers identify peer cities that share similar characteristics and use their characteristics to craft tailored transport policies.


Introduction
Throughout history, cities have adapted to address the challenges they face [1]. In the early twenty-first century, urbanization, climate change, and resource constraints are some of the stressors that are forcing cities worldwide to reshape their urban infrastructure, often leveraging advances in technology [2]. Among the many urban infrastructure systems, transport infrastructure plays a critical role in enabling a city to function properly. In many cities, transport infrastructure (both the physical infrastructure and infrastructure services) that has evolved, sometimes over centuries or millennia, now needs to accommodate new urban travel dynamics and to address many challenges, including traffic congestion and pollution related to transport activities. For example, the proliferation of ride-hailing and micromobility apps such as Lyft, Uber, Grab, and dockless scooter-share makes it easier to travel to places that are not serviced by public transport. Further, micromobility and ridesharing options (i.e., rides shared by multiple people) can reduce the cost of travel, making them attractive for short-distance travel [3]. More generally, smart city initiatives are being widely deployed to address transport challenges. Smart city initiatives consist of policies aimed at managing and/or solving challenges, usually in a sustainable manner.
The core element of a smart city is leveraging technology, often thanks to the proliferation of large amounts of data that have become available, colloquially referred to as Big Data. In addition to improving how a transport system is operated, the influx of Big Data also provides an opportunity to better understand the current state of transport systems, notably through the definition and measurement of indicators that characterize them. Indeed, every city has evolved over different timeframes and different time periods to address different challenges. Moreover, while some cities have grown organically (in the absence of a central force guiding urban network design) [4], others have been deliberately planned [5]. For example, a compendium of laws called the "Laws of Indies" was used by Spanish settlers in designing their urban settlements [6]. Through these processes, whether organic, deliberately planned, or a mix of both, cities evolved with certain features unique to each. Gaining an understanding of the properties of transport systems can provide us with the ability to craft more effective solutions to address contemporary challenges. This study offers a method that leverages the availability of open-source maps and software to uncover the existence of universal patterns and similarities in urban road networks.
In this article, we focus on the spatial structure of transport road networks because their physical structure can play an important role in their overall performance [7] and in how cities are shaped [8,9]. Moreover, road networks have been studied to understand travel behavior [10–12], travel dynamics [13,14], public health [15,16], accessibility [17–19], and resilience [20–23], to name a few. More broadly, the physical structure of urban road networks can influence the performance of many different aspects of a city [24]. For instance, an interruption to any road link causes a connectivity loss; however, the degree of damage to the network varies depending on the properties of the interrupted links. In addition, road network properties can be used to identify the vulnerabilities in the networks, and to plan or improve their resilience [25–27]. Furthermore, characterizing and studying road networks across different cities can provide insights to further our understanding of urbanization and transport planning [28].
The main goal of this article is to propose a method that combines topological and geometric information to study and classify road network designs. The method was applied to the 80 most populated cities in the world. As the topological property, we used the degree distribution of the network. As geometric properties, we used both the distribution of link lengths and the distribution of street angles since they capture different properties of road networks. Combining these three pieces of information, we first sought the presence of universal patterns that exist in all networks. Second, we characterized road networks by discretizing the distributions of their properties, resulting in every urban road network being expressed by a single vector of variables. After applying ISOMAP (a dimensionality reduction technique), a clustering technique was employed to identify different classes of cities. Finally, we discuss the results and detail the characteristics of the classes found.

Network Science
Network Science is a discipline devoted to the study of networks [29,30] that has been heavily applied to study systems of all kinds [31,32], and in particular complex systems such as road systems [13,24,33–43]. Urban road networks are generally modelled as spatial and planar networks [44]. A spatial network is a network embedded within a 2D or 3D space and characterized by Euclidean geometry [45]. There has been a growing interest in studying urban roads as complex networks, partly fueled by advances in geographic information systems (GISs) and other new sources of information [44–46]. Past studies have shown, on a coarse-grained level, the existence of similarities among different cities [44]. Characterizing urban road networks allows us to compare different cities and sometimes gain insights into their evolution and functional properties [47,48]. For example, considering road systems as networks, properties such as topological patterns, network efficiency, and network robustness can be studied and compared across cities. To find an accurate classification of cities based on road network properties, both topological and geometric properties should be considered [48]. Most past studies have studied single measures or a combination of multiple measures across different cities. For example, Boeing [49] studied the circuity of 40 different cities across the United States (US) and found that for most cities walking routes were less circuitous than driving routes. Similarly, Boeing [36] examined the network orientation and entropy of 100 cities; specifically, the study considered the networks' bearings, entropy, and circuity to classify cities. The study used a hierarchical clustering technique to classify cities and found three high-level and eight intermediate-level clusters.
Urban Sci. 2022, 6, 11
In this article, each road system is represented as a network composed of nodes and links [50]. The nodes are the intersections formed by road segments and dead ends, and the links are the road segments themselves that connect the nodes. To quantitatively characterize the topological properties of road networks, various measures have been used in the literature, including network centrality measures such as betweenness centrality [13,27,34,37,51,52], closeness centrality [13,50,53–55], degree centrality [13,45,56–59], and clustering coefficient [35]. Each measure can be used to capture a certain property of the road networks. For example, betweenness centrality captures the ability to provide a path between regions in a network, and the betweenness centrality of a node is defined as the proportion of the network's shortest paths going through the node [60]. Additional measures have been proposed and used by researchers. For instance, a new measure based on the block's shape called "shape factor" was employed in [48] for the purpose of classifying cities. In this study, we used degree distribution (defined later) as the topological property of the network.
Beyond their topological properties, road systems also possess geometric properties. The most common geometric measure of road systems is link length [56]. For instance, Jiang [56] studied the transport networks of 40 cities in the US and concluded that the networks are not random and that they exhibit both small-world and scale-free properties. This conclusion was based on analyzing the degree distribution, path length, and clustering coefficient of the network of each city. Buhl et al. [47] considered both the geometric and topological elements of a road network to study its robustness and efficiency; in their study, the geometric element of the network considered was shortest path length. Albeit less common, the angles created by intersections are another geometric measure used to study the geometry of road networks [24,61,62].
This study proposes a novel method that utilizes both topological and geometric properties of road networks to cluster cities with similar characteristics.

Data
This study utilizes the public data made available by Karduni et al. [63], who developed the open-source tool GIS Features 2 Edgelist (GISF2E) that converts any shapefile into a network. Further, Karduni et al. [63] applied the code to the road networks of 80 of the world's most populated cities from OpenStreetMap (see a map in the Results section).
The extracted information includes the nodes and their geographical locations, and the length of each road segment (i.e., link length). The locations are in the form of Universal Transverse Mercator (UTM) coordinates and the link lengths are in meters. Additionally, information includes which nodes are connected by each link in the network. For example, if a network has a total of 100 links, the data will have 100 observations with each observation showing two nodes that are connected by that link, the nodes' latitude and longitude coordinates, and the link's length in meters. This information is enough to create a network.
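As a minimal sketch of how such an edge list can be turned into a network, the snippet below builds an undirected adjacency structure from rows of the form (node, node, length). The node identifiers, lengths, and the function name `build_adjacency` are hypothetical and chosen only for illustration; they do not reflect the actual GISF2E output columns.

```python
from collections import defaultdict

# Hypothetical edge-list rows: (node_u, node_v, length_m). The node ids and
# lengths are made up for illustration and are not taken from the dataset.
edges = [
    ("a", "b", 120.0),
    ("b", "c", 80.5),
    ("b", "d", 95.0),
    ("c", "d", 60.0),
]

def build_adjacency(edge_rows):
    """Return {node: {neighbor: link_length}} for an undirected network."""
    adjacency = defaultdict(dict)
    for u, v, length in edge_rows:
        adjacency[u][v] = length
        adjacency[v][u] = length  # road links are treated as undirected here
    return dict(adjacency)

adjacency = build_adjacency(edges)
num_nodes = len(adjacency)
# Each undirected link is stored twice, once per endpoint.
num_links = sum(len(nbrs) for nbrs in adjacency.values()) // 2
```

One observation per link, as described above, is therefore sufficient to recover both the node set and the link set.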

Methodology
This study considers two main aspects of road systems: network topology and network geometry. Network topology captures the relationships between nodes through their links regardless of scale; as such, it is mainly concerned with network connectivity. In contrast, network geometry is defined as the shape and magnitude of the connections between the nodes. One way to express network topology is by degree distribution. The degree distribution of an urban road network represents the distribution of its connections across the nodes. In this study, we only considered degree distribution as a measure of topological information since it is computationally easier to calculate (in contrast to other measures such as the distribution of betweenness centrality that tend to be highly correlated to degree distribution). To capture the network's geometry, we considered link length and intersection angle. Link length is defined as the distance between two intersections (nodes). We used the intersection angle to identify whether the streets follow a planned pattern (e.g., a grid) or a more organic pattern. The three distributions used in this study are defined in this section as follows.

Degree Distribution
In the context of urban road networks, a graph G is defined as a set of N nodes (intersections) connected by L links (road segments). For a graph G with N nodes and L links, the degree $k_i$ of node $i$ is the number of links connected to it:

$$k_i = \sum_{j=1}^{N} A_{ij}$$

where $A_{ij}$ is the adjacency matrix, in which the rows and columns represent the nodes in the network. The element $(i, j)$ is 1 if node $i$ is connected to node $j$, and 0 if they are not connected. The degree distribution $P(k)$, i.e., the proportion of nodes with $k$ connections, is then defined as:

$$P(k) = \frac{N(k)}{N}$$

where $N(k)$ is the total number of nodes with $k$ connections and $N$ is the total number of nodes in the network. Across all 80 cities, node degrees ranged from 1 (i.e., a dead end) to 16. Nonetheless, 99.7% of the nodes have five or fewer connections.
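The definition of P(k) can be illustrated with a short sketch that counts node degrees directly from an undirected edge list, which is equivalent to summing the rows of the adjacency matrix. The toy network and the function name `degree_distribution` are assumptions made for this example only.

```python
from collections import Counter

def degree_distribution(edge_pairs):
    """Return {k: P(k)} with P(k) = N(k) / N for an undirected edge list."""
    degree = Counter()
    for u, v in edge_pairs:
        degree[u] += 1
        degree[v] += 1
    n_nodes = len(degree)
    counts = Counter(degree.values())  # N(k): number of nodes with degree k
    return {k: counts[k] / n_nodes for k in sorted(counts)}

# Hypothetical toy network: node "b" is a three-way intersection and
# node "a" is a dead end.
toy_edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")]
p_k = degree_distribution(toy_edges)
```

In this toy network, one node has degree 1, two have degree 2, and one has degree 3, so the distribution sums to one over k = 1, 2, 3.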

Link Length Distribution
The link length distribution, P(l), represents the probability that a road segment has a length l. Link length distribution, however, is a continuous distribution, whereas we seek discrete values for clustering. A novelty of this work is that we fix different length categories (i.e., threshold values) based on the percentiles and the maximum value of the link length data of all 80 cities combined. Specifically, we considered the percentile categories of 10 to 90 with increments of 10. Once threshold percentile values from the 80 cities combined were found, we could construct an empirical link length distribution for every city individually by counting how many links belong to each bin. The bin limits are: length less than the 10th percentile from the 80 cities combined, between the 10th and 20th percentile, . . . , greater than the 90th percentile. As a result, the link length distribution of each city is turned into a discrete distribution of 10 values. Table 1 shows the threshold percentiles measured for link length distribution from the 80 cities combined.
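The discretization described above can be sketched as follows: thresholds are taken from the pooled link lengths of all cities, and each city's links are then counted per bin. The two cities and their link lengths are made up, and the linear-interpolation percentile used here is an assumption, since the text does not state which percentile definition was used.

```python
from bisect import bisect_right

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def pooled_thresholds(lengths_by_city):
    """10th..90th percentile thresholds from all cities' link lengths combined."""
    pooled = sorted(x for lengths in lengths_by_city.values() for x in lengths)
    return [percentile(pooled, q) for q in range(10, 100, 10)]

def length_distribution(lengths, thresholds):
    """Share of a city's links in each of the 10 pooled-percentile bins."""
    counts = [0] * (len(thresholds) + 1)
    for x in lengths:
        counts[bisect_right(thresholds, x)] += 1
    return [c / len(lengths) for c in counts]

# Hypothetical link lengths in meters for two made-up cities.
lengths_by_city = {
    "city_a": [20, 35, 50, 60, 75, 90, 110, 150, 200, 400],
    "city_b": [150, 220, 300, 450, 600, 800, 1000, 1500, 2000, 3000],
}
thresholds = pooled_thresholds(lengths_by_city)
vectors = {c: length_distribution(v, thresholds) for c, v in lengths_by_city.items()}
```

Because the thresholds are shared across cities, a city with unusually long links (here `city_b`) places more of its mass in the upper bins, which is what makes the resulting 10-value vectors comparable between cities.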

Intersection Angle Distribution
The street intersection angle is the angle created when two road segments intersect. We calculated this angle for each node that is connected to other nodes. For example, as shown in Figure 1, if a node $i$ is directly connected to four other nodes, $j_1$ to $j_4$, then it creates four angles that sum to 360°. For a given node $i$ and its connected nodes $j$, each angle is calculated for node $i$ and two nodes, e.g., $j_1$ and $j_2$, that are adjacent to each other. As an example, consider the triangle formed by $j_1$, $i$, and $j_2$ to calculate the angle $A$ shown in Figure 1. Since the latitudes and longitudes of the nodes $i$, $j_1$, and $j_2$ and the link lengths of $i \to j_1$ and $i \to j_2$ are known, we can calculate the distance between $j_1$ and $j_2$ using the Haversine formula [64]. The angle is then calculated using the cosine rule:

$$A = \arccos\left(\frac{b^2 + c^2 - a^2}{2bc}\right)$$

where $a$, $b$, and $c$ are the sides of the triangle, and $A$ is the angle made by sides $b$ and $c$. We repeated this procedure for all nodes. Similar to the link length distribution, the continuous distribution of street angles was discretized. Instead of using percentiles, a 20° bin size was selected, i.e., 0 to 20°, 20 to 40°, and so on, providing us with the probability of observing angles in each bin. For example, the probability of observing a street angle between 80 and 100° in a city might be 0.27. Overall, we have a vector with 18 values for each city, since there are 18 bins over the range 0 to 360°. Finally, each city was expressed by a vector with 44 values (16 for degrees, 10 for link lengths, and 18 for street angles).
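A minimal sketch of the two steps above, the cosine rule and the 20° binning, is given below. The side lengths are hypothetical, and the function names are illustrative. Note that the cosine rule alone returns values between 0 and 180°; how angles above 180° are obtained is not stated in the text, so this sketch does not attempt it.

```python
import math

def angle_from_sides(a, b, c):
    """Angle A (degrees) opposite side a, between sides b and c (cosine rule)."""
    cos_a = (b * b + c * c - a * a) / (2.0 * b * c)
    cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_a))

def angle_histogram(angles_deg, bin_size=20.0, max_angle=360.0):
    """Share of angles in each bin [0, 20), [20, 40), ... up to 360 degrees."""
    n_bins = int(max_angle / bin_size)
    counts = [0] * n_bins
    for angle in angles_deg:
        index = min(int(angle // bin_size), n_bins - 1)  # put 360 in the last bin
        counts[index] += 1
    return [c / len(angles_deg) for c in counts]

# Hypothetical triangle: links i->j1 and i->j2 of 30 m and 40 m, with j1 and j2
# 50 m apart, i.e., a 3-4-5 right triangle, so the angle at i is 90 degrees.
right_angle = angle_from_sides(a=50.0, b=30.0, c=40.0)
hist = angle_histogram([right_angle, 90.0, 180.0, 45.0])
```

With these four example angles, half of the mass falls in the 80 to 100° bin, which is the bin that dominates in grid-like street layouts.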

Clustering
We employed unsupervised machine learning methods to cluster cities based on their degree, link length, and street angle distributions. Before that, however, we noted that clustering entities based on 44 values (i.e., the length of the vector measured for each city) representing three features of geometric and topological design is not necessarily meaningful. Most clustering techniques rely on Euclidean distance measures to identify clusters, and as the dimension of the data increases, the distances between data points in high-dimensional space become more or less uniform [65]. Therefore, to better express the differences between cities, it is often preferable to reduce the dimensionality of the vector used while preserving the inherent features of the road networks. To do so, we first transformed the raw data using ISOMAP [66], which is a nonlinear dimensionality reduction technique. ISOMAP was selected thanks to its ability to preserve the variation in the original data in the transformed data. Additionally, since the data are probability distributions, any nonlinear behavior would not be captured by linear dimensionality reduction techniques such as Principal Component Analysis (PCA).

The ISOMAP algorithm first finds the data points that are close to each other (neighbors) based on the distance $d(i, j)$ between data points $i$ and $j$. Two methods are usually used to find the neighbors: (1) k nearest neighbors (knn) or (2) a fixed radius. Both require a user-defined parameter: knn requires the number of data points to treat as neighbors (5 was selected), and fixed radius requires a pre-determined radius ε within which all data points are considered neighbors. The algorithm then forms a graph $H$ in which all data points that are identified as neighbors are connected to each other. The data points in the graph $H$ are a subset of the whole data; the subset space is considered to be Euclidean. In the second step, the algorithm calculates the shortest path distance $d_H(i, j)$ in graph $H$ between all pairs of data points. The idea is to break down the whole dataset into multiple subsets so that the geodesic distance inherent to the data projected onto an unknown manifold space is preserved. In the algorithm's context, the geodesic distance is defined as the shortest distance between two points along the manifold space, which is locally Euclidean. For example, suppose the whole dataset has been broken down into 20 subsets (graphs), such that $H = \{1, 2, 3, \ldots, 20\}$. The shortest path between two points, say, one in subset 1 and another in subset 20, would be the summation of all shortest paths observed in the subsets $H$. This yields a square matrix $D_H = \{d_H(i, j)\}$. Finally, the algorithm constructs a low-dimensional Euclidean space $Y$, for instance a 2-D space, in which to embed the data while preserving the distances $D_H$ of the original data.
For each point $i$, the coordinates $c_i = (x_i, y_i)$ in the 2-D space $Y$ were chosen so as to minimize the following cost function [66]:

$$E = \lVert \tau(D_H) - \tau(D_Y) \rVert_{L^2}$$

where $D_Y = \{d_Y(i, j)\}$ is the Euclidean distance matrix between all pairs of data points in space $Y$, $\lVert \cdot \rVert_{L^2}$ is the $L^2$ norm of the matrix, and $\tau$ is the operator that converts the distances to inner products for efficient optimization. Using ISOMAP, the raw data, in which each city is a 44-dimensional vector, can be transformed into two-dimensional vectors.
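The three ISOMAP steps can be sketched with NumPy as below: a k-nearest-neighbor graph, all-pairs shortest paths, and an embedding obtained from the eigenvectors of the double-centered squared distance matrix (the τ operator). This is a minimal illustration rather than the implementation used in the study; the function name `isomap` and the helix-shaped demo data are assumptions, and only the choices of 5 neighbors and 2 output dimensions come from the text.

```python
import numpy as np

def isomap(X, n_neighbors=5, n_components=2):
    """Minimal ISOMAP: kNN graph -> shortest paths -> classical MDS."""
    n = X.shape[0]
    # Pairwise Euclidean distances d(i, j) in the original space.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Step 1: neighborhood graph H (inf marks "not directly connected").
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    for i in range(n):
        nbrs = np.argsort(d[i])[1:n_neighbors + 1]  # skip the point itself
        graph[i, nbrs] = d[i, nbrs]
        graph[nbrs, i] = d[i, nbrs]  # keep the graph symmetric
    # Step 2: shortest-path distances d_H(i, j) via Floyd-Warshall.
    for k in range(n):
        graph = np.minimum(graph, graph[:, [k]] + graph[[k], :])
    if np.isinf(graph).any():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")
    # Step 3: tau operator, i.e., double-centering of the squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (graph ** 2) @ centering
    # The top eigenvectors of tau(D_H) give the low-dimensional coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

# Hypothetical demo data: 40 points along a helix in 3-D (not city vectors).
t = np.linspace(0.0, 3.0, 40)
X_demo = np.column_stack([np.cos(t), np.sin(t), t])
Y_demo = isomap(X_demo, n_neighbors=5, n_components=2)
```

For the helix, the first embedded coordinate should vary almost linearly with the position along the curve, which is the sense in which the geodesic distances are preserved.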
We then compared the performance of several clustering techniques on the resulting two-dimensional vectors, including K-means [67], spectral clustering [68], hierarchical clustering [69], and HDBSCAN [70], and assessed them based on their silhouette score [71] (see Table 2). All these clustering algorithms work based on the Euclidean distance between data points. Based on the silhouette scores calculated (presented later in the results section) for the different clustering techniques, K-means clustering was selected as the preferred method. K-means clustering is an iterative procedure that essentially minimizes the distance between a cluster's centroid and its data points. The centroid of a cluster is simply the coordinate of the center of the cluster.
First, given $k$ clusters with centroids $C = \{c_1, c_2, \ldots, c_k\}$, each centroid $c_i \in C$ is arbitrarily chosen, and each data point $x$ is assigned to the cluster $i$ whose centroid $c_i$ is closest, based on the minimization of the function

$$\phi = \sum_{x \in X} \min_{c \in C} \lVert x - c \rVert^2$$

where $X$ is the set of all data points. At the end of the first iteration, the centroids are updated by taking the mean of all data points in each cluster:

$$c_i = \frac{1}{|S_i|} \sum_{x \in S_i} x$$

where $S_i$ denotes the set of data points currently assigned to cluster $i$. The procedure was then repeated, assigning data points to the closest centroid. The centroids were then updated again, and so on, until the cluster assignments no longer changed.
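The iterative procedure described above corresponds to what is commonly called Lloyd's algorithm, sketched below for 2-D points. The points, the seed, and the function name `kmeans` are hypothetical; initial centroids are sampled from the data, which is one possible reading of "arbitrarily chosen".

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on a list of 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # arbitrary initial centroids
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                        + (p[1] - centroids[i][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, k=2)
```

In the study, the input points would be the two-dimensional ISOMAP coordinates of the cities rather than these toy values.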

Universal Pattern
A preliminary analysis reveals that the numbers of nodes and links for all 80 cities follow a linear relationship of the form L = 1.33 · N + 2907. Figure 2 shows the linear fit, which has an R² value of 0.99. This result suggests the presence of a universal mechanism that directs road network growth. In other words, this relationship shows that there are on average about 1.33 links per node (equivalently, an average node degree of about 2.66, since each link joins two nodes), and this trend is found to hold for all cities regardless of when the road network was developed.
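A linear relationship of this kind can be estimated with ordinary least squares, as in the sketch below. The node counts are hypothetical and the link counts are generated from the reported line itself, so the example only demonstrates the fitting procedure and does not reproduce the study's data.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical node counts; link counts are generated from the reported line
# L = 1.33 N + 2907, so the fit recovers it up to rounding.
nodes = [20_000, 50_000, 120_000, 300_000]
links = [1.33 * n + 2907 for n in nodes]
slope, intercept, r2 = linear_fit(nodes, links)
```

The slope is the quantity interpreted in the text as the number of links per node.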
Moreover, unlike many urban properties that follow sublinear and super-linear scaling laws [72], this ratio of links to nodes stays constant regardless of the size of the network. In other words, a large network follows the same pattern as a small network, likely due to the spatial feature of road systems.

Road Network Analysis
Figure 3 shows the three distributions for five representative cities from each class (discussed later). Starting with degree distribution, the most frequent degree across all cities is 3, i.e., a T-shaped intersection. This observation is in line with the results shown by Lee and Jung [73]. The authors analyzed the street patterns in 22 Korean cities and found that streets with three connections were more frequent than streets with four connections. In our study, the only exception is Buenos Aires, where many intersections also have a degree of 4, which is a manifestation of heavy deliberate planning, strictly following a grid pattern for streets.
The link length distributions show that some cities have more long links than short links, while other cities, in contrast, have more short links than long links. Looking at Buenos Aires, we observe more link lengths that are greater than the 50th percentile and less than the 80th percentile compared with other categories.
From the street intersection angle distribution, we found that all cities exhibit a similar pattern. The distributions are clearly bimodal, with peaks occurring at angles of 90° and 180°. Looking at the street angle distribution, Buenos Aires again has a higher percentage of 90° angles compared with other cities since it has more intersections of degree 4 (i.e., four 90° angles) compared with all other cities, which have more intersections of degree 3 (i.e., two 90° angles and one 180° angle). The distributions of all 80 cities studied are shown in the Supplementary Materials.
As described above, we tested several clustering algorithms and used the silhouette score to assess their performance (see Table 2). We adopted a two-step process. In the first step, we tested the performance of four clustering techniques, namely, K-Means, Spectral K-Means, Hierarchical, and HDBSCAN. The best clustering technique was picked based on the highest silhouette score. The analysis showed that the K-means clustering technique outperformed the other techniques; hence, it was ultimately selected. In the second step, we used two measures to find the optimal number of clusters: the silhouette score and the sum of squared distances. Based on these two measures, we found the optimal number of clusters to be 5; the full results of the second step are provided in the Supplementary Materials.
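For reference, the silhouette score used in both steps can be sketched as follows, using Euclidean distance and the common convention that points in singleton clusters score zero. The points and the two labelings are hypothetical, the function name is illustrative, and the sketch assumes at least two clusters.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical 2-D points with a sensible and a poor cluster assignment.
demo_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
good_score = silhouette_score(demo_points, [0, 0, 0, 1, 1, 1])
poor_score = silhouette_score(demo_points, [0, 1, 0, 1, 0, 1])
```

A higher score indicates compact, well-separated clusters, so comparing scores across techniques or across candidate numbers of clusters follows the selection logic described above.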
The results of the K-means clustering technique are shown in Figure 4, where clusters are plotted using the two resulting ISOMAP dimensions. The list of cities per group is shown in Table 3. By looking into the properties of the cities present in each cluster, we can establish a classification of world urban road networks.

Cities Classification
Figure 5 shows a map of all cities color-coded based on their class. Cities in class 1 are referred to as Gridiron Cities since they are characterized by having comparatively more 90° angles than cities in other classes. These kinds of cities are typically the result of deliberate planning practices, aiming to achieve an "efficient flow of traffic" [73]. Furthermore, cities in class 1 tend to have as many nodes with one connection as nodes with two connections. Generally, the grid layout or orthogonal layout tends to have intersection angles of 90° and/or 180°. For example, Buenos Aires, which is one of the deliberately planned cities, exhibits the grid layout and has an equal number of nodes with three and four connections. Grid layouts, in most cases, result in square or rectangular block shapes. This also aligns with the conclusion of Louf and Barthelemy [74] that Buenos Aires consists predominantly of square or rectangular blocks. Moreover, other Latin American cities, namely Lima and Santiago, are also found to belong to the Gridiron Cities class (perhaps as a result of the "Laws of Indies" mentioned above). Despite having evolved over millennia, Baghdad also belongs to class 1 since it evolved significantly in the 1960s, following modern planning practices, hence giving rise to more "rigid rectilinear grid systems of roads, and highways" [75].
Cities in class 2 consist predominantly of Chinese cities that have grown substantially at the turn of the 21st century and that are characterized by longer link lengths (i.e., large block sizes). In other words, link lengths greater than the 90th percentile are more prevalent in these cities. They are referred to as Long Link Cities. This pattern is consistent with the "leapfrog development" that is common in the urban development of many Chinese cities [76]. Specifically, leapfrog development is characterized by developers skipping large areas to acquire cheaper land further away, resulting in a form of urban sprawl. Over time, this type of development practice generally causes polycentricity [77]. Lobsang et al. [55] studied 31 Chinese cities to understand the relationship between their street network properties and economic development. In their study, the authors identified that many cities with longer street lengths were polycentric. While that study focused purely on Chinese cities, a similar study by Wang et al. [78] compared the Chinese city of Xiamen with Washington DC and San Jose, and found that the average road segment length of Xiamen is greater than that of the two American cities. In our study, we also found that most of the Chinese cities have more nodes with two connections than nodes with one connection.
Cities in class 3 consist mainly of radial cities that have developed over many centuries, often organically. They are referred to as Organic Cities. Unlike Gridiron Cities, these cities did not evolve by continuous directed planning. Instead, they evolved by adapting to local circumstances, such as wars, throughout history. We found that cities in this class were characterized by shorter link lengths and a higher proportion of irregular street patterns (i.e., street angles that are not 90° or 180°). All the cities in this class are European cities, echoing the results of other studies. For instance, Strano et al. [50] studied 10 European cities and showed that they share common structural geometric properties. The authors also showed that all 10 cities had shorter link lengths. Liu and Jiang [79] compared three European cities with three North American cities and showed that the European cities have more short street segments and irregular street patterns. Further, Kaoru et al. [80] studied the road networks of 30 European cities and identified similarities in the spatial distribution of road segments at a larger scale (more than 1 km radius).
Cities in class 4 are characterized by a uniform distribution of link lengths and a higher proportion of 90° angles. They are referred to as Hybrid Cities. Many of the cities present in class 4 have evolved both historically and recently. The city of Chicago and the many municipalities around Chicago, for example, grew substantially in the second half of the 19th century, before the advent of the private automobile, hence the presence of many shorter links. As the region continued to grow in the 20th century, longer links were built to link these municipalities, eventually leading to urban sprawl, hence the almost uniform distribution in link length. A similar observation was made by Lee and Jung [73], who found that the core areas of cities exhibit urban characteristics, whereas the outer areas exhibit rural characteristics. The authors define urban characteristics as the concentration of "density of the roads and intersections" in the central area, with lower density in the rural (outer) areas. Therefore, Hybrid Cities can be viewed as cities that have the structural properties of both urban and rural streets.
Cities in class 5 contain a mix of characteristics from the other classes. Cities are the result of long-term evolution and are influenced by multiple factors, including social, economic, environmental, and landscape factors [81]. The growth process is also dynamic, with new roads being paved to access already established places such as city centers and then spreading outwards [82]. Thus, in general, many cities share similarities, particularly cities with similar social, economic, and landscape features. This fact is further exhibited in Figure 4, as cities in class 5 are located in the center of the figure (shown in green). Accordingly, cities in class 5 have both some shorter links and some longer links. They also have a non-negligible number of street angles that are not 90° or 180°. These cities are referred to as Mixed Cities as they possess mixed characteristics.

Research and Policy Implications
The proposed methodology can help both transport planners and policymakers identify structurally similar cities. The geometry and topology of road networks are known to impact network performance. Yet, despite the presence of correlations between network properties and performance, the underlying factors that contribute to these correlations are underexplored [83]. Historically, designing road networks included altering existing networks or adding new roads without taking their impacts on the whole network into account [84]. Such alterations are generally carried out to achieve a particular outcome; for example, to reduce traffic congestion or promote active transport. For instance, in the 1950s and 1960s, the private vehicle was often seen as a potential solution to many transport problems [85]. This idea was reflected in the design of road networks: longer links, grid layouts, and so on. Identifying cities that are structurally similar can allow transport planners to find cities in their cluster that have better performance.
Furthermore, the classes identified in this study can be used to study urban design principles that can be adopted/replicated by transport planners. For example, analyzing traffic accident patterns in each class may help transport planners identify important causal variables for transport safety analysis. Additionally, in active transport, cities such as Amsterdam, Copenhagen, and Lund are considered frontrunners in Europe [85]. Finding in which class these cities belong can inform transport planners on the role of road network properties in recommending active transport policies. Therefore, gaining an understanding of the characteristics of the road network of a city is important to be able to develop effective design practices and select appropriate policies to address urban challenges. In addition, the methodology presented in this study can help policymakers to identify already implemented smart city policies in different cities that share structural road network similarities.
Furthermore, many cities in low- and middle-income countries might not have the necessary resources to experiment in the pursuit of smart city development. As a result, cities can look toward their peer cities to gain insights that can make them "smarter" [86]. Further, cities often have urban problems that are similar to those of their peers, and learning urban practices from their peers can improve the understanding of what works and what does not [87]. Hodgkinson [88] argues that cities turning towards their peers create opportunities to learn and "[share] insights, ideas, and solutions." However, this is rarely practiced, partly because of the lack of a "global inter-urban perspective" in smart city development [89]. The methodology proposed in this study offers a way to select structurally similar peer cities. Based on this work, we posit that smart transport policies implemented in cities of one class may not be as effective in cities of another class. For instance, Guo et al. [90] found that the same smart city policy aimed at reducing traffic congestion did not yield the same improvements across different cities in China. Although the authors stated the significance of road networks in analyzing traffic congestion, road network information was not included in their study. Moreover, the authors highlighted the lack of methodologies that can accurately classify cities based on road networks. We note that further investigation is required to assess the performance of cities in each class. Investigating this question and defining and assessing the performance of a specific road network structure is the goal of future work.

Conclusions
The geometric and topological characteristics of urban road networks can play an important role in shaping their performance and the travel behavior of city residents. Using geometric and topological data, this article classified the urban road networks of 80 world cities. Degree distribution was used as a measure of network topology, and link length distribution and street angle distribution were used as measures of network geometry. An initial analysis showed the presence of a universal pattern expressed by a clear linear relationship between the number of nodes and the number of links across all cities present in the dataset. It further showed that an average of 1.33 links was observed for every node in the network regardless of the size of the network. To successfully cluster urban road networks, the vector of 44 values measured for each city was transformed with ISOMAP to reduce the number of dimensions to two, both to meaningfully express the variation in the data and to improve clustering performance. Of the different clustering algorithms tested, K-means clustering performed best based on silhouette score. We found through further analysis that the optimal number of clusters was five; that is, we found five classes of cities.
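The dimension reduction and clustering steps summarized above can be illustrated with a minimal Python sketch. It is not the code used in this study: it assumes the scikit-learn library, uses a random 80 × 44 matrix as a stand-in for the city feature vectors, and the function name `embed_and_cluster`, the neighborhood size `n_neighbors=10`, and the candidate range of cluster counts are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import Isomap
from sklearn.metrics import silhouette_score


def embed_and_cluster(features, k_values=range(2, 9), n_neighbors=10, seed=0):
    """Reduce feature vectors to 2-D with ISOMAP, then choose k by silhouette score.

    n_neighbors and k_values are illustrative assumptions, not values from the study.
    """
    # Nonlinear dimension reduction: 44 values per city -> 2 coordinates per city.
    embedding = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(features)

    # Run K-means for each candidate k and record the silhouette score.
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(embedding)
        scores[k] = silhouette_score(embedding, labels)

    # Keep the clustering with the highest silhouette score.
    best_k = max(scores, key=scores.get)
    labels = KMeans(n_clusters=best_k, n_init=10, random_state=seed).fit_predict(embedding)
    return embedding, labels, scores


# Synthetic stand-in for the 80 x 44 matrix of city features (NOT the study's data).
rng = np.random.default_rng(0)
features = rng.random((80, 44))

embedding, labels, scores = embed_and_cluster(features)
```

With real data, each row of `features` would hold the binned degree, link length, and intersection angle distributions of one city, and `scores` would be inspected alongside a sum-of-squared-distances plot before settling on the number of classes.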
Class 1 consists of Gridiron Cities that show a higher number of 90° street angles. Class 2 consists mostly of Chinese cities. These cities have a higher percentage of longer links than any other cities considered in this study and were named Long Link Cities. Cities in class 3, named Organic Cities, consist mostly of radial cities such as Paris, Rome, and Madrid; these cities have shorter links and many angles that are not 90° or 180°. Class 4 cities, named Hybrid Cities, include cities that have evolved both historically and recently; these cities tend to have a uniform distribution of link lengths and a higher proportion of 90° angles. Cities in class 5 exhibit a mix of properties from all other classes. Cities in this class have both some shorter links and some longer links, and they also have a non-negligible number of street angles that are not 90° or 180°; thus, they are named Mixed Cities.
The findings of our study show that there is a clear difference in the topological and geometric properties of road networks between classes of cities. Identifying these characteristics among cities can help transport planners to improve the performance of transport systems. Further, the findings may help policymakers in developing transport policies that are appropriate to cities based on the characteristics of their class. The impacts of road network characteristics on the successful execution of transport policies, especially urban mobility policies, however, need further research. Future work should focus on studying the impacts of the network characteristics identified on the performance of a road system. Ultimately, understanding the road network characteristics of a city can be helpful in developing transport policies and in improving the design of existing transport networks to meet contemporary challenges.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/urbansci6010011/s1, Figure S1: Degree distribution of all 80 cities, Figure S2: Link length distribution of all 80 cities, Figure S3: Street intersection angle distribution of all 80 cities, Figure S4: Silhouette score measure to find optimal number of clusters, Figure S5: Sum of squared distance measure to find optimal number of clusters.