Identification Method for Optimal Urban Bus Corridor Location

Abstract: Locating urban bus corridors based on corridor characteristics can increase the transportation capacity, improve transportation efficiency, and increase the attractiveness and commercial value of urban bus corridors. In this paper, we describe the comprehensive optimization of the urban bus corridor location and setting of bus lanes, while considering the aggregation effect of the corridor. First, we use a K-shortest path algorithm to generate a candidate set of bus corridors. Then, we analyze the influencing factors of the bus corridor. Following this, we take the minimum generalized cost and the maximum aggregation utility along the path as the objective function and design a bus corridor location identification optimization model, considering arc capacity, plot ratio, corridor development, and time constraints. Finally, we examine the real-world example of Beijing and identify the location of the bus corridors in the morning and evening peak hours. The one-way traffic of most of the roads identified as bus corridors was found to be greater than 6671 people/h. Thus, the location of the bus corridor and setting of bus lanes in the corridor are closely related to passenger flow, and the method can provide scientific guidance for transportation and urban planning departments and facilitate collaboration between these departments.


Introduction
Increased urbanization has led to the continuous expansion of some large cities and an increase in the number of people and vehicles in these cities. This has led to problems such as increasing travel distance and travel costs within the city, and an increase in the urban congestion and environmental degradation. Therefore, priority should be given to the development of public transportation and the improvement of public transportation services for the sustainable development of transportation services.
However, in some large cities, the capacity of public transportation services, such as rail and ground buses, may not meet the travel needs of residents, especially in the morning and evening peak hours. A large number of long-distance trips causes road congestion and bus congestion. For example, in the Beijing rail transit road network in 2018, there were 10 sections with a maximum cross-section passenger flow of more than 40,000 people per hour, one section with a maximum cross-section passenger flow of 60,000 people, and 21 sections with a maximum section full load rate exceeding 100%; these values were observed in the morning and evening peak hours and were found to significantly reduce the quality of the passengers' travel experience [1]. In some cities, ground buses have low transportation efficiency, with the nationwide average whole-journey speed as low as 10.3 km/h [2]. To increase the capacity of urban public transportation and improve the transportation efficiency of ground buses, a sustainable public transportation system should be established. A large number of studies have shown that public transportation development and land use should be coordinated at the urban level [3,4], through mutual support and comprehensive utilization between the two [5][6][7], to achieve a reasonable distribution of vehicles on roads, improve travel efficiency, and reduce urban pollution and noise [8].
The urban bus corridor is both a platform where urban bus passengers gather and a center for urban economic activities and urban functions. Bus corridors often have dense passenger flow. Generally, lanes are set up in the corridors to facilitate ground bus transportation; they have independent road rights and high speed and transportation efficiency, which can greatly improve the service level and further increase the passenger flow.
The bus corridor has economic, social, and environmental benefits: it improves public transportation services, reduces passenger travel time, enhances the passenger travel experience, and eases urban congestion. Additionally, it can realize the aggregation of passenger flow and city activities, improve the efficiency and attractiveness of bus travel, and optimize the structure of urban travel. Land and real estate development on both sides of the corridor can increase the land value along the corridor, thus bringing economic benefits. For example, several projects were implemented on the nodes and areas surrounding the bus rapid transit (BRT) East Line in Pittsburgh, USA, with a total investment of 302 million US dollars. Suzuki et al. found that the construction of the BRT corridor prompted a 30% increase in real-estate prices, thus promoting urban economic development and the formation of social spaces [9]. Cervero and Dai combined BRT and transit-oriented development and, based on case studies in Bogota, Curitiba, Seoul, and other cities [10,11], showed that BRT can improve transportation in crowded cities and enhance land-development intensity along the BRT corridor. Most studies on public transport corridors analyzed the impact of the centralized transportation system on potential land-use changes [12] and the impact of traffic corridors on urban areas and urban forms [13].
It is necessary to identify urban bus corridor locations and decide their development mode.

Literature Review
For the center transit-oriented development, the urban public transportation corridors are divided into three basic types: the destination connection, commuting, and block circulation corridors [14]. The destination connection corridors mainly connect the residential areas with various activity centers, and the commuting corridors mainly serve the core activity centers of the city and residential areas during morning and evening peak hours; the block circulation corridors mainly connect the major functional areas in a certain area, such as the city center, medical functional areas, and educational functional areas; and the main purpose of the bus corridor is to connect the main functional areas of the city. In locating a bus corridor, the functions and population distribution on both sides of urban roads need to be considered. Common urban bus corridor location methods are empirical judgment, travel expectation, and two-step clustering methods [15][16][17][18][19][20]. The empirical judgment method determines the direction of the corridor based on the population and job position distribution of the area; owing to its limitations, it is often used as a preliminary analysis for the judgment of the bus corridor. In the travel expectation method, the bus corridor location is typically determined in four steps: first, the occurrence and attraction of traffic is predicted according to the land-planning model; second, the travel origin-destination (OD) matrix is obtained by using the traffic-distribution method; third, the bus travel volume is divided by the transportation mode; fourth, passenger flow is distributed on the road network through the passenger flow distribution method, and the direction of the traffic corridor is determined according to the size of the passenger flow on each road [15]. A geographic information system is always used to identify the transit corridor with the travel expectation method [16,17]. 
In the two-step clustering method, the demand points are first clustered in order to reduce the number of demand points; then the passenger flow between the demand points is determined through the clustering method, and, finally, the direction of the traffic corridor is determined [18]. Yu Shijun used the concepts of effect field and medium in the field to consider the population density and travel generation density and established an effect field function to determine the urban traffic corridor [19]. Kong Zhe improved the two-step clustering and travel expectation methods and designed a two-stage dynamic clustering method (direction determination followed by supporting-road determination) to improve the accuracy of bus-corridor identification [20].
Sustainability 2020, 12, 7167
Most research identified corridors by using only OD demand and clustering techniques. For example, the TraClus-DL algorithm aims to identify what is called the "demand corridor." Some researchers used this algorithm to evaluate, identify, or optimize collective transportation systems, such as bus or train lines [21][22][23]. Qiu et al. proposed a spatial clustering-based algorithm (P-DN), based on OD demand, which makes it possible to obtain the desired cluster lines so that the main bus corridors can be identified [24]. The key elements for bus-corridor-location identification are the urban key nodes and their links [15]. In existing research, the key nodes of the bus corridor are mostly based on known bus nodes or transportation analysis zones [15][16][17][18][19][20]. Few studies have focused on the whole urban area, and a method for identifying key nodes is also lacking. With the development of data acquisition techniques, data sources such as cellular signaling data and smart card data are used in urban transit planning. The transit-oriented development (TOD) concept is also considered in the identification of urban active areas [25,26]. However, studies on the location of bus corridors give less consideration to the organic combination with the actual road structure, the economic performance, the activity, and the land-use patterns along the corridor, which are the key elements reflecting urban aggregation. Therefore, the positive impact of bus corridors on urban development is limited.
In addition, urban traffic corridors are developed as important channels within cities. Traffic demands and population distribution are clustered along bus corridors, and there is a strong demand for public transportation facilities. Meanwhile, in the corridor-integrated bus transit design strategy, a dedicated bus lane is created, with other design elements, for effectively improving the overall transport efficiency of the corridor. This, along with the comprehensive optimization of bus lanes, especially the development mode of bus lanes in bus corridors, can improve the efficiency of bus corridors.
In this study, we considered the Beijing urban area for our research and used a clustering algorithm to identify areas with clustering effects. Based on multidimensional data and OD, we analyzed the influence of multiple factors such as land use, population distribution, economic development, and social activities on the location of bus corridors. With the objective of maximizing the aggregation effect of bus corridors and minimizing the setting cost of these corridors, we identified locations for bus corridors with the gathering and attraction effects.
This study considered the Beijing urban area as an example to verify the two-stage bus-corridor identification method proposed in this paper. The main contributions of this study are as follows:
(a) In corridor identification, the influence of land use, activity density, and other practical factors on the distribution of the corridors is carefully considered.
(b) In the corridor identification method, according to the actual situation of urban roads, the development intensity and development plans of the transportation facilities in the corridor are determined.
The remainder of this paper is structured as follows. In Section 3, we describe the comprehensive optimization problem involving the identification of urban bus corridors and setting of bus lanes. In Section 4, we introduce the node identification method and mathematical optimization model for the development of corridors and internal transportation facilities. In Section 5, we explain a case study of the Beijing downtown area, in which we identified the locations with peak traffic hours in the morning and evening. In Section 6, we outline the conclusions of this study.

Problem Description
In this study, we comprehensively optimized the methods for setting bus corridors and bus lanes by considering the characteristics of the nodes along the corridors. The problem can be described as follows: making a decision on whether to set up a bus corridor in the urban road network to improve the transportation efficiency and economic utility of the corridor, while considering the capacity constraints of the road sections. Figure 1 shows an urban road network with potential Passenger Demand Points 1-8, which can be connected through urban road paths. The capacity for each road section is also given in the figure. The assumed OD demands for the road network are presented in Table 1. The candidate lines were calculated by using the K-shortest path algorithm. Considering OD Demand Points 1→8 as an example, when K = 2, we obtained candidate lines 1→3→6→8 and 1→2→5→8. Generally, if the transportation capacity of the paths cannot meet the OD requirements, we must determine whether to set these paths as bus corridors.
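The candidate-generation step can be illustrated with a minimal Python sketch. It is a simple best-first enumeration of loop-free paths, not necessarily the K-shortest path variant used in the study, and the arc costs below are invented for illustration (the values in Figure 1 are not used). The names `k_shortest_paths` and `network` are hypothetical.

```python
import heapq

def k_shortest_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target, cheapest first.

    graph maps each node to a dict {neighbor: non-negative arc cost}.
    Partial paths are expanded in order of accumulated cost, so complete
    paths leave the heap in non-decreasing cost order.
    """
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep the path simple (no repeated node)
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found

# Hypothetical network: arc costs are assumptions, not data from the paper.
network = {
    1: {2: 2.0, 3: 1.0},
    2: {5: 2.0},
    3: {6: 2.0},
    5: {8: 2.0},
    6: {8: 2.0},
}
candidates = k_shortest_paths(network, 1, 8, 2)
for cost, path in candidates:
    print(cost, path)
```

With these assumed costs, K = 2 yields the two candidate lines named in the example. The enumeration can grow exponentially on dense networks, so a dedicated algorithm would be preferable at city scale.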
If the road path is identified as a bus corridor, the capacity of the road path should be increased considering local conditions. If the road path is in the city center and cannot meet the conditions for road reconstruction and expansion, traffic management methods should be used to set a certain lane as a dedicated bus lane. If the identified road path meets the conditions for road reconstruction and expansion, an independent bus lane can be added. Improvement in the transportation capacity of the road path may affect the passenger flow in other transportation modes on the service arcs and introduce identification costs. However, the recognition of these service arcs as public transport corridors can enhance their attractiveness toward the service areas along the route. By balancing the cost of the bus corridor development and the effectiveness of the bus corridor, the passenger demand in the road network can be distributed to each road segment efficiently. Therefore, selecting a suitable road path for development as a bus corridor to better allocate passenger flow in the road network is fundamental to this problem.
[Figure legend: urban key node; bus corridor; urban road.]

Methodology
The framework of the methodology is shown in Figure 2. The methodology is divided into two steps. First, we consider the functional nodes of the entire city as a candidate set, and the main service area of the bus corridor is determined by using a clustering algorithm. Second, an integrated design for the direction of the bus corridor and the development of bus facilities inside the corridor is established through a mathematical model.

Sets
In this study, $G = (N, A)$ is a simplified urban road network consisting of nodes and the road sections between them; $N$ is the set of stations in the urban road network, indexed by $n \in N$; $D$ is the set of bus passenger flow OD pairs, indexed by $d \in D$; $A$ is the set of road sections in the urban road network, indexed by $a \in A$; $P$ is the set of paths used by passengers in the urban road network, indexed by $p \in P$; $N_p \subseteq N$ is the set of nodes in path $p$; $A_p \subseteq A$ is the set of arcs in path $p$; $R$ is the set of expansion plans for the stations in the urban road network, indexed by $r \in R$; $Q$ is the set of alternatives for including a section of the urban road network in the urban bus corridor, indexed by $q \in Q$; $Q_u \subseteq Q$ is the set of development plans in which sections of the urban road network are included in the urban bus corridor; $U$ is the set of super sections in the urban road network used to allocate unmet needs, indexed by $u \in U \subseteq P$; and $T$ is the set of city institution types, indexed by $t \in T$, comprising residential, office, financial, catering, education, and medical institutions.

Parameters
In this study, $Q_d$ is the passenger traffic flow of the $d$-th OD; $\delta_p^a \in \{0, 1\}$ and $\delta_p^n \in \{0, 1\}$ are parameters that indicate whether path $p$ passes through arc $a$ and node $n$, respectively; $T_d$ is the longest travel time acceptable for the $d$-th OD demand; $T_{stop}$ is the stop time; $c_a$ is the unit time cost of passengers on arc $a$; $t_a$ is the transportation time of arc $a$; $l_a$ is the transportation distance of arc $a$; $\omega_a$ is the transportation capacity of arc $a$; $\rho_a^q$ is the newly added bus passenger flow on arc $a$ under bus corridor development plan $q$; $\theta_a^q$ is the unit distance cost of applying bus corridor development plan $q$ to arc $a$; $\vartheta_a$ is the average plot ratio of the surrounding plots covered by arc $a$, where the plot ratio is the ratio of the floor space of a building to its lot size; $\vartheta_p$ is the average plot ratio of the surrounding plots covered by path $p$; $l_p$ is the transportation distance of path $p$; $\Phi_p$ is the route diversification level of path $p$; $\Omega_p$ is the gravitational index along path $p$; $\Omega_{ij}$ is the gravity index between nodes $i$ and $j$, with $i, j \in N$; $\lambda_n$ is the mixed land-use index at node $n$; and $V_n$ is the node value of node $n$.

Key Node Identification
Hierarchical density-based spatial clustering of applications with noise (HDBSCAN) is an efficient density clustering algorithm developed by Campello et al. [27]. It combines the DBSCAN algorithm with hierarchical clustering. Through a measure called the mutual reachability distance, it addresses a shortcoming of DBSCAN, OPTICS (ordering points to identify the clustering structure), and other density clustering algorithms, namely the difficulty of determining parameters such as the neighborhood radius and the minimum number of points. The HDBSCAN software package (version 0.8.12) for Python (version 2.7.5) was used, and the clustering results were displayed by using Geographic Information System (GIS) software. For application-level researchers, the HDBSCAN method generally requires only the minimum cluster size to be specified, but the concepts of core point, neighborhood radius, and minimum cluster size are inherited from DBSCAN and other density clustering methods. In this study, the commercial buildings in the city were clustered with the HDBSCAN algorithm, based on crawled point of interest (POI) data, to obtain the key nodes of the city.
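To make the mutual reachability distance concrete, the following minimal sketch computes it in plain Python (3.8 or later, for `math.dist`) rather than through the HDBSCAN package. It assumes the usual definition, the maximum of the two core distances and the direct distance, and it takes the core distance as the distance to the k-th nearest other point; whether the point itself is counted varies between implementations. The coordinates are hypothetical, not POI data from the study.

```python
import math

def core_distance(points, i, k):
    """Distance from points[i] to its k-th nearest other point."""
    dists = sorted(math.dist(points[i], p)
                   for j, p in enumerate(points) if j != i)
    return dists[k - 1]

def mutual_reachability(points, i, j, k):
    """max(core_k(i), core_k(j), d(i, j)) for points i and j."""
    return max(core_distance(points, i, k),
               core_distance(points, j, k),
               math.dist(points[i], points[j]))

# Hypothetical planar coordinates (arbitrary units), for illustration only.
pois = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
d_dense = mutual_reachability(pois, 0, 1, k=2)   # two points in the dense group
d_sparse = mutual_reachability(pois, 0, 3, k=2)  # dense point vs. isolated point
print(d_dense, d_sparse)
```

Here the pair inside the dense group is pushed slightly apart (from 1 to about 1.41) because one of the two points has a larger core distance, while the isolated point stays far from everything, which is what lets the hierarchy separate clusters from noise.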

Key Node Evaluation
Traffic corridors mainly serve areas generally characterized by high density, diversity, and agglomeration, where the urban population and economic activity are concentrated. Therefore, most of these areas have a high potential travel demand. Greater population, building, and employment densities lead to a more obvious clustering effect [28], and a higher degree of diversification leads to a richer urban function. In addition, the level of economic development also suggests the activity of the region to a certain extent [29,30].
Based on Reference [26], we analyzed the factors that help evaluate the key nodes of the city. As shown in Table 2, nine indicators are selected to evaluate the key nodes through a spatial multicriteria analysis. Density is measured by different types of institution density, which could reflect the aggregation of job opportunities and land use intensity. Diversity is used to measure the mixed land use because a developed area should be sufficient in multiple functions [31]. Land value is used to judge the development of the economy, for places with high land value always attract commerce and trade. Population is used to measure the static distribution of humans. Human vitality is measured by human activity frequency. The check-in times of social Applications (APPs) reflect the dynamic human performance and their preference.
The value of the key node $n$, $V_n$, is mainly a comprehensive reflection of density, functional diversity, land value, and population, and the weight of each indicator is obtained from Reference [26].
Defining a key node always involves a multi-scale spatial phenomenon. Studies commonly use a 700-1500 m buffer around key nodes [32][33][34], because this is a reasonable distance, within an 8-10 min walk. Considering the average stop spacing, we performed our analysis in a 1000 m buffer around each node. The land on both sides of bus corridors is generally characterized by a high plot ratio, high-density composite development, and mixed land use. Stations such as Shinjuku and Ginza in Tokyo have plot ratios as high as 10-15. In Curitiba, the plot ratio is higher toward the bus corridor, and land development outside two blocks is restricted. The intensity of the land development along the corridor affects the building and population density. A higher intensity of land development results in greater potential travel demand.
In view of the imbalance and complex statistics of the population and employment distribution, we selected the plot ratio to measure the land-development intensity along the road. As a comprehensive control index of urban land use, the plot ratio is a measure of the urban building density and building height, which characterizes the intensity of land use for development. Areas with a high plot ratio generate a larger demand for passenger flow, which directly affects the distribution of the bus corridors and traffic nodes.
In this paper, we define $\vartheta_p$ as the average plot ratio of the plots passed by path $p$, which describes the land development along the corridor:

Corridor Gravitational Effect
The flow of information, people, and objects between different regions of a city is a measure of the connectivity between these regions, which is important for guiding urban planning. Such research is conducted in two ways: by quantifying the connections based on dynamic data, including communication, OD data, and social networks; or by measuring the relationship through a gravity model, in which the attractiveness between two nodes is measured through the population ratio of the two nodes and a function of the distance between them. The gravity model is generally used to calculate the attractive force between two points, and the index between nodes is constructed here on the basis of the traditional gravity model. The gravity index $\Omega_{ij}$ between nodes $i$ and $j$, which denote two different bus stations, is calculated as follows:
where $V_i$ and $V_j$ are the values of nodes $i$ and $j$, respectively; $f_{ij}$ is the cost of traveling between nodes $i$ and $j$, usually expressed in units of distance; and $k$, $\alpha$, and $\beta$ are constants.
The gravity index assumes that the attractive force between two nodes is related only to the travel cost between the two nodes in contact. A higher gravitational index results in a higher attraction between two nodes, and a higher travel cost between two nodes results in a lower attractive force. The set of nodes in path $p$ is denoted as $N_p$, and the number of nodes as $K$; the gravity index along path $p$ can be expressed as follows:
$$\Omega_p = \frac{1}{C_K^2} \sum_{i, j \in N_p,\; i < j} \Omega_{ij}$$
where $C_K^2 = K(K-1)/2$ is the number of unordered pairs that can be formed from the $K$ nodes, so $\Omega_p$ is the average value of the gravitational index between all traffic nodes on path $p$. Figure 3 shows an example of the path attraction effect, where path $p$ passes through four nodes, i.e., $K = 4$. Then, the gravity index $\Omega_p$ along the line is the average value of $\Omega_{12}$, $\Omega_{13}$, $\Omega_{14}$, $\Omega_{23}$, $\Omega_{24}$, and $\Omega_{34}$.
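The pairwise averaging can be sketched in a few lines of Python. The functional form inside `gravity_index` is an assumption: the text names the constants $k$, $\alpha$, and $\beta$ but the placement of the exponents is a guess, so that function should be read as a placeholder. Only the averaging over all unordered node pairs follows directly from the description above. Node values and travel costs are hypothetical.

```python
from itertools import combinations

def gravity_index(v_i, v_j, f_ij, k=1.0, alpha=1.0, beta=1.0):
    """Assumed gravity form: k * (v_i * v_j)**alpha / f_ij**beta."""
    return k * (v_i * v_j) ** alpha / f_ij ** beta

def path_gravity_index(nodes, value, cost, **params):
    """Mean gravity index over all unordered node pairs on one path."""
    pairs = list(combinations(nodes, 2))  # C(K, 2) pairs for K nodes
    total = sum(gravity_index(value[i], value[j], cost[(i, j)], **params)
                for i, j in pairs)
    return total / len(pairs)

# Hypothetical node values V_n and pairwise travel costs f_ij (K = 4).
value = {1: 2.0, 2: 1.0, 3: 1.0, 4: 2.0}
cost = {(1, 2): 1.0, (1, 3): 2.0, (1, 4): 4.0,
        (2, 3): 1.0, (2, 4): 2.0, (3, 4): 1.0}
omega_p = path_gravity_index([1, 2, 3, 4], value, cost)
print(omega_p)
```

For the four-node case this averages six pairwise indices, matching the $K = 4$ example in the text.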

Corridor Functional Diversity
As a corridor for the gathering and circulation of the passengers and logistics, information flow, and commercial flow, the bus corridor is a bridge connecting the main functional areas in the city. Along the corridor, there are many complete commercial, entertainment, cultural, technological, and other facilities with diversified functions to ensure the efficient transportation of many passengers, thus enhancing the attractiveness and commercial value of the bus corridor.
The bus corridor connects the key nodes of the city, and the area around each station should gather as many urban functions as possible, such as residential, commercial, industrial, entertainment, education, and medical. The mixed use of land in the area surrounding the stations reduces the travel demand between different areas, thereby reducing the demand for motor vehicle travel and promoting healthy and green travel modes, such as walking and cycling. Therefore, the land mixed-use index of every key node along the bus corridor should be as high as possible. The land mixed-use index $\lambda_n$ at node $n$ can be calculated as follows [35]:
where $t$ indexes the function types around the site, such as residential, office, finance, education, and medical care, and $P_n^t$ is the proportion of land function type $t$ in the area around node $n$. In this study, we define the average diversification level along path $p$ as the diversification index along path $p$:
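As an illustration of these two indices, the sketch below assumes an entropy-type mixed-use measure, $-\sum_t P_n^t \ln P_n^t$, which is a common choice but may differ from the formula in Reference [35] (for example, in normalization). The path-level index is taken as the plain mean of the node indices, following the wording "average diversification level". All share values are hypothetical.

```python
import math

def mixed_use_index(proportions):
    """Entropy of land-function shares: -sum(p * ln p) over non-zero shares."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

def path_diversification(node_shares):
    """Mean mixed-use index over the key nodes of one path."""
    indices = [mixed_use_index(shares) for shares in node_shares]
    return sum(indices) / len(indices)

# Hypothetical shares of residential, office, finance, education, medical land.
single_use = [1.0, 0.0, 0.0, 0.0, 0.0]
balanced = [0.2, 0.2, 0.2, 0.2, 0.2]
phi_p = path_diversification([single_use, balanced])
print(mixed_use_index(single_use), mixed_use_index(balanced), phi_p)
```

Under this assumed form, a single-use area scores zero and an evenly mixed area scores the maximum, $\ln 5$ for five function types, so a higher path average indicates more functionally mixed key nodes.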

Mathematical Model
For constructing a mathematical optimization model for urban bus corridor location identification (UBCLI), the network topology, road attributes, and other factors should be known.
To simplify the research and make the model more practical, we assume that the function distribution and plot ratio on both sides of the road are known and that the passenger flow demand within the network is fixed and known.
The objective function of the model also considers the cost of setting up the corridor, the cost of pedestrian travel, and the aggregation utility of the corridor.

Generalized Cost
The cost of the UBCLI model must consider not only the pedestrian travel cost but also the cost of the development of the bus corridor.
The main purpose of establishing the bus corridor is to improve the efficiency of pedestrian travel and minimize the travel time. The bus corridor should not only serve the economic development of the city but also provide convenient pedestrian travel. The pedestrian travel cost $C_1$ can be expressed as follows:
If arc $a$ is included in the candidate set of bus corridors, the corridor development cost $C_2$ is as follows:
The generalized cost function $F_1$ is the sum of the pedestrian travel cost $C_1$ and the corridor setup cost $C_2$:
$$F_1 = C_1 + C_2$$
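The structure of the generalized cost can be sketched as follows. The two component formulas are assumptions built from the parameter list ($Q_d$, $c_a$, $t_a$, $\theta_a^q$, $l_a$): travel cost is taken as flow times the time cost of the chosen path, and development cost as unit distance cost times arc length for each arc given a plan. The selections `chosen_path` and `chosen_plan` stand in for decision variables and are hypothetical names, as are all numbers.

```python
def travel_cost(flow, chosen_path, path_arcs, unit_time_cost, arc_time):
    """Assumed C1: flow of each OD pair times the time cost of its path."""
    total = 0.0
    for d, q_d in flow.items():
        arcs = path_arcs[chosen_path[d]]
        total += q_d * sum(unit_time_cost[a] * arc_time[a] for a in arcs)
    return total

def development_cost(chosen_plan, unit_dist_cost, arc_length):
    """Assumed C2: unit distance cost of the selected plan times arc length."""
    return sum(unit_dist_cost[(a, q)] * arc_length[a]
               for a, q in chosen_plan.items())

# Hypothetical two-arc example; every value is illustrative.
path_arcs = {"p1": ["a1", "a2"]}
flow = {"d1": 100.0}                          # Q_d
chosen_path = {"d1": "p1"}                    # path assigned to each OD pair
unit_time_cost = {"a1": 1.0, "a2": 1.0}       # c_a
arc_time = {"a1": 0.5, "a2": 0.25}            # t_a
arc_length = {"a1": 2.0, "a2": 1.0}           # l_a
unit_dist_cost = {("a1", "bus_lane"): 10.0}   # theta_a^q
chosen_plan = {"a1": "bus_lane"}              # plan q selected for arc a

c1 = travel_cost(flow, chosen_path, path_arcs, unit_time_cost, arc_time)
c2 = development_cost(chosen_plan, unit_dist_cost, arc_length)
f1 = c1 + c2
print(c1, c2, f1)
```

Only the final sum, $F_1 = C_1 + C_2$, is taken directly from the text; the two components should be replaced by the model's own expressions when those are available.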

Corridor Aggregation Utility
The bus corridor has the characteristics of intensive land use and clustering. In this study, we calculated the utility of the corridor considering the gravitational and diversified utilities.

Corridor Gravitational Utility
We used the gravity index along the path to define the utility of gravity and measure the attractiveness of the key nodes of the city. The total gravitational utility of the selected path is as follows:

Corridor Diversified Utility
The diversification of the bus corridor is divided into the diversification level of the route and the diversification level of the functions of the key nodes along the route. The corridor diversification utility value is the diversification level along the bus corridor.
The UBCLI model has three objective functions, namely F 1 , F 2 , and F 3 , which minimize the cost function and maximize the utility functions. Because the objective functions have different dimensions, they need to be normalized. Based on Reference [36], we transformed the multi-objective programming problem into a single-objective programming problem, as expressed in (11) and (12). F 1max , F 2max , and F 3max are the maximum values of functions F 1 , F 2 , and F 3 , respectively, which need to be calculated in advance.
where ω 1 , ω 2 , and ω 3 are the weight coefficients of the objective functions F 1 , F 2 , and F 3 , respectively, and F is the value of the objective function after normalization. Constraint (13) of the UBCLI model forces each OD to be transported through one path. Constraint (14) restricts the development plan selected for the bus corridor, so that each corridor can select only one development plan. Constraint (15) is the service capacity constraint of a path, which depends on whether the bus corridor is established. Constraint (16) is the timeliness constraint of the passenger travel path, which means that the total running time of the opened route, including the transit time and stopping time, cannot exceed the longest tolerable time for passenger travel. Constraint (17) requires that the average plot ratio of the block where the bus corridor is located be within the plot ratio limit set in urban planning. Constraint (18) links the decision variables: if road arc a is selected as a bus corridor, there must be a path including road arc a through which the passenger flow passes. Constraints (19)-(21) define the domains of the decision variables.
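The normalization step can be illustrated with a minimal sketch. Equations (11) and (12) are not reproduced here, so the weighted-sum form below is an assumption: the cost term F 1 enters with a positive sign and the two utility terms F 2 and F 3 with negative signs, so that minimizing F lowers cost and raises utility. The function name `normalized_objective` and its argument names are hypothetical; the default weights of 0.33 are the values reported in the case study.

```python
def normalized_objective(f1, f2, f3, f1_max, f2_max, f3_max,
                         weights=(0.33, 0.33, 0.33)):
    """Scalarize one cost term and two utility terms into a single value.

    Assumed form (not taken from the paper's equations):
        F = w1 * F1/F1max - w2 * F2/F2max - w3 * F3/F3max
    Each term is divided by its precomputed maximum so that all three
    terms are dimensionless before weighting.
    """
    w1, w2, w3 = weights
    return w1 * f1 / f1_max - w2 * f2 / f2_max - w3 * f3 / f3_max


# Two hypothetical corridor plans with equal utilities but different costs.
cheap = normalized_objective(50, 20, 10, 100, 40, 20)
costly = normalized_objective(80, 20, 10, 100, 40, 20)
print(cheap < costly)  # the cheaper plan scores lower, so it is preferred
```

Dividing by the precomputed maxima keeps each term within a comparable range, which is why the maxima must be known before the single-objective problem is solved.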

Path Set Construction Method
In this section, a candidate set of bus corridors is defined by using the K-shortest path algorithm. The steps of the algorithm are as follows: First, the shortest path is calculated. Compared with other shortest path search algorithms, the Dijkstra algorithm has lower time complexity and is better suited to large-scale network graphs. Therefore, this algorithm is suitable for the service network studied here. The Dijkstra algorithm is used to search for the service path of the transportation demand. If the shortest path time obtained does not meet the transportation time-limit requirements of the passenger flow OD, the alternative path of the OD is set as a super path (i.e., a virtual path is set to avoid lowering efficiency when the algorithm has no solution).
Second, all the nodes of the shortest path are traversed in reverse order, from the end node to the starting node, and each section between adjacent nodes is marked as taboo (forbidden), in turn. After each taboo, the Dijkstra algorithm is used to find the current shortest path, which is added to the candidate set Ω as a candidate shortest path. Once the traversal is completed, the shortest path in the candidate set Ω is determined. If its travel time meets the transportation time limit of the passenger flow OD, it is taken as the second-shortest path; otherwise, the number of feasible paths cannot reach K.
Third, the candidate set Ω is cleared. The second-shortest path is traversed in the same way as in the second step, and the current shortest path obtained after each taboo is added to the candidate set Ω as a candidate shortest path. Once the traversal is completed, the shortest path in Ω that meets the transportation time limit of the passenger flow OD is selected as the third-shortest path; otherwise, the number of feasible paths cannot reach K, and the current K-shortest path search ends. The third step is repeated until all K shortest paths of the passenger flow OD are found.
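The three steps can be sketched in Python as follows. This is a simplified reading of the procedure, not the C# implementation used in the study: it forbids one section at a time on the most recently found path, and it assumes the network is a dictionary of arc travel times. The function names, the `banned` parameter, and the example network are all hypothetical. Because only single sections are forbidden, the sketch is a heuristic and may miss paths that a full deviation-based method such as Yen's algorithm would find.

```python
import heapq


def dijkstra(graph, source, target, banned=frozenset()):
    """Return (time, path) of the fastest path, or None if none exists.

    graph: {node: {neighbor: travel_time}}; banned: set of (u, v) arcs
    that are treated as taboo for this search.
    """
    heap = [(0.0, source, [source])]
    settled = set()
    while heap:
        time, node, path = heapq.heappop(heap)
        if node == target:
            return time, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in settled and (node, nxt) not in banned:
                heapq.heappush(heap, (time + weight, nxt, path + [nxt]))
    return None


def k_shortest_paths(graph, source, target, k, time_limit):
    """Collect up to k paths whose travel time is within time_limit."""
    first = dijkstra(graph, source, target)
    if first is None or first[0] > time_limit:
        return []  # the caller may fall back to a virtual "super path"
    found = [first]
    while len(found) < k:
        last_path = found[-1][1]
        candidates = {}  # candidate set, rebuilt (cleared) in every round
        sections = list(zip(last_path, last_path[1:]))
        for section in reversed(sections):  # from end node to start node
            result = dijkstra(graph, source, target, banned={section})
            if result is not None:
                candidates[tuple(result[1])] = result[0]
        known = {tuple(path) for _, path in found}
        feasible = [(t, list(p)) for p, t in candidates.items()
                    if p not in known and t <= time_limit]
        if not feasible:
            break  # fewer than k feasible paths exist
        found.append(min(feasible))
    return found


network = {"A": {"B": 1, "C": 2}, "B": {"D": 1}, "C": {"D": 1}, "D": {}}
for time, path in k_shortest_paths(network, "A", "D", k=3, time_limit=10):
    print(time, path)
```

In this small network only two distinct paths exist, so the search stops after the second path even though K = 3 was requested, which corresponds to the case where the number of feasible paths cannot reach K.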

Model Solution Method
The UBCLI model established in this section is a 0-1 linear programming optimization model, which is a non-deterministic polynomial-time hard (NP-hard) problem. According to existing algorithm theory, it is difficult to find an exact algorithm that obtains the optimal solution of an NP-hard problem in polynomial time. However, because the UBCLI model contains O(n²) variables and O(n⁴) constraints, its scale grows only polynomially with the passenger flow demand, OD, network sections, paths, and nodes. Preliminary calculation and test results show that many current mainstream commercial optimization software packages (e.g., IBM ILOG Cplex, Gurobi, and GAMS) can solve large-scale instances of this model to optimality, in a short time, on ordinary computers. In this study, we used the IBM ILOG Cplex software, which has built-in branch and bound algorithms, simplex methods, and other operations research optimization solvers that can give the optimal solution to the linear optimization model and quickly solve the mathematical optimization problem.
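To make the 0-1 structure concrete, the following toy sketch enumerates every assignment of one candidate path per OD pair and keeps the cheapest assignment that respects the arc capacities. It is an illustration only and does not use IBM ILOG Cplex or reproduce the full UBCLI model: the objective is reduced to a flow-weighted path cost, and the data layout, function name, and example values are hypothetical. Exhaustive enumeration is practical only for very small instances, which is why a branch and bound solver is used in the study.

```python
from itertools import product


def solve_toy_selection(candidates, demand, capacity):
    """Pick exactly one candidate path per OD pair at minimum total cost.

    candidates: {od: [(unit_cost, [arc, ...]), ...]}
    demand:     {od: passenger flow}
    capacity:   {arc: maximum passenger flow}
    Returns (best_cost, {od: index of chosen path}) or (None, None).
    """
    ods = sorted(candidates)
    best_cost, best_choice = None, None
    for choice in product(*(range(len(candidates[od])) for od in ods)):
        load = {}
        total = 0.0
        for od, idx in zip(ods, choice):
            unit_cost, arcs = candidates[od][idx]
            total += unit_cost * demand[od]
            for arc in arcs:
                load[arc] = load.get(arc, 0) + demand[od]
        # Arc capacity check: skip assignments that overload any arc.
        if any(load[arc] > capacity[arc] for arc in load):
            continue
        if best_cost is None or total < best_cost:
            best_cost, best_choice = total, dict(zip(ods, choice))
    return best_cost, best_choice


candidates = {
    "od1": [(2, ["a"]), (3, ["b"])],
    "od2": [(2, ["a"]), (4, ["c"])],
}
demand = {"od1": 10, "od2": 10}
capacity = {"a": 15, "b": 20, "c": 20}
print(solve_toy_selection(candidates, demand, capacity))
```

In the example, both OD pairs prefer arc "a", but its capacity admits only one of them, so the search diverts the pair whose alternative path is cheaper.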

Case Study
Beijing is a typical city with a high population density and large squares. By the end of 2019, the population had reached 21.53 million, and the population density in the central parts had reached 8135 persons/km². Beijing has a large migrant population, and the flow of people is relatively large. The population tends to move out of the city center, leading to high commuter traffic.
Although Beijing has the largest public transportation system in the world, it still has the problem of long commuting times, owing to insufficient service levels and transportation networks. In particular, the subway capacity in Beijing cannot meet the daily travel requirements of residents. Thus, to improve the travel efficiency of residents, it is necessary to further improve the rapid transit network and supplement public transportation capacity. However, because of the high population density in the central urban area of Beijing, the old urban area is difficult to renovate. To operate BRT in the central urban area, it is necessary to develop measures suiting local conditions and maximize the level of public transport services through optimum utilization of road resources and by reducing costs.

Key Node Analyses
In these analyses, the HDBSCAN algorithm was used to cluster commercial housing in the 12 districts of Beijing, and 68 important nodes in the city were obtained (see Figure 4). Figure 5 shows the network topology. Demand Points 0-67 are the key nodes selected, and Nodes 68-183 are urban road intersections; the traffic requirements of the urban road intersections are not considered.

The POI data were collected from Gaode maps (https://ditu.amap.com/). The population density was obtained from the 2010 census data for Beijing. The housing prices were collected from a house-selling website (https://bj.lianjia.com/ershoufang/). The human activity frequency data were collected through https://www.sina.com.cn/.
We measured the value of each key node within a 1000 m buffer, as the average station spacing is approximately 1000 m. Figure 6 shows the distribution of the key node values. It illustrates that nodes with higher values are mostly concentrated in the city center.

Result of Bus Corridor Distribution
The distribution of the city bus corridor is solved based on the passenger flow distribution during the morning and evening peak hours. We studied the OD passenger flow during peak hours on November 30, 2015. The peak period is generally the period with the largest bus passenger flow of the day. To verify the influence of the morning and evening peak-hour passenger flow directions on the bus corridor, we analyzed the morning peak hours (7:30-8:30) and evening peak hours (17:30-18:30) passenger flows. Figures 7 and 8 show the distribution of the OD passenger flow in the morning and evening peak hours, respectively. The clockwise direction of the arc is the direction of passenger flow.

Table 3 presents the bus corridor development plan. In particular, three corridor development plans are presented; they are mainly used when the corridor transportation capacity cannot be met.

Plan 1: Give priority to public transit rights at designated times, through transportation management.
Plan 2 (60; 6000): The flow in the corridor is large, and a lane can be drawn on the urban road as a dedicated bus lane.
Plan 3 (120; 15,000): Corridor flow is large, and a new bus-dedicated road section needs to be constructed.
In this study, the arc capacity ( ω a ) is the maximum number of bus passengers that can pass in a unit time, in the ideal state, and is estimated as the road design traffic minus the car traffic. Based on the "Urban Road Engineering Design Code" (CJJ37−2012) [3] for urban road classification standards and passing capacity, the road grade is used to determine its capacity. According to actual operating experience, the ratio of the number of cars and buses in urban roads is approximately 5:1.
The conversion factor of cars and buses is 2, and the conversion factor of the bus capacity of each road arc is 0.15.
Based on operating experience, ω 1 , ω 2 , and ω 3 are all set to 0.33. For any open line, the unit distance cost of bus operation is set to 2.5 yuan/km, the unit distance cost of setting the bus corridor is set to 500,000 yuan, the stop time is set to 2 min, the plot ratio is limited to 2-10, and the passenger travel tolerance time is set to 1.5 times the car travel time, where the car travel time is calculated as the travel distance of the shortest path divided by the speed of the car. The speed of the vehicles is set to 25 km/h, and the passenger capacity of a vehicle is set to 100 people.
The K-value in the K-shortest path algorithm is obtained by testing, and K = 3 is used in this case. The K-shortest path algorithm designed in this section was implemented in C# and run on a local machine with a 1.8 GHz CPU and 16 GB of RAM; the solution is obtained within 15 min. Table 4 presents some K-shortest path algorithm solutions. The results obtained with the K-shortest path algorithm are used as the set of candidate paths and input into the UBCLI model. The UBCLI model was implemented in C# with the IBM ILOG Cplex software on a Windows 10 PC with a 1.8 GHz CPU and 16 GB of RAM. The optimality gap of IBM ILOG Cplex is set to 0.1%, and the solution is obtained within 12 min.
Tables 5 and 6 present the bus corridor development plan for the morning and evening peak hours, respectively. It can be seen that most bus corridors are in Development Plan 1, and a small number of bus corridors are in Development Plans 2 and 3. Most of the bus corridors selected for Development Plans 2 and 3 are located at the edge of the city center, along roads for entering and leaving the city center. Most of the bus corridors selected for Development Plan 1 are express roads from the city center to the suburban areas, and some are in the city center. Figures 9 and 10 show the location of the bus corridors in Beijing during the morning and evening peak hours, respectively. From Figures 7 and 8, it can be seen that the passenger flow in the evening peak hours is significantly higher than that in the morning peak hours. The results show that the number of corridors in the evening peak hours is higher than that in the morning peak hours. Most of the bus corridors in the morning and evening peak hours are directed toward and away from the city center in radial directions, respectively.
This distribution is caused by the increase in the population of Beijing, which has led to the increase in the number of commuter trips in the city center and suburbs in the morning and evening peak hours.
We define corridor exploitation intensity to describe the development intensity of the corridor. The development intensities of Plan 1, Plan 2, and Plan 3 increase in order. If both directions of a certain road are selected as public transit corridors, then the development intensities of the two plans are superimposed.
It can also be seen that there is high passenger flow and dense population and activity distribution in the urban center ring road during the morning and evening peak hours. Therefore, some roads should be built with dedicated roads or large-volume public transportation modes (e.g., subway) to reduce the flow of passengers. Figures 11 and 12 show the flow distribution during the morning and evening peak hours, respectively. The bus passenger flow in the morning and evening peak hours mainly presents a radial distribution. A high passenger flow is observed in the morning peak hours from the outside of the city toward the city center; the opposite flow direction is observed in the evening peak hours.
For the corridors assigned to Development Plans 1, 2, and 3, the average one-way cross-section passenger flow of the road is 6671, 10,077, and 15,551 people/h, respectively. This means that when the passenger flow reaches 6671 people/h, the road should give priority to public transit rights during peak hours through transportation management. When the passenger flow reaches 10,077 people/h, a bus lane should be set up on the road. When the passenger flow reaches 15,551 people/h, a new bus road with an independent right of way should be built.
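The rule of thumb above can be written as a small lookup, sketched here in Python. The three flow values are the averages reported in this case study; treating them as hard thresholds is a simplification, and the function and constant names are hypothetical.

```python
# Average one-way cross-section flows (people/h) reported for each plan,
# listed from the most to the least intensive development plan.
PLAN_THRESHOLDS = [(15551, 3), (10077, 2), (6671, 1)]


def suggested_plan(flow_per_hour):
    """Return the development plan suggested for a one-way passenger flow.

    Returns None when the flow is below the lowest reported value, i.e.
    when no corridor treatment is indicated by this rule of thumb.
    """
    for threshold, plan in PLAN_THRESHOLDS:
        if flow_per_hour >= threshold:
            return plan
    return None


for flow in (5000, 7000, 12000, 16000):
    print(flow, suggested_plan(flow))
```

Checking the thresholds from highest to lowest ensures that a road with a very high flow is mapped to the most intensive plan rather than to the first plan whose threshold it exceeds.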
The HDBSCAN clustering algorithm for the case of Beijing indicates that, in addition to identifying areas with high population densities in the city center, some areas with higher densities in the suburbs of the city are also identified. Compared to existing research based on experience or long-term city planning to determine the requirements of public transportation service areas, this method adopts a more scientific approach.

Conclusions
We developed and optimized an identification method for urban bus corridor location and transportation facilities development. Bus corridors promote social and economic activities and induce diversified land use along their path. The objective of the identification method is to minimize the cost of bus corridor identification and increase the diversification index and agglomeration effect along the path. We developed a mathematical model, while considering the constraints of the corridor development plan, intensity of land development along the path, and maximum travel time that passengers can tolerate. A K-shortest path algorithm was designed to find the candidate path set for the urban bus corridors, and the optimization model was solved exactly by using the IBM ILOG Cplex software. The following are the three major conclusions of this study:
(1) Based on the optimized model and actual bus operation data, we obtained the bus corridor locations considering morning and evening peak hours in Beijing and suitable corridor traffic management modes for different road conditions. The location of the bus corridors and setting of bus lanes in the corridors are closely related to the passenger flow.

In the Beijing urban area, the passenger flow into the city is higher in the morning peak hours, and the flow out of the city is higher in the evening peak hours. The bus corridors are mainly distributed along the directions of entering and leaving the city (such as corridors from the east to the west in the morning peak hours, from the northeast to the middle of the city, from the northwest to the southeast, and from the west to the east in the evening peak hours). Specifically, in areas with higher key node values, the demand for bus corridors is higher.
(2) In terms of passenger flow, the appropriate development intensity of an urban corridor depends on the flow it carries. The one-way passenger flow on most of the roads identified as bus corridors is greater than 6671 people/h. When the passenger flow reaches 15,551 people/h, a new bus road with an independent right of way should be built.
(3) The construction of bus corridors is an effective means of alleviating urban congestion and easing urban passenger flow. In particular, for large cities with high population densities, identifying urban bus corridors can improve long-distance transportation efficiency. By combining multidimensional data, the method can identify areas with high population, main functional areas, and areas with high potential travel demand in the city. It can also take the actual development of urban roads into account to identify urban bus corridors and determine the corridors where population and functions are concentrated. Furthermore, according to the actual road conditions, it can provide development plans for transportation facilities. It can thereby provide scientific guidance for transportation and urban planning departments and facilitate collaboration between these departments. The method provides a basis not only for the initial stage of public transportation planning but also for the adjustment of service areas and routes during operation. It therefore has practical significance for developing urban public transportation corridors and setting bus lanes.
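The flow thresholds reported in conclusion (2) can be read as a simple decision rule. The sketch below is only an illustration of that reading, not part of the optimization model: the function name `classify_flow`, the three category labels, and the sample flows are assumptions introduced here. The paper states only that most identified corridor roads carry more than 6671 people/h and that a new bus road with an independent right of way should be built once flow reaches 15,551 people/h.

```python
# Thresholds reported in the conclusions (one-way passenger flow, people/h).
CORRIDOR_FLOW = 6671          # most identified corridor roads exceed this flow
DEDICATED_ROAD_FLOW = 15551   # at or above this, a new bus road is recommended


def classify_flow(flow_per_hour: float) -> str:
    """Map a one-way passenger flow to an illustrative development level.

    The labels are hypothetical; the study itself selects corridors with an
    optimization model rather than with fixed cut-offs.
    """
    if flow_per_hour < 0:
        raise ValueError("flow must be non-negative")
    if flow_per_hour >= DEDICATED_ROAD_FLOW:
        return "new bus road with independent right of way"
    if flow_per_hour > CORRIDOR_FLOW:
        return "bus corridor candidate"
    return "below typical corridor flow"


if __name__ == "__main__":
    # Hypothetical flows for demonstration only, not data from the study.
    for flow in (5000, 9000, 16000):
        print(flow, "->", classify_flow(flow))
```

Because the lower threshold describes "most" corridor roads rather than all of them, a road below 6671 people/h may still be selected by the model; the rule above is a screening aid at best.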
Because the data are difficult to obtain, this study did not consider the impact of other transportation modes on bus corridors. Future research should consider the impact of other transportation modes, such as cars, metro, and railways, when developing bus corridors. To make the corridor settings more reasonable, the influence of node scale on the locations of the bus corridors should also be considered, and the influence of land use and population distribution at different spatial scales around the corridors should be further analyzed. Furthermore, a sensitivity analysis of the weights of the different factors should be carried out in the future; for example, the relationship between the cost and the attraction of the corridor needs further discussion.