A Continuous Taxi Pickup Path Recommendation under The Carbon Neutrality Context

Abstract: In the context of the carbon neutrality target, carbon reduction in the daily operation of the transportation system is more important than that in productive activities. Few travel services can quantify low-carbon travel, and effective low-carbon travel tools to guide transportation behavior are lacking. On-demand access to taxi services can effectively reduce the additional carbon emissions caused by cruising, which in turn increases efficiency in urban mobility with a reduced taxi fleet scale. Individual taxis lack a macroscopic horizon when choosing passenger pickup paths: a path selected from personal operational experience or real-time location is limited to local optimization. In this work, we propose a macro-path recommendation method to assist taxi pickup path selection and accelerate the transformation of the taxi system towards low-carbon sharing. First, an adaptive learning spatiotemporal neural network is used to predict the coarse-grained distribution of potential trips. Next, the trajectory sharing graph is constructed based on the potential trip distribution to reallocate the taxi orders for continuous pickup path optimization. As a result, the continuous pickup path balances travel demands and taxi supply, improving the economic and environmental benefits of taxi operation and contributing to the goal of carbon neutrality. We conducted experiments on the Chengdu city ride-hailing dataset. Compared with the current status of taxi operations, the solution shows improvements in both the scale of taxi services and order gain.


Introduction
Achieving the carbon reduction target of "carbon neutrality" by 2060 poses a critical challenge to the development of various industries in China, with urban transportation emissions showing the fastest growth rate and continuous increase. The supply of urban mobility services is essential for achieving carbon neutrality in transportation, and it is necessary to design a low-carbon mobility-on-demand (MoD) service to minimize traffic carbon emissions and achieve sustainable development. Today, the on-demand access to taxi services has evolved into many effective and more convenient forms being promoted, such as Uber, Lyft, Ola, Didi, and many other ride-hailing MoD service providers, to meet the challenges of travel demands in large cities [1]. The taxi service system provides an efficient travel mode for urban transportation; however, the carbon emissions are increasing, and the operating costs are rising with the large number of taxis in supply. In terms of existing taxi operation, most taxis are cruising aimlessly on the streets looking for passengers. The vacant taxis increase the resource consumption during cruising due to the lack of information on time-varying travel demands, and the taxi dispatch system does not employ effective information to measure the economic and environmental costs of total travels.
The development goal of carbon neutrality in transportation also imposes an urgent requirement for a low carbon operation of the taxi system. For taxi drivers, cruising to find passengers improves their own orders. As for the taxi system, reducing the fleet scale of taxi operations and improving the circulation efficiency are the core aspects of low carbon transportation. However, taxi services have been in a dilemma. On the one hand, it is difficult for passengers to obtain taxi service during peak hours. On the other hand, the taxi cruising time for passenger searching has increased, resulting in traffic congestion [2]. The taxi drivers usually rely on their own operating or cruising experiences to find passengers randomly [3]. Without macro-level guidance to instruct taxis on their cruising behavior, the taxi system cannot be fully utilized [4]. The sharing mobility technologies create a new way for taxis to travel in a low-carbon mode, and these solutions can make it possible to match the taxi supply with user travel demands. While sharing mobility has huge potential to improve the global efficiency of transportation, it is insufficient to fully adopt it to hold all users. It should be integrated with other modes of transportation that have the explicit goal of optimizing carbon emission reduction [5]. Many studies related to MoD and fleet management have considered travelling without sharing rides [6].
Considering the unbalanced travel demands and the low utilization of taxi services, how to dynamically obtain the continuous passenger-carrying path is a valuable research problem. The current research on taxi routing focuses on the optimal path of a single trip at the microscopic level [7], ignoring the differences in economic and carbon emission attributes for continuous taxi paths. Some studies have analyzed taxi behavior to optimize individual operating strategies, but they rely on local optimization methods to increase pickup rates, which may lead to an imbalance in taxi supply and demand [4]. The macroscopic path strategy is also important for taxi operation. For example, by guiding the taxi pickup path with the low-carbon strategy and treating taxi resources as an MoD service, it will become an effective solution to reduce urban traffic emissions and meet travel demands.
In addition, analyzing the spatial and temporal distribution of historical trips can help taxis optimize their travel schedules. The travel flow on the road network changes with the tidal effect, and existing path recommendation models, which are based on static pickup locations, cannot reflect its dynamic features. Continuous path recommendation can perceive the time-varying pattern of travel demands and make decisions according to the potential distribution, which helps the pre-scheduling of the taxi service. In terms of the relation between taxi supply and demand, existing path recommendation models try to maximize the orders of each individual taxi, which creates competition among taxis. This competition for passengers often leaves some taxis operating inefficiently and underutilized. Therefore, these models do not alleviate the imbalance between the supply of travel services and passenger demands.
To solve these problems mentioned above, this paper combines the taxi travel path with the potential trip reallocation and builds a dynamic path recommendation model for continuous passenger pickup. This taxi service in the low-carbon context can minimize the total operating costs and emissions with trip sharing, which makes full use of the small fleet scale in the network to meet all travel demands, as shown in Figure 1.
Considering the periodicity of taxi trips and the complex features of the travel network, the recommendation model is constructed by the following steps. The real-time update and analysis of travel status using deep learning can predict the future status and support the decision making for seeking passengers. Based on the potential trip distribution obtained from the prediction, we can find a continuous pickup path to provide on-demand service through proper order reassignment. The diversity of passenger preferences is added as a heuristic factor to recommend different continuous pickup paths. This model can increase the number of taxi orders received for operating taxis and reduce the overall fleet of running vehicles. It is also hybrid, with both taxi-ride competition and cooperation being used to build an order-sharing assignment model. Based on the scale of on-demand taxis, we achieve the match between passenger travel demands and taxi service utilization.
In summary, to fill the gap left by previous taxi path planning methods that lack a macroscopic horizon, the study of continuous pickup path recommendation can not only meet the existing travel demands but also contribute to the full utilization of taxi services. The continuous pickup path optimizes the ease of taking orders and the fleet scale of online operation in the spatial and temporal dimensions, which is of great significance for reducing carbon emissions in the transportation field. The main contributions of this paper are as follows: (1) To overcome the sparsity of trip data, we construct a two-layer structure to reduce the relational space with the coarse-grained graph, which can better extract the implicit structure of the spatiotemporal association of trips. (2) We design a self-learning semantic relation to capture the dynamic spatial features between trips and combine it with a spatiotemporal neural network to mine the temporal patterns of travel behaviors. (3) Without changing the existing travel patterns, we propose a macro-level path recommendation method for continuous passenger pickup that seeks the match between passenger demands and taxi service utilization.
The beginning of this paper lists the current research status related to taxi demand prediction and path recommendation. The second part will briefly describe the basics of the taxi order sharing network. The third section details the continuous pickup path recommendation method. In the fourth part, a series of experiments and result analyses on the traffic dataset of Chengdu are given to verify the effectiveness of the proposed method. In the last part of the paper, we conclude and look forward to later research.

Related Work
Informed driving is emerging as a key feature to improve taxi sustainability, combining deep learning approaches to estimate future patterns from historical data [8]. For the prediction of taxi pickup or drop-off locations, a common approach is to divide the road network into grids and count the traffic in them, which turns the prediction problem into a multi-class classification. To alleviate origin-destination (OD) data sparsity, the grid partitioning of the road network is used as a general method [9] to explore the correlation of ODs. It combines adjacent geographic and semantic neighborhoods, with the geographic neighborhoods measuring the intrinsic closeness of grids and the semantic neighborhoods modeling the intensity of traffic between grids. Most taxi OD prediction methods have only considered the demand at the origin but have ignored the supply of taxis at the arrival destination. The prediction of interactive demands usually requires the fusion of more semantic information to improve the prediction ability, such as the Contextualized Spatial-Temporal Network (CSTN) [10] and the Spatiotemporal Residual Neural Network (ST-ResNet) [11]. The CSTN is used to predict the future demand for travel interactions between regional pairs. Three independent components were constructed to extract correlations, which well integrates the local spatial context, temporal evolution context, and global correlation context into a unified framework. The ST-ResNet predicts user inflows and outflows for each city region based on temporal closeness, period, and trend features. By aggregating the outputs of multiple feature networks and assigning varying weights to different branches, the ability to predict interactive demands is further improved. For the irregularly arranged regions, community discovery is used to construct spatiotemporal networks to explore the semantic relationship between regions and to predict travel demands at the multi-regional level [12]. 
However, multi-class prediction models based on road network partitioning cannot learn new locations. A feasible approach is to use geographic information from location-based social networks (LBSNs) to model taxi travel behavior and encode the semantics of the visited locations [13]. The coordinates of the destination are predicted by functions that directly approximate the latitude and longitude, which improves the precision of the next drop-off location. To reduce the amount of feature engineering and external data required to build deep learning models, some studies adopt multiple networks or reinforcement learning to enhance the predictive performance. For example, a CNN can be used to extract spatial features and an LSTM to extract temporal features, which are described by the embedding layer [14]. This architecture can be easily extended to other traffic prediction problems such as road traffic and flow prediction. The Multi-Agent Reinforcement Learning (MARL) based taxi dispatching model can balance the supply and demand of taxis in different regions [4]. Unlike taxi scheduling methods based on real-time location, it first predicts the demands in different areas for the next time period and then dispatches taxis in advance.
Due to the time-varying travel demands, the taxi service may generate a large number of vacant vehicles when operating in different regions. The vacant taxis operating on the road network not only waste travel resources but also burden the city with emission and traffic. To improve the utilization of vacant vehicles, the location of taxi parking can be accurately detected from the trajectories. These parking locations signify where the taxis stand waiting for passengers, and the probabilistic model can be constructed based on trajectories to describe the taxi dynamic behaviors, thus providing some real-time locations to pick up passengers quickly [15]. The other method is to predict crowd flows by discovering patterns of passenger pick-up quantity (PUQ) in urban hotspots, and the Auto-Regressive Integrated Moving Average (ARIMA) method can be used to predict the spatial and temporal variation of passengers in hotspots to help drivers find their next passenger [16]. As travel demands change dynamically over time, static solutions cannot adapt to the evolving scenario. The dynamic future demand-aware vehicle scheduling system [1] can dynamically search for vacant vehicle resources by considering both travel demand and traffic status. In addition, the MoD system's historical travel data can be used to optimize vehicle distribution and fleet size. The MoD works by guiding the trips of the vehicles to meet all travel demands, allowing the one vehicle to serve multiple passengers [17]. It provides guidance on how many vehicle resources should be allocated to meet the demands of a given region. In the automated MoD system [18], passengers can share a group of self-driving vehicles. It also regularly adjusts the match between the supply of vehicles and travel demands, considering the sharing of travel resources from a system perspective.
In terms of path recommendation for vacant taxi cruising, there are no valid criteria for optimal routing evaluation. Some dynamic path planning methods that use historical data are used to improve the performance of the self-driving taxi network. The path planning assigns orders to taxis in a pre-assigned manner, minimizing the expected cost of current and future travel demands. It consists of three main steps, which are pruning travel trips, assigning vehicle order, and rebalancing vacant vehicles, so that the probability distribution of future demand is decoupled into the vehicle routing and order assignment [19]. In [20], a more general high-capacity ride-sharing model can dynamically generate optimal routes based on online demand and vehicle locations. It can effectively return a trip request that solves both the problem of assigning vehicles to passengers and rebalancing the fleet to meet travel demands. In the study of using taxi trajectories to reduce the cruising distance, an adaptive shortest expected cruising route (ASER) was proposed. The ASER builds a probabilistic network using Kalman filtering to predict the pickup probability and capacity of each location and then recommends the travel path to the taxi driver [21]. It also considers the load balance between passengers and taxis. Further, the multi-criteria path recommendation models that integrate real-time spatiotemporal prediction and traffic network status focus on optimizing the next passenger-carrying path. The travel demand prediction can estimate the probability of passenger pickup and drop-off, capturing the flows of potential passengers. The heuristic function J* algorithm [22] was proposed based on the distribution of the prediction module, combining the pickup probability, drop-off distribution, road network, distance, and time factors. 
To reduce the aimless cruising of taxis, some studies have fully analyzed the relation between supply and demand, such as the attraction between taxis and passengers and the competition between taxis. In [3], the traffic force for cruising taxis is calculated by collecting the density information of passengers and taxis from the trajectories. According to the corresponding traffic force, the taxis are assigned to the optimal road segments to pick up desired passengers. Dynamic taxi path recommendation aims to recommend cruising routes to vacant vehicles. Most of them focus on building probabilistic models, such as the Mixed Path Size Logit (MPSL) model [23], which analyzes taxi behaviors through spatiotemporal features with passenger generation rate, path travel time, cumulative intersection delay, path distance, and path size. With the development of deep learning recommendation models, dynamic taxi path recommendation is studied as a sequential decision-making problem. By extracting multiple spatiotemporal features related to the easiness degree of vacant taxis picking up passengers, an adaptive deep learning method was designed to achieve effective path recommendation [24].
As mentioned above, the travel status prediction in conjunction with the path recommendation system based on the trip sharing pool has the opportunity to solve the existing inefficiencies in taxi mobility, thus reducing the resource waste and carbon emission of transportation.

Taxi Trajectory Sharing Network
The taxi trajectory sharing network connects OD trajectories in a certain time period, thus enabling the same vehicle to pick up as many passengers as possible for the OD travel trips. The travel trajectories are modeled in a static way in time period $T$ by introducing the maximum number of splices $k$ that can be shared and the maximum waiting delay $\delta$ for connecting trips. Let $Tra(T) = \{(o_m, d_m, t^{o}_{m}, t^{d}_{m})\}$ denote the set of all trips in $T$, where $m$ is the trip index. $o_m$ and $d_m$ are the origin and destination of trip $m$, and $t^{o}_{m}, t^{d}_{m} \in T$ are its starting and ending times, respectively. In $Tra(T)$, it is assumed that a smaller number of paths can connect all OD trajectories while satisfying the following spatiotemporal constraints. The OD trajectories are spliced in temporal sequence; trip times cannot overlap, but trips can be linked with a time delay of at most $\delta$. Each $o_m$ precedes the corresponding $d_m$, and the maximum number of connections is $k$. Thus, the trips in $Tra(T)$ can be shared by any vehicle.
The interval $\delta$ connecting two trips has a direct impact on the generated topology graph of the sharing network [25]. It is assumed that two consecutive trips $Tra(i)$ and $Tra(j)$ are served by the same vehicle, with $i, j \in [1, m]$ and $i \neq j$, and that the time required to connect them is $\delta_{ij} = t^{o}_{j} - t^{d}_{i}$. If the threshold $\delta$ is very short, most of the trips cannot be linked. Conversely, if it is long, taxi service operation becomes inefficient, with considerable emissions and idling time spent waiting for trip links. For $k = 2$, two trips can be shared at a given $\delta$ by placing a link between $Tra(i)$ and $Tra(j)$. For $k > 2$, the sharing network becomes a hypergraph in which at most $k$ trajectories can be linked simultaneously. The trajectory sharing network and its maximum match are defined as follows:

Definition 1. (Trajectory Sharing Network).
The historical travel trajectories of all taxis carrying passengers are mapped into the road network to form a directed acyclic graph $G = (V, E)$, where $V$ represents the pickup or drop-off locations of the trips and $E$ represents the connections. The sharing network assumes that there is no competition between taxis and that each individual taxi can serve as many orders as possible, and we consider $G$ as a trajectory sharing network for passenger pickup.

Definition 2. (Maximum Match for Sharing Network) [26].
Given a trajectory sharing network $G$, a matching in $G$ is a set of pairwise disjoint edges. The maximum matching contains as many edges as possible, which minimizes the number $M$ of paths needed to serve all orders with no conflicting times.
In the trajectory sharing network, the maximum match optimizes the fleet scale of valid taxi operations. According to the taxi order schedule, if the temporal constraint between the drop-off point of the current trip and the pickup point of the next trip is satisfied, the two trip trajectories can be spliced, as shown in Figure 2. Iterating until all trips are spliced allows the orders to be served with fewer vehicle assignments. In addition, the connecting time between trips must not exceed the upper limit $\delta$. The maximum match generates a path set that covers the entire $G$, ensuring that all orders are served while minimizing the fleet scale of vehicles in this solution. This is also the optimal solution to the minimum fleet problem with parameter $\delta$. In this paper, we use the $M$ matched paths as the recommended paths for taxis to carry passengers continuously.
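The splicing rule and the maximum match described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes each trip is reduced to a hypothetical `(start, end)` pair with positive duration, ignores the travel time between a drop-off point and the next pickup point, does not enforce the hop limit k, and uses a simple augmenting-path (Kuhn) bipartite matching to obtain a minimum path cover of the trip graph. The names `build_links` and `continuous_paths` are introduced here only for illustration.

```python
from typing import List, Tuple

# (start time t_o, end time t_d) in minutes; locations are omitted in this sketch
Trip = Tuple[float, float]


def build_links(trips: List[Trip], delta: float) -> List[List[int]]:
    """links[i] lists the trips j that one taxi could serve right after trip i."""
    links: List[List[int]] = [[] for _ in trips]
    for i, (_, end_i) in enumerate(trips):
        for j, (start_j, _) in enumerate(trips):
            # Trips must not overlap, and the connecting wait must not exceed delta.
            if i != j and 0 <= start_j - end_i <= delta:
                links[i].append(j)
    return links


def continuous_paths(trips: List[Trip], delta: float) -> List[List[int]]:
    """Cover all trips with as few chains as possible (minimum path cover)."""
    links = build_links(trips, delta)
    n = len(trips)
    next_of = [-1] * n  # next_of[i]: trip served right after trip i
    prev_of = [-1] * n  # prev_of[j]: trip served right before trip j

    def augment(i: int, seen: List[bool]) -> bool:
        # Try to give trip i a successor, re-routing earlier choices if needed.
        for j in links[i]:
            if seen[j]:
                continue
            seen[j] = True
            if prev_of[j] == -1 or augment(prev_of[j], seen):
                next_of[i], prev_of[j] = j, i
                return True
        return False

    for i in range(n):
        augment(i, [False] * n)

    # Every trip without a predecessor starts one continuous pickup path.
    paths: List[List[int]] = []
    for i in range(n):
        if prev_of[i] == -1:
            path, cur = [], i
            while cur != -1:
                path.append(cur)
                cur = next_of[cur]
            paths.append(path)
    return paths


if __name__ == "__main__":
    demo = [(0, 11), (1, 10), (12, 20), (13, 21)]
    for path in continuous_paths(demo, delta=2):
        print(path)
```

Each returned list is one chain of trip indices that a single vehicle could serve, so the number of lists plays the role of the fleet size for the chosen δ. A production system would likely use a faster matching routine and a link test that also accounts for the drive between consecutive trips.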

[Figure: (a) Travel trips; (b) Order-sharing network; (c) Optimal orders assignment]

Problem Definition for Continuous Taxi Pickup Path Recommendation
Note that continuous passenger pickup paths do not address the path planning problem of a single one-way trip. During the current taxi trip, the recommendation model advises the passenger location for the next order according to the arrival time and destination. The continuous taxi pickup path recommendation problem is mainly to help the taxi driver decide on the next passenger to carry, and it can effectively improve the validity of the trip and benefit the global operation efficiency. Considering the inconsistent start and end times of the trips, we coarsen the trip data into an equally spaced time series according to $\delta$ and mine the temporal pattern of trips. Let $U = \{u_1, u_2, \dots, u_P\}$ denote the set of unique taxi identifiers, where $P$ is the fleet scale of actual taxis. $E = \{e_1, e_2, \dots, e_L\}$ denotes the set of unique order trajectories, where each $e$ is the trajectory of a one-way OD trip and $L$ is the total number of orders. $C = \{c_1, c_2, \dots, c_M\}$ denotes the spliced trajectories in the sharing network, where $M$ is the maximum number of matching paths and $M \leq P \leq L$. Each $e$ is associated with the pickup point $o$ and the drop-off point $d$ of the trip and belongs to part of a spliced trajectory $c$.
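The coarsening step mentioned above can be pictured as bucketing orders into consecutive time slots of width δ. The sketch below is an assumption about how that might look, not the paper's procedure: the record layout `(origin, destination, start_time)`, the zone labels, and the `horizon` parameter are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

# (origin zone, destination zone, start time in minutes); hypothetical record layout
Order = Tuple[str, str, float]


def coarsen(orders: List[Order], delta: float, horizon: float) -> List[Counter]:
    """Count orders per OD pair in consecutive time slots of width delta."""
    n_slots = math.ceil(horizon / delta)
    series = [Counter() for _ in range(n_slots)]
    for origin, dest, start in orders:
        slot = int(start // delta)
        if 0 <= slot < n_slots:  # ignore orders outside the observed horizon
            series[slot][(origin, dest)] += 1
    return series


if __name__ == "__main__":
    demo = [("A", "B", 1), ("A", "B", 9), ("B", "C", 10), ("A", "B", 25)]
    for slot, counts in enumerate(coarsen(demo, delta=10, horizon=30)):
        print(slot, dict(counts))
```

Each element of the returned list is one snapshot of OD volumes, which is the kind of equally spaced series a temporal prediction model can consume.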

Definition 3. (Taxi Pickup Trajectory).
In the temporal snapshot $T$ of the sharing network, the pickup trajectory of taxi $i$ is defined as a five-tuple $(u_i, e_i, c_i, t^{o}_{i}, t^{d}_{i})$, where $u_i \in U$ denotes the taxi identifier, $e_i \in E$ denotes the current trip, $c_i \in C$ denotes the continuous pickup path to which it belongs, and $t^{o}_{i}, t^{d}_{i} \in T$ denote the start and end times of the trip. The tuple represents the continuous pickup trajectory obtained by splicing the current trip.
The continuous pickup path recommendation is based on the maximum matching obtained from the trajectory sharing network under the current trip e and temporal constraint (t o , t d ) to obtain the Top-n paths. We turned the continuous taxi pickup path recommendation problem into the generation of the trajectory sharing network and found the maximum match path of it. Specifically, it was divided into two stages.
Based on the OD trips of historical orders, it is possible to estimate the potential orders for the future. In the first stage, we built a spatiotemporal neural network to predict the temporal pattern of orders. The topology of the trajectory sharing network $G$ is represented by the pickup and drop-off points $V$, the set of trip trajectories $E$, and the adjacency matrix $A$. We used discrete time points $t \in T$ to record the orders, and $X_T$ denotes the travel volume. Each $x_t \in X_T$ denotes the status of an OD pair in the sharing network at time $t$. The prediction of taxi orders can be viewed as learning the mapping function $f$ based on the vector $X_T$ in $G$, as shown in Equation (1), where $\tau$ denotes the historical time window and $s$ denotes the prediction step.
The next stage was to construct a trajectory sharing network based on potential trips that satisfied the constraints $\delta$ and $k$ to obtain the maximum match as the recommendation path. A taxi's continuous pickup path $c(u_i)$ is defined as a trajectory of orders connected in time sequence, with the time intervals between consecutive orders within the threshold $\delta$, and $k$ is the number of hops of this continuous pickup path. To enhance the continuous passenger carrying capability of taxis, this paper was devoted to learning a path recommendation function Recommend{.} that maps the predicted travel trips to the sharing network for path computation. Given the order information $x_t \sim x_{t+s}$ of $u_i$ and the arrival time $t^{d}_{i}$ of the current trip $e_i$, the continuous pickup path recommendation returns a set of $n$ paths, as shown in Equation (2).
From this, $c_1(u_i)$ is the optimal continuous pickup path for taxi $u_i$ in the next order, $c_2(u_i)$ is the suboptimal solution, and so on. In fact, the potential trajectory sharing network is able to completely define the time-varying features of the taxi trips. Therefore, the continuous pickup path recommendation model extracts the historical features of the trips and calculates the travel paths based on them.
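For readers who want the two stages in symbols, the following is one plausible way to write the prediction mapping, the continuous pickup path, and the recommendation function. It is a hedged reading built only from the definitions in this section (history window τ, prediction step s, hop count k, threshold δ, Top-n output) and should not be taken as the exact form of Equations (1) and (2); the trip indices $i_1, \dots, i_k$ are introduced here purely for illustration.

```latex
\begin{align*}
  [x_{t-\tau+1}, \dots, x_t;\; G] &\xrightarrow{\;f\;} [x_{t+1}, \dots, x_{t+s}] \\
  c(u_i) &= e_{i_1} \to e_{i_2} \to \cdots \to e_{i_k},
    \qquad 0 \le t^{o}_{i_{j+1}} - t^{d}_{i_j} \le \delta,\quad j = 1, \dots, k-1 \\
  \mathrm{Recommend}\{x_t \sim x_{t+s},\; e_i,\; t^{d}_{i}\} &\to \{c_1(u_i), c_2(u_i), \dots, c_n(u_i)\}
\end{align*}
```

Read this way, the first line is the learning problem solved by the spatiotemporal neural network, the second line restates the splicing constraint of the sharing network, and the third line is the Top-n output handed to the driver.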

Continuous Taxi Pickup (CTPU) Path Recommendation Architecture
CTPURec is a discrete dynamic path recommendation model that builds a continuous path based on the prediction of future travel demands. As the path recommendation is constructed from a sharing network of potential orders, it first requires an accurate prediction of travel trips. We use the offline phase to train on the historical taxi trip data, and then the online phase to predict potential trips in real time and generate recommended paths. CTPURec defines and trains the taxi demand and sharing network modules separately, and the two modules interact to capture the complex relationships between taxi trajectories and passenger pickup paths. The architecture of CTPURec can be subdivided into the spatiotemporal neural network for travel trip predictions and the trajectory sharing network for collaborative path recommendations, as shown in Figure 3. The spatiotemporal neural network generates a snapshot graph of the potential travel orders, from which the ideal passenger pickup paths are then extracted in the trajectory sharing network. This hybrid structure builds two levels of decision support, with system-level decisions guided by the minimum fleet size that satisfies the travel demands and user-level decisions that guide the next taxi trip selection.

Travel Trip Predictions Based on Self-Learning Semantic Relation
Travel demands are the main drivers of taxi operation, and accurate prediction of trips helps to capture taxi movement patterns. The road network-based OD matrix is a large, sparse matrix that records the links between all nodes (intersections). Because the data are sparse and fine-grained, node-level OD prediction often fails to extract effective features, and the learned features are often unreliable. The order-level ODs describe the travel demands of passengers in a coarse-grained manner. By extracting these order pickup and drop-off points, the taxi's trajectory graph can be well characterized. The coarse-grained sharing network treats the ODs as nodes to construct a directed graph, reducing the node space for graph search. The aggregated network formed by removing the intermediate nodes of the trips is a coarse-grained representation of the road network.
The graph convolutional network is widely used for spatial feature extraction from temporal snapshots of traffic networks [27]. The Laplacian matrix $L = D - A$ can be used to represent the structure of the graph, with the normalized form $\tilde{L} = D^{-1/2} L D^{-1/2}$, where $A$ and $D$ denote the adjacency matrix and degree matrix. However, there are some limitations in using fixed adjacency relations to characterize dynamic graph features [28]. To better capture the semantic relation between orders, we extend the adjacency matrix $A$ of the static topology to a dynamic OD semantic correlation matrix $\tilde{A}$. The $\tilde{A}$ is composed of the OD semantic relations of nodes within a snapshot of time $T$. Combining the semantic relations of self-learning [29], we employ an adaptive correlation matrix and capture the interaction between node status and link relations by end-to-end supervised training. The adaptive correlation matrix can capture the hidden spatial dependencies in the data, as shown in Equation (3):

$$\tilde{A} = \mathrm{Softmax}\big(\mathrm{ReLU}(E_o E_d^{\top})\big) \quad (3)$$
The $\tilde{A}$ is obtained from two learnable parameters $E_o$ and $E_d$, where $E_o$ is the origin embedding and $E_d$ is the destination embedding. The ReLU activation function is used to eliminate weak connections, and the Softmax is then applied to obtain $\tilde{A}$. During the building process of $\tilde{A}$, the OD trajectory is considered as a node, and the interaction between them is symmetric and can be directly used for graph analysis. The state transition in the sharing network can be regarded as an aggregation function of the semantic node status, that is, the semantic features are the projections of the multidimensional features to first order, which is achieved by $\tilde{A}$. The first-order Chebyshev approximation to graph convolution [30] further simplifies the graph convolution operation. We added the semantic correlation matrix $\tilde{A}$ and its dependency degree matrix $\tilde{D}$ to define the GCN layer as follows:

$$H = \Gamma^{(GCN)}\big(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta\big) \quad (4)$$

where $\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ is the convolution filter for computing the semantic correlation, $X$ denotes the node status, $\Theta$ is the training parameter matrix, and $\Gamma^{(GCN)}$ is the ReLU activation function for first-order spectral convolution. Finally, the node status $X_T$ is transformed into an implicit graph encoding $H_T$ through the encoding $H$ of the sharing network. The CTPURec extracts the semantic features of the graph snapshot, and the output is then fed into the temporal module to capture the time-dependent features.
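The adaptive correlation matrix and the GCN layer described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' TensorFlow code: the function names are hypothetical, the Softmax is assumed to be applied row-wise, the degree matrix is assumed to hold the row sums of the correlation matrix, and the embeddings here are fixed numbers, whereas in the model they would be learned.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def adaptive_adjacency(e_o: Matrix, e_d: Matrix) -> Matrix:
    """Row-wise Softmax(ReLU(E_o E_d^T)) from origin and destination embeddings."""
    e_d_t = [list(col) for col in zip(*e_d)]
    rows = []
    for row in matmul(e_o, e_d_t):
        relu = [max(0.0, v) for v in row]      # drop weak (negative) links
        peak = max(relu)
        exps = [math.exp(v - peak) for v in relu]
        total = sum(exps)
        rows.append([v / total for v in exps])  # each row sums to 1
    return rows


def gcn_layer(a_tilde: Matrix, x: Matrix, theta: Matrix) -> Matrix:
    """ReLU(D^-1/2 A D^-1/2 X Theta), with D the row-sum degree matrix of A.

    Assumes every row sum is positive, which holds for a Softmax output.
    """
    n = len(a_tilde)
    deg = [sum(row) for row in a_tilde]
    norm = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    h = matmul(matmul(norm, x), theta)
    return [[max(0.0, v) for v in row] for row in h]
```

Because the Softmax makes every row of the correlation matrix sum to one, the symmetric degree normalization leaves it essentially unchanged in this sketch; it is kept so that the layer also works for an unnormalized adjacency matrix.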
In the potential travel trip prediction, the change of node status results from a combination of spatial propagation and temporal tidal effects. The baseline LSTM can predict future multi-step outputs based on historical sequences [31]. The prediction target is to fit the graph encoding $h_{t+1}$ based on the historical encodings $\{H_q \mid q = t-\tau, \dots, t\}$, where $\tau$ is the historical time window. The unit takes $H_q$ as the input, and the output layer, after processing by the internal memory unit, generates a value $h_{t+1}$ as the input of the next cell. Combined with the gate control units of LSTM, we propose a coarse-grained graph prediction method. The memory unit processes the vector $H_q$ through the forget gate, input gate, and output gate, respectively. As a result, $x_t$ is converted to $F_t$, $I_t$, and $O_t$ of the hidden layers via the gate update function in Equation (5), where $W^{(x)}$ denotes the weight matrix of the input layer to the gate, and $b^{(x)}$ denotes the corresponding bias. The $\Gamma^{(gate)}$ is the Sigmoid activation function. The latent encoding of the gate control units is computed using the input and the historical hidden state at time $t-1$. The input gate and forget gate are used to update the cell state $C_t$, and the output $h_t$ is controlled by the output gate and the cell activation state, as in Equation (6), where $\odot$ is the Hadamard product and $C_t$ is the cell state with the corresponding weight $W_c$ and bias $b_c$. In summary, based on the sharing network statuses at time $t$, the CTPURec spatiotemporal memory unit utilizes $A_{t-1}$, $C_{t-1}$, and $h_{t-1}$ as the a priori knowledge for prediction. The final prediction $y_t$ is obtained from the output $h_t$ of the spatiotemporal memory unit. This prediction result effectively integrates the semantic knowledge of historical status and dynamics, and it better fits the travel patterns of the real road network.
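The gate updates described above follow the usual LSTM pattern, which the sketch below illustrates for a single time step. It assumes the standard LSTM cell formulation, in which each gate acts on the concatenation of the previous hidden state and the current graph encoding; the paper's exact Equations (5) and (6) may differ in detail. The function names and the `params` dictionary layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Gate = Tuple[Matrix, Vector]  # (weight matrix, bias vector) of one gate


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def affine(gate: Gate, z: Vector) -> Vector:
    """Compute W z + b for one gate."""
    w, b = gate
    return [sum(wi * zi for wi, zi in zip(row, z)) + bi for row, bi in zip(w, b)]


def lstm_step(h_in: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, Gate]) -> Tuple[Vector, Vector]:
    """One memory-unit update; returns the new hidden state and cell state.

    h_in is the graph encoding at time t; params holds the gates under the
    keys "f" (forget), "i" (input), "o" (output), and "c" (cell candidate).
    """
    z = h_prev + h_in  # concatenation [h_{t-1}, H_t]
    f_t = [sigmoid(v) for v in affine(params["f"], z)]
    i_t = [sigmoid(v) for v in affine(params["i"], z)]
    o_t = [sigmoid(v) for v in affine(params["o"], z)]
    c_hat = [math.tanh(v) for v in affine(params["c"], z)]
    # Hadamard products: forget part of the old state, admit part of the new.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step, which is a convenient sanity check for the update rule.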

Continuous Passenger Pick-Up Path Recommendation
Traditional taxi path recommendation models focus on the path problem of a single trip and lack guidance on the macroscopic travel path. The trajectory sharing network covers the temporal trajectories of taxi orders, so a pre-assignment approach can be adopted to constrain the behavior of taxis for the purpose of global optimization. To balance the assignment of orders, we propose a macro-level path recommendation method based on the sharing network. By predicting the potential orders, we create an order-sharing graph snapshot based on the future taxi orders for the next three hours. The sharing network evolves into different topology graphs across the snapshots, which yields a series of dynamic trip distributions. The generated potential trip distributions determine the macro planning of the taxi's pickup path and the system-level order allocation strategy, and they can further be used to build a global travel equilibrium system.
Finding the maximum path coverage of a directed graph is an NP-hard problem in general, but the optimal solution can be found in polynomial time if the graph contains no closed-loop paths. There is no closed-loop directed path in the trajectory sharing network (see Proof 1). Therefore, we transform the path extension problem for a sharing network into a maximum matching problem for a bipartite graph. First, the node set $V$ of the sharing graph $G$ is divided into two parts, $V_o$ and $V_d$, with $V_o \cup V_d = V$, and the edges in $E$ connect the two parts. The bipartite graph is formulated as $G' = (V_o \cup V_d, E)$. For a matching of $G'$, the set of trajectories $M$ is used to connect the path from $V_o$ to $V_d$, and each path is covered by at most one trajectory. The path coverage problem of the order-sharing graph is thus transformed into finding the minimum number of paths covering all nodes, which is obtained from the maximum matching $M$ of $G'$: the minimum number of paths equals $|V| - |M|$.

Definition 4. (Trajectory Splicing). Let $M$ be a matching of the bipartite graph $G'$, and let $E$ be the next-hop trip that satisfies the time constraint. If $M' = M \oplus E$ is still a matching and $|M'| = |M| + 1$, then $E$ is a valid extension, where $\oplus$ is the symmetric difference operation.
For a given taxi order, we augment this order by path splicing until all trips are traversed. Finally, we obtain the fully covered connected branches of $G'$ by the trajectory-splicing process. The recommended continuous pickup path is the next-hop trajectory in the augmentation path. In practice, taxi drivers have different tolerances for order delays, and it is difficult to ask them to accept higher individual costs to meet the global optimum. The CTPURec therefore defines a delay relaxation parameter $\delta'$ to generate different augmented paths, which are ranked according to the travel cost to obtain the Top-n recommendation. When an OD request is made, the personalized $\delta'$ is applied during path splicing, so that the same request can yield different output paths for different taxi drivers, as shown in Figure 4. At the same time, the prediction module transmits the potential orders to the online recommendation module in near real time. In this way, the online recommendation module is able to make continuous pickup path decisions that satisfy the taxi travel demands based on the trip distribution. The target of CTPURec is to obtain a potential order assignment scheme for the taxi service system, which converts taxi ridership competition into an on-demand assignment.
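Proof 1. Suppose the trajectory sharing network contained a closed loop of trips that starts from and returns to trip $e_1$. Every splice requires the next trip to start after the previous one has ended, so following the loop back to $e_1$ would require $t_1^d < t_1^o$. As for trip $e_1$, the fact that $t_1^o < t_1^d$ contradicts the temporal constraint $t_1^d < t_1^o$ in the loop, so the trajectory sharing network is a directed acyclic graph. □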
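The reduction from path coverage to bipartite matching can be illustrated with a short sketch. It is a simplified illustration, not the authors' implementation: trips are hypothetical `(id, pickup time, drop-off time)` tuples with positive duration, two trips are linked purely by their time gap (a real system would use the road-network travel cost between a drop-off and the next pickup), and the matching is found with the classic augmenting-path (Kuhn) algorithm, which is a choice made here and is not named in the source.

```python
from typing import Dict, List, Set, Tuple

Trip = Tuple[str, float, float]  # (id, pickup time, drop-off time), hypothetical


def build_edges(trips: List[Trip], delta: float) -> Dict[str, List[str]]:
    """Link trip a to trip b if b starts after a ends and within delta."""
    edges: Dict[str, List[str]] = {tid: [] for tid, _, _ in trips}
    for a_id, _, a_end in trips:
        for b_id, b_start, _ in trips:
            if a_id != b_id and 0 <= b_start - a_end < delta:
                edges[a_id].append(b_id)
    return edges


def max_matching(edges: Dict[str, List[str]]) -> Dict[str, str]:
    """Augmenting-path matching; returns a successor -> predecessor map."""
    pred: Dict[str, str] = {}

    def try_augment(u: str, seen: Set[str]) -> bool:
        for v in edges[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current predecessor can be re-routed.
            if v not in pred or try_augment(pred[v], seen):
                pred[v] = u
                return True
        return False

    for u in edges:
        try_augment(u, set())
    return pred


def splice_paths(trips: List[Trip], delta: float) -> List[List[str]]:
    """Return a minimum set of continuous paths covering every trip."""
    pred = max_matching(build_edges(trips, delta))
    succ = {u: v for v, u in pred.items()}
    paths = []
    for tid, _, _ in trips:
        if tid not in pred:  # no predecessor, so this trip heads a path
            path = [tid]
            while path[-1] in succ:
                path.append(succ[path[-1]])
            paths.append(path)
    return paths
```

Each returned path corresponds to one taxi, so the number of paths is the fleet size needed for the given trips, and raising `delta` can only keep that number the same or reduce it.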

CTPURec Optimization
The recommendation of the continuous pickup path relies on the generation of the order-sharing graph, which we obtain by training on the historical data through a neural network layer. The recommended path is based on the trajectory splicing of the potential trips. Thus, by training a sharing graph for path recommendation, the taxi service system operation can be transformed into an order-balanced assignment for on-demand acquisition. We propose a data-driven deep learning solution for obtaining order-sharing graph sequences, as shown in Figure 5.
In the offline training phase, the historical taxi travel trip data provides input samples for network training. The spatiotemporal feature extraction first aggregates the data of the trips to obtain a coarse-grained sharing graph, and then it learns the sequence features of the graph based on temporal snapshots. The demand and supply of taxi services are complex and changing, so the links in the sharing network should also be dynamic. In particular, there are limitations in using fixed adjacency relationships to model spatial dependencies between trips. Therefore, we employ an adaptive correlation matrix to capture the hidden spatial dependencies between trips and topology. After offline training, the prediction module can perform real-time prediction of potential travel trips and periodically update the parameter configuration. The obtained potential trips are spliced by constructing a trajectory sharing network, which maps the trips to the optimal (or near-optimal) branch of the maximum match. To train the learning parameters of the model, we use the average error between the real and predicted flows in the sharing graph as the objective function for the training process, as shown in Equation (7), where $x$ denotes the trip volume distribution in the real sharing graph, and $y$ denotes the distribution of predicted trips. The second term is an L2 (ridge) regularization, and $\gamma$ is a hyperparameter. After the pre-training procedure of the sharing network, the distribution of trips and the graph connectivity are obtained for future steps. These predictions are then fed into the online recommendation module for real-time pickup path recommendations.
Online Path Recommendation: As the volume of passenger trips changes dynamically over time, the CTPURec needs to constantly update the trip distribution in the sharing graph to adapt to its evolution. The output $y$ of the online prediction module indicates the potential trips of passenger travel in the sharing network, interpreted as a coarse-grained travel trajectory. The CTPURec generates maximum matching paths based on the topological configuration of the trips, guiding taxis to adopt the matching paths for passenger pickup travel. The path recommendation performs a path search on the sharing graph after maximum matching and returns the pickup path that meets the travel demands and is consistent with the global optimization. Therefore, the taxi pickup path recommendation model, with consideration of order equilibrium, can effectively guide the taxi driver's decision-making for the next hop.
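The training objective described above, an average prediction error plus a ridge penalty weighted by γ, can be sketched as follows. This is an assumption-laden illustration: the source only says "average error", so the mean absolute error used here is one possible reading, and the function and argument names are hypothetical.

```python
from typing import Sequence


def training_loss(x_true: Sequence[float], y_pred: Sequence[float],
                  weights: Sequence[float], gamma: float) -> float:
    """Mean absolute error plus an L2 (ridge) penalty on the model weights.

    x_true: trip volumes in the real sharing graph
    y_pred: predicted trip volumes
    weights: flattened trainable parameters (assumed to be what is penalized)
    gamma: regularization strength (hyperparameter)
    """
    mean_abs_error = sum(abs(x - y) for x, y in zip(x_true, y_pred)) / len(x_true)
    ridge = sum(w * w for w in weights)
    return mean_abs_error + gamma * ridge
```

A larger γ shrinks the weights more strongly and so trades a slightly higher fitting error for a smoother model, which is the usual motivation for a ridge term.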

Experimental Results and Analysis
In this work, we focused on the more meaningful and logical task of recommending continuous pickup paths to taxis for the next three hours. In a large-scale road network, if node-level data granularity is used for analysis, the sparsity of order data leads to poor performance. We therefore performed data fusion on the taxi trip data based on a sharing graph. The trip data was segmented and fused under spatiotemporal slices, and the resulting sharing graph was used as the input to our model.
Data aggregation for sharing graph: Since the nodes covered by the trips were very sparse compared to the road network, it was difficult for node-level trip prediction to accurately capture the travel demand pattern. We built a sharing graph on top of the node graph of the road network to cope with the sparsity of the trip data. To calculate the link cost between orders in the sharing graph, it was necessary to bridge the sharing graph with the road network. The two-layer structure preserved order information and connection relationships, thus reducing the search space. The prediction was performed on the upper layer to overcome data sparsity. As shown in Figure 6, the data density at the order level was higher than that at the road network level. The structural complexity and link sparsity of the road network were significantly reduced after coarse-grained aggregation.
The road network for the experiments in this paper was derived from the map of Chengdu city provided by OpenStreetMap (www.openstreetmap.org, accessed on 7 April 2021), from which 113,825 road sections and 81,371 nodes were extracted. The taxi trip data was obtained from the KDD CUP 2020 car-hailing datasets provided by Didi Chuxing (gaia.didichuxing.com, accessed on 7 April 2021). It contains taxi trajectories and car-hailing orders for Chengdu, with travel data ranging from 1 November to 30 November 2016. After missing value processing, the order data comprised 7,062,959 trip records, and each trip record contained the start and end timestamps and the geographic coordinates of the origin and destination locations. The number of coarse-grained nodes after the aggregation of location points for taxi trips was 36,353. Finally, the taxi OD matrix in a fixed time interval was obtained by counting the number of interactions between nodes in the sharing graph. In the following, we will evaluate the effectiveness of the model from four aspects.
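The last preprocessing step, counting interactions between coarse-grained nodes within fixed time intervals, can be sketched as below. The record layout `(start time in seconds, origin node, destination node)` and the function name are assumptions made for illustration; the mapping of raw coordinates to coarse-grained nodes is taken as already done.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Record = Tuple[float, str, str]  # (start time in seconds, origin node, destination node)


def od_counts(trips: Iterable[Record], interval: float) -> Dict[int, Counter]:
    """Count (origin, destination) pairs per time slice of length `interval`."""
    slices: Dict[int, Counter] = {}
    for start_time, origin, destination in trips:
        slot = int(start_time // interval)  # index of the time slice
        slices.setdefault(slot, Counter())[(origin, destination)] += 1
    return slices
```

Each `Counter` is a sparse OD matrix for one time slice; only OD pairs that actually occur are stored, which suits the sparsity of the trip data noted above.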
The experimental environment was a machine with an Intel(R) Xeon Silver 4210 CPU @ 2.20 GHz, an NVIDIA GeForce RTX 2080 Ti GPU (provided by OMNISKY of Beijing, China), 64 GB RAM, and the Windows 10 64-bit operating system. The experiments used Python to map and match the road network dataset, and then built a TensorFlow deep learning model, optimized by the Adam optimizer, to perform order prediction and trajectory analysis. The latent feature dimension of the temporal gating and spatial convolution was set to 64, and the learning rate was $10^{-3}$.

Potential Trips Sharing Network Distribution Prediction
The CTPURec divided the taxis acquired on-demand in the road network into two parts, the active set and the idle set, based on the potential sharing network prediction. The training process added the status sequence of orders to the road network, and the time-divided sharing network was used as a time snapshot. We set a historical time window of 12 h and aimed to predict potential orders for the next 3 h. We used the first 80% of the travel dataset as the training data and the remaining 20% as the validation data. We chose three evaluation metrics: MAE, RMSE, and accuracy. The smaller the values of MAE and RMSE, the better the performance of the model. The value of accuracy ranged from 0 to 1, and the closer it was to 1, the better the fitting ability of the model. The experiments compared the performance of our proposed CTPURec with other baselines, namely the recurrent neural networks LSTM [31] and GRU [32], and the graph neural networks GCN [33], T-GCN [34], and GC-LSTM [35]. We trained the models separately for 500 epochs, and the prediction errors on the validation data are shown in Table 1. As can be seen from the prediction errors at the three time steps, the GCN had the worst prediction reliability because it relied only on the aggregated neighborhood features.
The T-GCN combined the advantages of GCN and GRU, but its results showed no significant improvement because its graph feature extraction relied on a fixed topological correlation. The GRU and LSTM had better prediction performance in timing-dependent trip sharing networks by introducing the gating mechanism. Meanwhile, both the LSTM-based CTPURec and GC-LSTM used dynamic semantic vectors for spatial feature extraction and obtained lower errors and higher accuracy. Although the prediction performance decreased with increasing prediction length, the CTPURec improved by 32.33%, 25.56%, and 8.33% relative to the second-best model at the 3 h prediction. This was due to the spatiotemporal coding capturing more information about potential trends, which made it more suitable for long-horizon prediction.
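The three evaluation metrics can be computed as in the sketch below. MAE and RMSE follow their standard definitions. The source does not give the formula for accuracy, so the definition used here (one minus the ratio of the error norm to the norm of the true values, as is common in traffic forecasting work) is an assumption; note that it can drop below 0 for very poor predictions.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Assumed definition: 1 - ||y_true - y_pred|| / ||y_true||.

    Requires at least one non-zero true value.
    """
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)))
    ref = math.sqrt(sum(a * a for a in y_true))
    return 1.0 - err / ref
```

A perfect prediction gives MAE = RMSE = 0 and accuracy = 1, matching the reading above that smaller errors and an accuracy closer to 1 indicate a better model.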
The purpose of this set of experiments was to examine the feature fitting ability of the CTPURec and the baselines during the iterative process. Since the CTPURec was based on spatiotemporal feature fusion, we focused on its feature extraction ability during the training process. Figure 7a,b shows the variation of MAE and RMSE of the models at the 3 h prediction step. It is worth noting that the training errors of GCN, GC-LSTM, and SANN were relatively small and stabilized quickly in the initial stage of model training, indicating that graph features were learned faster than by the time-series recurrent networks. As can be seen from the accuracy curve in Figure 7c, the CTPURec achieved better accuracy in the initial stage, which means that the adaptive semantic relations could fit the spatial relations of trips well.
To clearly show the ability of different models to predict potential orders, we extracted the change in overall order volume for the next 16 h of prediction. As shown in Figure 8, we compared the TGCN, GC-LSTM, LSTM, GRU, and CTPURec, with different colors indicating different order magnitudes. In Figure 8b, the CTPURec uses adaptive spatial relations for prediction, and the fluctuation pattern of its order volume best matched the Real plot. The GC-LSTM also used dynamic neighborhood modeling; however, its nonlinear fitting ability was weaker than that of the adaptive CTPURec. In outlier processing, LSTM and TGCN were more sensitive to sharply rising order quantities in the figure, while CTPURec generalized these orders. In Figure 8e, the fitted curve of TGCN fluctuates considerably. This is because the superposition of redundant spatial features increased the magnitude of the predicted values, resulting in an unstable curve. The GRU and LSTM are recurrent neural networks that performed predictions based on time series features. Their performance was relatively stable, with no dramatic fluctuations in the prediction curves, but their fitting ability was not as good as that of the dynamic neighborhood models. As the GC-LSTM constantly adjusted its neighbor relationships dynamically, its spatial feature fitting ability was better than that of TGCN, but not as good as that of the CTPURec.

The Splicing Efficiency of Taxi Pickup Path with Different Parameters
To clearly demonstrate the impact of sharing trips for taxis, we conducted experiments using path extensions of the trips in a sharing network within a 15 s window. First, we verified how the trip distribution varied for different maximum waiting delays δ, as shown in Figure 9a. The orders of Real within the time period 8:00:00-8:00:15 show that the destination and the origin of some orders were relatively close to each other. If the travel time cost between two orders satisfied the δ constraint, the two orders could be spliced. The experiments used δ = 2 min and δ = 4 min, respectively, for the path splicing, and the connected start and end points of the spliced trips became intermediate points. Thus, each spliced path was treated as one complete trajectory, and the taxi fleet scale of the shared network was effectively reduced. It can be seen that more orders were spliced at δ = 4 min, and the taxi order volume within the 15 s window was reduced to a smaller set of continuous paths.
Figure 9. Example of a trip sharing network for 15 s; the blue line is the path, the black marker indicates the origin, and the red marker indicates the destination of the trips. (a) shows the shared network based on different δ and (b) based on different k.
In addition to the limit on the maximum waiting delay, we further added a limit k on the number of trips per path, with δ = 2 min, as shown in Figure 9b. The orders of Real within the time period 11:00:00-11:00:15 were mainly suburban-to-downtown trips. The path splicing was performed on the basis of Real, and only a small reduction was obtained for k = 2. For k = 4, a large number of orders were merged, resulting in a smaller fleet scale of taxis required for passenger travel, but increasing the additional cost of order splicing, such as the waiting time for spliced trips. Compared to the taxi order trajectories in Real, the trip distribution of the sharing network not only increased the order volume of each valid taxi, but also broke up the original order distribution and reallocated the orders more uniformly. The experiments illustrated that the order assignment strategy of the sharing network can reduce the number of taxis occupied while satisfying the same travel demand, which is achieved by recommending taxi orders on the spliced path for a given taxi.
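The two splicing constraints described above can be illustrated with a minimal sketch. It is not the trajectory sharing graph used in this work: it assumes a simple greedy chaining rule, planar coordinates in metres, and a hypothetical constant-speed estimate `connect_time` for the travel time between one order's destination and the next order's origin. The names `Order`, `connect_time`, and `splice_orders` are introduced here for illustration only.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass(frozen=True)
class Order:
    oid: int
    t_o: float                   # pickup time in seconds
    origin: Tuple[float, float]  # planar coordinates in metres (assumption)
    dest: Tuple[float, float]


def connect_time(a: Order, b: Order, speed: float = 8.0) -> float:
    """Hypothetical travel time (s) from a's destination to b's origin,
    assuming straight-line driving at a constant speed in m/s."""
    return hypot(b.origin[0] - a.dest[0], b.origin[1] - a.dest[1]) / speed


def splice_orders(orders: List[Order], delta: float, k: int) -> List[List[Order]]:
    """Greedily chain orders into paths of at most k trips, linking two
    orders only when their connection travel time is at most delta seconds."""
    pending = sorted(orders, key=lambda o: o.t_o)
    paths = []
    while pending:
        path = [pending.pop(0)]          # earliest unassigned order starts a path
        while len(path) < k:
            last = path[-1]
            nxt = next((o for o in pending if connect_time(last, o) <= delta), None)
            if nxt is None:
                break                    # no order satisfies the delta constraint
            pending.remove(nxt)
            path.append(nxt)
        paths.append(path)
    return paths


orders = [
    Order(1, 0.0, (0.0, 0.0), (1000.0, 0.0)),
    Order(2, 5.0, (1400.0, 0.0), (3000.0, 0.0)),
    Order(3, 10.0, (3000.0, 1600.0), (5000.0, 1600.0)),
]
for delta, k in [(120, 2), (240, 4)]:
    paths = splice_orders(orders, delta, k)
    print(delta, k, [[o.oid for o in p] for p in paths])
```

In this toy input, each resulting path stands for one taxi, so the number of paths plays the role of the fleet scale. Loosening δ or raising k can only merge more orders under this greedy rule, which mirrors the qualitative trend reported for Figure 9.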

Optimization for Taxi System with Continuous Pickup Paths
To evaluate the influence of pickup path recommendation on the taxi service system, the experiments evaluated the average taxi running time, taxi no-load time, average pickup waiting time, effective fleet size, and the average increased orders under different parameter configurations. The running time is the time span from the t_o of the first passenger order to the t_d of the last order on a calendar day. The no-load time is the accumulated period without passengers during the operating hours. We computed the daily average running time and no-load time for 21-30 November 2016, as shown in Figure 10. The experiments adopted the maximum connection limit k = 2, 4, and the waiting delay δ was set to 2 min. As Figure 10a shows, path splicing reduced the daily average running time, which we attribute to order sharing improving vehicle operating efficiency under fixed travel demand. With k = 4, the total number of vehicles was 18.54% less than that with k = 2, so each vehicle had to serve more orders. As a result, the average running time for k = 4 increased compared to k = 2. In Figure 10b, the taxi no-load time decreased as k increased, indicating that continuous pickup path recommendations reduced the no-load time of individual vehicles in general. The experiments showed that order-sharing networks reduced inefficient time consumption in taxi systems, allowing travel orders to be reallocated on demand and increasing the efficiency of vehicle operation.

Figure 10a shows the average running time and Figure 10b shows the no-load time.

In the following, we compare the influences of different recommendation steps on the average pickup waiting time, the effective fleet size, and the average increased orders. Table 2 shows the variation of the daily average pickup waiting time for all vehicle orders over a 10-day period. The metric was averaged over the total number of effective taxis running each day. From the difference between k = 2, 4, 8, 16 and Real, it can be concluded that the spliced path reduced the average pickup waiting time of taxis. Since the fleet size of taxis decreased at k = 4 and the travel path linked more orders, the average pickup waiting time decreased. At k = 8, 16, although the fleet size was further reduced, the waiting time increased in order to connect more orders. Therefore, a larger k is not always better, and exceeding a threshold introduces inefficiency. The order trajectory at k = 2 tended to add additional orders to the existing order distribution, which is different from reassignment. At k = 4, the subsequent orders were augmented around the first order, and the orders that originally belonged to it were reassigned. As can be seen in Table 3, the effective fleet size decreased as k increased, but the decreasing trend gradually decayed. The average effective fleet size from Real to k = 2 decreased by 9606, while from k = 8 to k = 16 it decreased by just 5068.
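The running time and no-load time defined above can be computed as in the following minimal sketch. It assumes that each taxi's trips for one day are given as non-overlapping (t_o, t_d) pairs in seconds and that the operating hours coincide with the running time span; the function names are hypothetical and the input values are made up for illustration.

```python
from typing import Dict, List, Tuple

Trip = Tuple[float, float]  # (t_o, t_d) in seconds since midnight


def running_and_no_load_time(trips: List[Trip]) -> Tuple[float, float]:
    """Return (running time, no-load time) for one taxi on one day.

    Running time: from t_o of the first order to t_d of the last order.
    No-load time: running time minus the passenger-carrying time
    (assumes trips do not overlap).
    """
    ordered = sorted(trips)
    running = ordered[-1][1] - ordered[0][0]
    loaded = sum(t_d - t_o for t_o, t_d in ordered)
    return running, running - loaded


def fleet_daily_average(fleet: Dict[str, List[Trip]]) -> Tuple[float, float]:
    """Average running time and no-load time over all taxis with at least one trip."""
    stats = [running_and_no_load_time(t) for t in fleet.values() if t]
    n = len(stats)
    return sum(r for r, _ in stats) / n, sum(e for _, e in stats) / n


fleet = {
    "taxi_a": [(0, 600), (900, 1500), (1800, 2400)],
    "taxi_b": [(100, 700), (700, 1300)],
}
print(fleet_daily_average(fleet))
```

Under these assumptions, a taxi whose orders are spliced back to back (such as the hypothetical `taxi_b`) has zero no-load time, which is the effect that a continuous pickup path aims for.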

The Implementation Phase
Considering that, in practice, there are multiple taxi operators and it is difficult to unify drivers' acceptance of order sharing under the carbon neutrality target, we propose developing an order-sharing mode on existing taxi-hailing platforms. The driver has the flexibility to choose whether to turn on the mode, and the mode can be dynamically adjusted according to the time period. First of all, the order-sharing mode has no impact on passengers' travel demands and existing travel options. CTPURec is only trained on the historical trajectories of taxis in sharing mode and generates continuous pickup paths for drivers under this customized service. In contrast, the free mode means that the drivers do not participate in the sharing program and use the original method to obtain orders.
Next, we analyzed the taxi system performance of CTPURec in partial sharing mode. The drivers were first divided into five groups with different proportions of free mode and sharing mode, as shown on the abscissa axis of Figure 11. We built the order-sharing network based on scenarios 1-7, respectively, so as to guide the taxi path in the partial sharing mode. The experiments adopted the trip data of 25 November 2016, with the maximum waiting delay δ set to 4 min and the maximum number of splices k = 2, 4. The variation trends of the system performance indices under the different scenarios are shown in Figure 11. The taxi average running time and average pickup waiting time both showed a decreasing trend as the proportion of drivers participating in the sharing program grew. In Figure 11a, the running time of taxis for k = 2 was lower for the same travel demands. This is because the fleet size for k = 2 was larger than that for k = 4 after splicing the paths according to the configuration, as shown in Figure 11c. In Figure 11b, k = 4 used more continuous pickup paths, reducing the waiting time to search for the next order. Therefore, the average cost of acquiring the next order for scenarios 2-7 was better than that in the fully free mode. The effective fleet size continued shrinking as the proportion of the sharing mode increased, and it was minimized for k = 4 in fully sharing mode. At this point, the taxi fleet improved its own order-taking efficiency while reducing the no-load cost of the free mode, as the total amount of travel demand was fixed. Finally, we compared the order gains at different proportions, as shown in Figure 11d. This order gain was complementary to the reduction in effective fleet size. As the sharing proportion increased, the order gain gradually improved until the fully sharing mode was reached. Therefore, the partial order-sharing mode can also contribute to the operational efficiency of the taxi system when full sharing cannot be achieved. This implementation process also promotes a transformation of the driver's low-carbon perception from the free mode to the sharing mode. At the same time, the impact of the sharing mode on taxi drivers' habits is minimal: drivers do not need to understand the implementation of the sharing mode, but simply follow the recommended path.

Figure 11. Impact of the percentage of drivers participating in sharing mode. Figure 11a is the average running time, and Figure 11b is the average pickup waiting time. Figure 11c,d are the effective fleet size and average increased orders based on different percentages of taxi drivers following sharing mode.

Conclusions and Future Work
In this paper, we proposed a macro taxi path recommendation model to guide the taxis' travel behaviors by adopting the concept of low-carbon operation, which is an important contribution for transportation system emission reduction under the goal of "carbon neutrality." The coarse-grained extraction of order data was used to mine the flow pattern of taxis in the transportation network for the perception of future travel demands. A shared trajectory network was constructed based on potential travel trips, guiding taxis to pick up passengers with a shared collaborative path decision. The combination of path recommendation and a sharing network not only selects the appropriate continuous path for taxis, but also promotes the balanced utilization of travel services.
In the experiment, we compared the predictive validity of the model, the path recommendation effects, and the impact on the taxi operation system. Combined with the carbon effect in the operation of the transportation system, we applied some system-level metrics that are positively correlated with carbon emissions to measure the usability of the recommended model. The results showed that continuous pickup paths can increase taxi orders while reducing the scale of vehicle operation, and this optimization has important guidance for upgrading low-carbon management in the transportation field.
Considering the application of our low-carbon sharing mode to existing ride-hailing technologies, we developed a preliminary implementation plan and conducted a quantitative sensitivity analysis. However, the evaluation metrics we used for taxi system operation are not equivalent to the carbon effect evaluation. In future work, we will use the specific carbon metrics to construct the path recommendation model to suggest the cruising path for taxis and the path choices for users. By further establishing a carbon effect incentive mechanism, the order-sharing network is deeply integrated with existing travel services to promote the transformation of taxi behaviors from free mode to sharing mode.
Author Contributions: Writing-original draft, Mengmeng Chang; validation, Yuanying Chi; resources, Zhiming Ding; data curation, Yuanying Chi; formal analysis, Jing Tian and Yuhao Zheng; writing-review and editing, Yuhao Zheng. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by Limin Guo.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available from www.openstreetmap.org and gaia.didichuxing.com (both accessed on 7 April 2021).