Mining Multimodal Travel Mobilities with Big Ridership Data: Comparative Analysis of Subways and Taxis

Understanding traveler mobility in cities is significant for urban planning and traffic management. However, most traditional studies have focused on travel mobility in a single traffic mode. Only limited studies have focused on the travel mobility associated with multimodal transportation. Subways are considered a green travel mode with large capacity, while taxis are an energy-consuming travel mode that provides a personalized service. Exploring the relationship between subway mobility and taxi mobility is conducive to building a sustainable multimodal transportation system, such as one with mobility as a service (MaaS). In this study, we propose a framework for comparatively analyzing the travel mobilities associated with subways and taxis. Firstly, we divided taxi trips into three groups: competitive, cooperative, and complementary. Voronoi diagrams based on subway stations were introduced to divide regions. An entropy index was adopted to measure the mix of taxi trips. Secondly, subway and taxi trip networks were constructed based on the divided regions. The framework was tested based on the automatic fare collection (AFC) data of the subway and the global positioning system (GPS) data of taxis in Beijing, China. The results showed that the proportions of taxi competition, taxi cooperation, and taxi complements were 9.1%, 35.6%, and 55.3%, respectively. The entropy was large in the central city and small in the suburbs. Moreover, it was found that the subway trip network was connected more closely than the taxi network. However, the imbalance between inbound and outbound flows was more serious for taxis than for the subway.


Introduction
Subways and taxis are two significant components of public transport. In many sustainable cities, subways have become a mainstay due to their large capacity and high speed. Compared with other traffic modes with fixed routes, taxis can provide a personalized and flexible service. During the last decade, both subways and taxis have experienced fast development in many developing countries. Uncovering the traffic demand and the relationship between subways and taxis could help in many applications, such as city planning, traffic management, and geography [1][2][3][4]. With rapid urban spatial expansion and traffic facility development, the quantity and structure of traffic flows are experiencing great changes. Traffic demand and the relationships between different traffic modes have drawn much attention [5][6][7]. However, there has been no conclusion on whether there is competition or cooperation between subways and taxis.
Origin-destination (OD) flow describes the movement or trips between two locations, which plays a vital role in traffic and city planning [8,9]. There are two kinds of OD flow data: point-based flow data and area-based flow data. Point-based flow data contain many potential locations, such as those from GPS trajectory data, while area-based flow data include migration between OD locations within a predefined area [10]. The OD distribution is correlated with the population distribution, land use, and socioeconomic factors [11][12][13]. Therefore, point-of-interest (POI) data play an important role in inferences of trip purpose and OD information [14,15]. In urban transport systems, there are many types of traffic modes, such as subways, buses, taxis, bikes, and private cars. There are large differences in the OD demands of different traffic modes due to many socioeconomic factors. Urban planners have exerted great efforts to estimate OD demand distributions [16][17][18].
Questionnaire surveys are some of the most crucial sources for exploring multimodal traffic demand, and they reveal the macroscopic characteristics of urban populations [19]. However, conducting a questionnaire survey is time-consuming and expensive. Therefore, it is hard to acquire most people's travel information. Big geospatial data, such as subway IC card data, vehicle GPS data, bike-sharing trip data, and mobile phone data, provide valuable and massive spatiotemporal information for grasping travelers' mobility [20][21][22][23]. For traffic modes with fixed stations (e.g., subways and buses), traffic demand can be obtained through information swiped by passengers using their IC cards when they board and alight [24,25]. For other traffic modes without stations, such as taxis and bike-sharing schemes, land is typically divided into regions that are considered to be nodes. The common divisions are grids [26,27], traffic analysis zones (TAZs) [28,29], Voronoi diagrams [30,31], and hexagons [32]. In recent years, there have been numerous applications of these data for many aspects of data analysis, such as exploring travel mobility [33], traffic demand [34], and job-worker dynamics [35]. Understanding the movement of people in their everyday lives is critical in transportation science. Complex network theory provides a powerful tool for exploring the structures and dynamics of traffic flow [36][37][38].
There are many difficulties in accurately mining OD demand and traveler behaviors. Most previous studies have focused on single travel modes, although people have a variety of travel mode options, including private cars, buses, subways, and electric bikes. There are few studies on the differences in mobility between subways and taxis. Furthermore, there is competition and cooperation among different travel modes, but there is no comprehensive framework for detecting it. Moreover, it is hard to grasp residents' travel characteristics with traditional surveys due to the aggregation and mobility of travelers. The goal of this study is to explore the travel characteristics of subways and taxis using big AFC data and GPS data. The contributions of this study contain two aspects: (1) the proposition of a framework for exploring the relationship between subways and taxis based on real trips and dividing taxi trips into a competitive element, a cooperative element, and a complementary element; (2) the adoption of a Voronoi diagram based on subway station information to construct a subway OD network and taxi OD network and to analyze their spatiotemporal characteristics.
The rest of this article is organized as follows. Section 2 reviews the literature. Section 3 gives the methodology. Section 4 describes the data. The results are given in Section 5. Section 6 concludes the paper.

Applications of Big Data in Travel Mobility
Traffic-related big data, such as transit IC card data, the GPS data of taxis, trip data from bike-sharing schemes, and mobile phone data, play crucial roles in detecting traveler behavior, traffic status, origin-destination demand, and traffic model optimization. Transit IC card data have been widely used in analyzing travel mobility and OD demand [39][40][41]. In some bus transit systems, only boarding information is recorded. The alighting information should be inferred to achieve an OD matrix [42]. Exploring traffic flow structures and characteristics from the perspective of a complex network has recently drawn much attention [43,44]. GPS-based big traffic data from taxis, bike-sharing schemes, and mobile phones have been successfully applied in traffic characteristic detection [45][46][47][48], traffic monitoring [49,50], traffic emission inferences [51], and so on. Mobile phone data have the characteristics of a large volume and spatiotemporal continuity, making them a good source for exploring mobility with multimodal transportation. Bachir et al. proposed a two-step semi-supervised learning algorithm for identifying transport modes with their mobile network trajectories and studied dynamic origin-destination flows for each transport mode [52]. Zhang et al. introduced a deep multi-scale learning model for classifying transportation modes and speeds, which can help in understanding the mobility of moving objects [53].

Origin-Destination Demand Estimation
OD demand estimation plays a key role in city planning, traffic planning, and traffic management, making it an indispensable component of human mobility [54]. The main methods of studying OD demand estimation are model-based methods and data-driven methods. The gravity model is one of the most famous model-based methods and has been applied to buses [55], taxis [56], aviation [57], and so on. To grasp the underlying mechanisms of people's mobility, many models have been proposed, including the maximum entropy model [58], intervening opportunities model [59], spatial interaction model [60], and free utility model [61]. The maximum entropy model is a macroscopic explanation of the social gravity law, but it is not able to reflect individuals' choices or behaviors. More recent models have attempted to include microscopic individual choices to give better explanations. In the last two decades, data-driven methods for OD demand estimation have become hotspots. Krishnakumari et al. proposed a data-driven method for estimating an OD matrix. The method includes two main components: prediction of the production and attraction time series and OD matrix estimation [62].

Multimode Travel Behavior Characteristics
People make various choices of traffic modes in their daily travel, such as using private cars, taking public transportation, or walking. It is hard to grasp the OD demand of each traffic mode. Traditional studies tend to use sampling and questionnaire surveys to obtain traffic information, including people's home locations, traffic modes, and so on. Cheng et al. studied the spatial heterogeneity in accessibility and transit mode choices and found that low transit accessibility in suburban areas is correlated with the use of public transportation [63]. Ilahi et al. found that increasing the public transport frequency or creating bus priority lanes to reduce the travel time could increase the shares of these modes [64]. There are many difficulties in obtaining an accurate OD demand. Questionnaire surveys are expensive and time-consuming; thus, they hinder the ability to obtain large amounts of traffic data. In addition, results that are based on questionnaire surveys are affected by the age, income, and other factors of interviewees [65,66]. Automatically collected big data (e.g., from IC cards and GPS) provide a better source for understanding OD demand and travelers' behaviors. Numerous studies have focused on travelers' mobility when using a single traffic mode. However, the relationship of travel demand with multiple modes is not clear at present. Kim and Cho pointed out that bike-sharing schemes and public transit might compete with or promote each other, even within a city [67]. Chan et al. studied the choices of and equity in multimodal public transport services in Hong Kong [68]. Xu et al. studied a ride-sourcing service with autonomous vehicles (AVs) for transportation hubs in multimodal transport systems, and they found that differentiated pricing and fleet size management strategies for AVs would be beneficial. Transport network companies can benefit from a greater profit and enjoy a higher market share while public transit enjoys higher ridership [69].

Zone Entropy of Subways and Taxis
This study introduces zone entropy to measure the mixed use of subways and taxis. We divided taxi trips into three types: competitive, cooperative, and complementary. Figure 1 shows an illustration of the taxi type division. Specifically, we built a buffer area with radius r around each subway station. Taxi trips with both origins and destinations in a buffer area are considered competitive trips. Taxi trips with only the origin or only the destination in a buffer area are considered cooperative trips, which means that passengers use taxis to transfer to or from the subway. Otherwise, taxi trips are considered complementary to the subway system. In this study, we set r = 100 m as the radius of the buffer area.
Then, we calculated the proportions of the three types of taxi trips in different zones. The zones were subdivisions of the studied area, which could be divided using many forms of division, such as grids, traffic analysis zones (TAZs), hexagons, and Voronoi diagrams. The Voronoi diagram is one of the most widely used methods for dividing areas in computational geometry. In this study, we used a Voronoi diagram to divide the studied area. The zone entropy of subways and taxis in zone i was calculated as

$$E_i = -\frac{\sum_{j=1}^{k} p_j \ln(p_j)}{\ln(k)}$$

where $p_j$ is the proportion of one taxi type (competitive, cooperative, or complementary) in zone $i$, and $k$ is the number of taxi types. A value of 0 means that the zone only contains one taxi type, while a value of 1 means that there is an equal distribution of all taxi types.
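The trip classification and the zone entropy described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the `(lon, lat)` tuple layout, and the use of the haversine distance for the 100 m buffer test are all assumptions made for the example.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def near_station(point, stations, r=100.0):
    """True if `point` lies within r metres of any station (the buffer test)."""
    return any(haversine_m(*point, *s) <= r for s in stations)


def classify_trip(origin, destination, stations, r=100.0):
    """Label a taxi trip as competitive, cooperative, or complementary."""
    o_in = near_station(origin, stations, r)
    d_in = near_station(destination, stations, r)
    if o_in and d_in:
        return "competitive"      # both ends inside a station buffer
    if o_in or d_in:
        return "cooperative"      # exactly one end inside a station buffer
    return "complementary"        # neither end inside a station buffer


def zone_entropy(trip_types, k=3):
    """Normalised entropy of the taxi-type mix in one zone (0 = one type, 1 = even mix)."""
    counts = Counter(trip_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(k)
```

Types with zero trips are simply absent from the counter, which matches the usual convention that 0 ln 0 contributes nothing to the sum.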

Network Construction
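A Voronoi diagram is a subdivision of an area using a set of points called generating points or generators. Every point in a subdivided region is closer to the generator of that region than to the other generators. Let $A = \{a_1, a_2, \ldots, a_n\}$ be the set of generating points, where $a_i \in \mathbb{R}^2$. The Voronoi cell $V(a_i)$ for generating point $a_i$ is defined as [31]

$$V(a_i) = \{x \in \mathbb{R}^2 : \|x - a_i\| \le \|x - a_j\|, \ \forall j \ne i\}$$

where $\|x\|$ is the Euclidean norm of $x$ in $\mathbb{R}^2$. The formula states that points in the Voronoi cell $V(a_i)$ created by point $a_i$ are closer to $a_i$ than they are to other generators.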
In this study, we used subway stations as generating points to construct Voronoi diagrams dividing the study area. Taxi pick-up points or drop-off points in the Voronoi-subdivided region V(a i ) are closest to station a i . A Voronoi diagram is a good way to divide traffic regions when studying the relationship between subways and taxis. This is because many taxi travelers tend to use the nearest subway stations when they use the subway. Figure 2 shows an example of a Voronoi diagram based on subway stations (created randomly). Each subdivided region is considered as a node, and trips are regarded as edges. In this study, we created the Voronoi diagram based on real subway stations in Beijing using ArcGIS 10.3.
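Because a point belongs to the Voronoi cell of its nearest generator, assigning taxi pick-up and drop-off points to zones does not require the cell polygons themselves. The sketch below illustrates this nearest-station assignment and the aggregation of trips into zone-to-zone counts. It is an illustrative alternative to the ArcGIS workflow used in the study, it assumes planar (projected) coordinates in metres, and the function names are hypothetical.

```python
import math
from collections import Counter


def nearest_station(point, stations):
    """Index of the station closest to `point`, i.e. the Voronoi cell containing it."""
    return min(range(len(stations)), key=lambda i: math.dist(point, stations[i]))


def od_counts(trips, stations):
    """Aggregate (origin, destination) point pairs into zone-to-zone trip counts."""
    counts = Counter()
    for origin, destination in trips:
        i = nearest_station(origin, stations)
        j = nearest_station(destination, stations)
        counts[(i, j)] += 1  # trips from zone i to zone j
    return counts
```

The linear scan over stations is adequate for a few hundred generators; for large trip volumes a spatial index such as a k-d tree would be the usual substitute.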
In this study, we define a directed graph as $G = (V, E, W)$ to represent the movement of travelers, where $V := \{v_i \mid i \in \{1, 2, \ldots, N\}\}$ is the set of nodes, and $N$ is the total number of nodes. $E := \{e_{ij} \mid i, j \in \{1, 2, \ldots, N\}\}$ is the set of edges, where $e_{ij} = 1$ if there is an edge between node $i$ and node $j$; otherwise, $e_{ij} = 0$. $W := \{w_{ij} \mid i, j \in \{1, 2, \ldots, N\}\}$ is the set of weights of edges, and $w_{ij}$ is the number of trips between node $i$ and node $j$.
In this part, we adopted several indicators to obtain the travel characteristics of subway and taxi trip networks. These indicators were calculated with Python 3.8.

Network Structure Measures
(1) Node degree. The node degree contains the in-degree and out-degree. The in-degree of node $i$ is denoted as $k_i^{in}$, which means the number of nodes that point to node $i$; the out-degree $k_i^{out}$ denotes the number of nodes that node $i$ points to. The degree of node $i$ is the sum of the in-degree and out-degree.
(2) Node strength. Similarly to the node degree, we define the node strength to reflect the strength of traffic flow. The in-strength $s_i^{in}$ of node $i$ denotes the amount of traffic flow to node $i$; the out-strength $s_i^{out}$ is the amount of traffic flow from node $i$. The node strength is defined as the sum of the in-strength and out-strength.
(3) Betweenness. Betweenness is a significant indicator for measuring the importance of nodes in propagating information. It is defined as follows [70]:

$$B(v) = \sum_{s \ne v \ne t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}$ is the number of shortest paths going from $s$ to $t$, and $\sigma_{st}(v)$ is the number of shortest paths going from $s$ to $t$ through node $v$.
(4) PageRank. PageRank is a famous index for finding important nodes in graphs, and it can grasp global topological information. It is defined in [71] in terms of $p_i$, the influence score of the $i$th node; $p$, a damping coefficient; $k_j^{out}$, the out-degree of the $j$th node; and $e_{ji}$, the elements of the adjacency matrix.
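The degree, strength, and PageRank indicators above can be illustrated on a small OD dictionary. The sketch below is not the authors' implementation. In particular, the PageRank update uses one common textbook form, $p_i = (1 - p)/N + p \sum_j e_{ji} p_j / k_j^{out}$, with the rank of nodes that have no outgoing edges spread evenly over all nodes; the exact form used in [71] is an assumption here, as are the function names and the default damping value of 0.85.

```python
from collections import defaultdict


def degree_and_strength(w):
    """Per-node degrees and strengths; `w` maps (i, j) -> trips from node i to node j."""
    k_in, k_out = defaultdict(int), defaultdict(int)
    s_in, s_out = defaultdict(int), defaultdict(int)
    for (i, j), trips in w.items():
        if trips <= 0:
            continue  # no trips means no edge
        k_out[i] += 1
        k_in[j] += 1
        s_out[i] += trips
        s_in[j] += trips
    nodes = set(k_in) | set(k_out)
    return {
        n: {
            "k_in": k_in[n], "k_out": k_out[n], "k": k_in[n] + k_out[n],
            "s_in": s_in[n], "s_out": s_out[n], "s": s_in[n] + s_out[n],
        }
        for n in nodes
    }


def pagerank(edges, nodes, damping=0.85, iters=100, tol=1e-10):
    """Unweighted PageRank by power iteration over unique directed edges (i, j)."""
    n = len(nodes)
    out = {v: [] for v in nodes}
    for i, j in edges:
        out[i].append(j)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is shared by all nodes.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for i in nodes:
            if out[i]:
                share = damping * rank[i] / len(out[i])
                for j in out[i]:
                    new[j] += share
        converged = sum(abs(new[v] - rank[v]) for v in nodes) < tol
        rank = new
        if converged:
            break
    return rank
```

For example, `pagerank(w.keys(), nodes)` ranks zones by topology only; a flow-weighted variant would split each node's rank in proportion to $w_{ij}$ instead of evenly.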

Traffic Flow Disequilibrium Factor
Typically, the inbound and outbound traffic flows of a node are different for many reasons. In this study, the traffic flow disequilibrium factor (TFDF) of node $i$ is defined as

$$\mathrm{TFDF}_i = \frac{\max(s_i^{in}, s_i^{out})}{(s_i^{in} + s_i^{out})/2}$$

That is, the traffic flow disequilibrium factor is the ratio of the maximum value to the average value of the in-strength and out-strength. It ranges from 1 to 2. A larger value means a greater imbalance between the inbound and outbound flows of a node.
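As a quick check of the stated range, the factor can be computed directly from a node's in-strength and out-strength. The function below is a minimal sketch; its name and the choice to raise an error for a node without flow are assumptions.

```python
def tfdf(s_in, s_out):
    """Traffic flow disequilibrium factor: max of in/out strength over their mean."""
    total = s_in + s_out
    if total == 0:
        raise ValueError("TFDF is undefined for a node with no flow")
    return max(s_in, s_out) / (total / 2)
```

A perfectly balanced node (equal inbound and outbound flow) gives 1, and a node whose flow runs in one direction only gives 2.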

Transfers of Subway Flow
Average transfer times: In large subway networks, transferring from one line to another is common. Transfers increase travelers' travel time, thus reducing the transport efficiency of the subway. We used the average transfer times (ATT), i.e., the flow-weighted average number of transfers, to measure the transfer efficiency of subways in this study. It is defined as

$$\mathrm{ATT} = \frac{\sum_{i,j} w_{ij}\, tr_{ij}}{\sum_{i,j} w_{ij}}$$

where $w_{ij}$ is the number of passengers from node $i$ to node $j$, and $tr_{ij}$ is the minimum number of transfers between node $i$ and node $j$, which can be obtained from the space P representation of the network [72].
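In the space P representation, two stations are linked whenever at least one line serves both, so the minimum number of transfers between two stations is the shortest-path length in that graph minus one. The sketch below computes the flow-weighted average on that basis with a breadth-first search. The data layout (a dictionary of line name to station list, and a dictionary of OD pair to passenger count) and all function names are assumptions for illustration.

```python
from collections import deque
from itertools import combinations


def space_p_graph(lines):
    """Link every pair of stations that share a line; `lines` maps line -> stations."""
    adj = {}
    for stops in lines.values():
        for s in stops:
            adj.setdefault(s, set())
        for a, b in combinations(set(stops), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def min_transfers(adj, source):
    """Minimum transfers from `source` to each reachable station (BFS hops minus one)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return {v: max(d - 1, 0) for v, d in dist.items()}


def average_transfers(lines, od_flow):
    """Flow-weighted average number of transfers; `od_flow` maps (i, j) -> passengers."""
    adj = space_p_graph(lines)
    cache = {}
    weighted = total = 0
    for (i, j), w in od_flow.items():
        if i not in cache:
            cache[i] = min_transfers(adj, i)
        weighted += w * cache[i][j]  # KeyError here means j is unreachable from i
        total += w
    return weighted / total
```

With lines `L1 = A, B, C`, `L2 = C, D`, and `L3 = D, E`, a trip from A to B needs no transfer, A to D needs one, and A to E needs two.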

Data Description
The data were collected in Beijing, the capital city of China (see Figure 3). The total land area is 16,410 km², and the population is 21.53 million. Figure 3 shows the subway stations with red nodes. Automatic fare collection (AFC) data from the subway and global positioning system (GPS) data were collected from 23 July 2018 to 29 July 2018. The main study area was located in the region ranging from 115.8538° E to 117.1420° E in longitude and from 39.4894° N to 40.4667° N in latitude.

Table 1 shows the main information in the AFC data from the Beijing subway, which included the card number, entry time, origin station, exit time, and destination station. The data were cleaned before use. The cleaning process included an abnormal flow check, the removal of duplicate data and missing data, and outlier detection.
Table 2 shows an example of the taxi GPS data. The data included the ID, time, longitude, latitude, instantaneous speed, and status. The status refers to whether a taxi was occupied or vacant. A value of '0' means that the taxi was vacant, and a value of '1' means that the taxi was occupied. Each taxi sent a record approximately every 30 s. We extracted the origin and destination of each trip based on this information. These data were preprocessed before their use. Records with missing information were removed. Moreover, trips that lasted more than 3 h or less than 5 min were defined as outliers and removed.
In this study, we collected point-of-interest (POI) data using a Python program from Baidu Map (https://map.baidu.com/) (accessed on 1 July 2018). The POI data included 14 categories: "restaurant", "scenic spot", "shopping", "transport facilities", "finance and insurance", "education", "hotel", "health service", "company", "government", "residence", "life service", "public service", and "sport and recreation".
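The trip extraction step can be sketched as a scan over one taxi's time-ordered records: a trip starts at the first record with status 1 after a vacant period and ends at the last occupied record before the status returns to 0. The code below is an illustrative sketch rather than the authors' procedure; the record layout `(time, lon, lat, status)`, the function name, and the decision to drop a trip that is still open at the end of the records are assumptions. The duration bounds follow the text (5 min to 3 h).

```python
from datetime import datetime, timedelta

MIN_TRIP = timedelta(minutes=5)  # shorter trips are treated as outliers
MAX_TRIP = timedelta(hours=3)    # longer trips are treated as outliers


def extract_trips(records):
    """Extract occupied trips from one taxi's records, sorted by time.

    Each record is a tuple (time, lon, lat, status) with status 1 = occupied.
    """
    trips = []
    start = None  # first occupied record of the current trip
    prev = None   # previous record, used as the drop-off when status flips to 0
    for rec in records:
        status = rec[3]
        if status == 1 and start is None:
            start = rec  # pick-up
        elif status == 0 and start is not None:
            end = prev   # drop-off: last occupied record
            duration = end[0] - start[0]
            if MIN_TRIP <= duration <= MAX_TRIP:
                trips.append({
                    "origin": (start[1], start[2]),
                    "destination": (end[1], end[2]),
                    "start": start[0],
                    "end": end[0],
                })
            start = None
        prev = rec
    return trips
```

Records with missing fields are assumed to have been removed before this function is called.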

Temporal Characteristics of Travel Mobility
Figure 4 shows the distributions of the numbers of trips on the subway and those with competitive, cooperative, and complementary taxis. As can be seen, there were obvious morning peaks and evening peaks on the subway on weekdays. However, the taxi peaks happened during off-peak times, which indicated that passengers seldom chose taxis for commuting due to road congestion or higher costs. Moreover, one can observe that the numbers of passengers both on the subway and in taxis decreased on weekends. The proportions of competitive, cooperative, and complementary taxis were 9.3% (9.03%), 35.67% (34.1%), and 55.03% (56.88%), respectively, on weekdays (weekends). Notably, taxi trips outside of the subway service time were considered as complementary to the subway.


Spatial Characteristics of Travel Mobility
To detect the spatial characteristics of subway and taxi trips, we assessed the spatial OD flow distributions for subways and taxis, as shown in Figure 5. It was observed that both subway trips and taxi trips were concentrated in the central city, especially for competitive taxis. The proportion of competitive taxis was smaller than those of cooperative and complementary taxis. Moreover, cooperative taxis played a key role in the central city and urban-rural fringe. In addition, it was found that the proportion of complementary taxis was the largest, which meant that there was relative independence between the subway and taxis. We found that taxis are a significant complement to the subway in the areas that do not have subway service.

Distribution of Travel Distances
Figure 6 shows the travel distance distributions of the subway, taxis, competitive taxis, cooperative taxis, and complementary taxis on a Wednesday and a Sunday. The trip distances on the subway were concentrated between 5 km and 20 km, while the trip distances with taxis were concentrated between 1 km and 10 km. The median values of the subway and taxi trip distances were 12.85 km and 4.30 km, respectively. The median values of competitive, cooperative, and complementary taxi trips were 3.47 km, 4.77 km, and 4.16 km, respectively. The travel distances of competitive taxi trips were smaller than those of the other types of taxi trips. There was a small number of passengers using taxis rather than the subway when the travel distances were larger than 70 km. The reason for this is that passengers have to transfer when they travel long distances using the subway, which increases the travel time. Therefore, taxis can be a more attractive travel choice for such long trips.


Zone Entropy
Large values of zone entropy were distributed in the central area of Beijing, as shown in Figure 7a,b, which give the spatial distributions of entropy on a Wednesday and a Sunday, respectively. In the suburbs, the values of zone entropy were small. This was because the distributions of subway stations were sparse due to the low accessibility of the subway network. A small proportion of passengers could transfer to the subway using taxis. We also observed that there were more zones with large zone entropy values on weekends than on weekdays. The reason for this may be that passengers have more free time to travel, and they prefer to use mixed travel modes to reduce their travel costs. Figure 7c,d show the distributions of zone entropy values. It can be seen that most values were larger than 0.5, and there was a small proportion of zones with small entropy values, which indicates that the types of taxi trips were diverse in most areas.
Figure 8 shows the entropy values and taxi types for each hour on a Wednesday and a Sunday. It can be seen in Figure 8a that the entropy values in the peak hours in the early morning and evening were slightly smaller than those in other time periods. Specifically, one can see in Figure 8b that the proportion of complementary taxi trips increased during these two time periods. The proportions of competitive, cooperative, and complementary taxi trips were 9.1%, 35.6%, and 55.3%, respectively. The results indicate that complementary taxi trips played a dominant role, especially at night when the subway was closed. It can be seen in the figures that there were no obvious differences between Wednesdays and Sundays.

Network-Based Features
In order to analyze the OD network characteristics of the subway and taxis, we calculated the main values of complex network indicators, as shown in Table 3. We found that the average degrees of the subway OD network and taxi OD network were 229.13 and 123.76, respectively, on Wednesdays, which indicated that the subway OD network had closer connections than the taxi OD network did. It was shown that the community structure of the taxi OD network was more obvious than that of the subway network. The average degree (AD) and clustering coefficient (CC) of the competitive taxi network were smaller than those of the other taxi types because the proportion of competitive taxi trips was smaller.


Subway Transfer Flow
With the expansion of the subway network, more and more passengers need to transfer to reach their destinations. However, transfers cost more time. Some passengers dislike transfers and shift to other travel modes, such as taxis and private cars. In this section, we quantitatively analyze transfers in the subway system. Figure 9 shows the traffic flow distribution among different lines and within the same lines. Only 19.42% of the total passengers could travel without transfers. Transfers increase travel times, which decreases travel efficiency. In order to measure subway transfers, we calculated the transfer proportions in the subway network structure and in the passenger flow, as shown in Table 4. Firstly, we calculated the minimum number of transfers (transfer times) between all pairs of nodes in space P. Then, the proportions of each number of transfers were calculated. Finally, the transfer proportions of passenger flow were obtained based on the structural transfers. In terms of the network structure, a traveler can reach any station with a maximum of four transfers. In practice, more than 98% of travelers made at most two transfers, only 0.19736% made three transfers, and no travelers made four transfers. The average number of transfers in the subway flow was 1.084. Therefore, managers should improve the subway network to reduce transfers and enhance its competitiveness.
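The three steps described above can be sketched as follows, assuming the usual space P convention in which two stations are linked whenever they lie on a common line, so that the minimum number of transfers equals the shortest path length minus one. The line layout, station names, and flows are hypothetical, and the helper names are introduced only for illustration.

```python
from collections import deque
from itertools import combinations


def space_p(lines):
    """Space P graph: stations sharing a line are directly connected."""
    adj = {}
    for stations in lines.values():
        unique = set(stations)
        for s in unique:
            adj.setdefault(s, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def min_transfers(adj, source):
    """Minimum number of transfers from source to every reachable station."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search over the space P graph
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # One hop in space P is a ride on a single line, i.e. zero transfers.
    return {s: d - 1 for s, d in dist.items() if s != source}


def average_transfers(adj, od_flows):
    """Flow-weighted mean of the structural minimum transfers."""
    total_flow = sum(od_flows.values())
    weighted = sum(
        flow * min_transfers(adj, o)[d] for (o, d), flow in od_flows.items()
    )
    return weighted / total_flow


# Hypothetical network with three lines and two transfer stations (C and E).
toy_lines = {"L1": ["A", "B", "C"], "L2": ["C", "D", "E"], "L3": ["E", "F"]}
toy_adj = space_p(toy_lines)
transfers_from_a = min_transfers(toy_adj, "A")

# Hypothetical passenger flows between station pairs.
toy_flows = {("A", "B"): 50, ("A", "D"): 30, ("A", "F"): 20}
print(transfers_from_a, average_transfers(toy_adj, toy_flows))
```

The sketch assigns every passenger the structural minimum number of transfers, which matches the statement that the flow proportions were derived from structural transfers; actual route choices may involve more transfers.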

Unbalanced Traffic Flow
The trip occurrence and attraction of regions are unbalanced due to diversity in the origin-destination demand. Figure 10 shows the unbalanced distributions of the subway flow and taxi flow. It can be seen that unbalanced nodes appeared in suburban areas for both the subway and taxis. Moreover, the imbalance in the taxi flow was more serious than that in the subway flow from the perspective of the indicators. The largest value for the subway was 1.406, while the smallest value for taxis was 1.408.
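This section does not restate how the imbalance indicator is defined, so the following sketch uses a purely hypothetical measure: the ratio of the larger to the smaller of a zone's outgoing (occurrence) and incoming (attraction) flows, which equals 1 for a perfectly balanced zone. The function name and the toy OD flows are also assumptions, and the resulting values are not comparable with those reported above.

```python
def flow_imbalance(od_flows):
    """Per-zone ratio max(outflow, inflow) / min(outflow, inflow).

    This ratio is a hypothetical indicator chosen for illustration;
    it is not necessarily the indicator used in the paper.
    """
    out_flow, in_flow = {}, {}
    for (o, d), flow in od_flows.items():
        out_flow[o] = out_flow.get(o, 0) + flow
        in_flow[d] = in_flow.get(d, 0) + flow
    result = {}
    for zone in set(out_flow) | set(in_flow):
        lo, hi = sorted((out_flow.get(zone, 0), in_flow.get(zone, 0)))
        # A zone with flow in only one direction is treated as infinitely unbalanced.
        result[zone] = hi / lo if lo > 0 else float("inf")
    return result


# Hypothetical OD flows between three zones.
toy_od = {("A", "B"): 100, ("B", "A"): 100, ("A", "C"): 60, ("C", "A"): 20}
imbalance = flow_imbalance(toy_od)
print(imbalance)
```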

Correlations between Ridership and Socioeconomic Indicators
To better understand the relationship between ridership and socioeconomic indicators, we plotted a correlation heat map, as shown in Figure 11. As can be seen, the correlations between subway ridership and socioeconomic indicators were not strong. In contrast, some kinds of POIs, such as those related to finance and insurance, had clearly positive correlations with taxi ridership. Moreover, there were no evident correlations between population and the ridership of either the subway or taxis.
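The values behind a heat map of this kind could be obtained as in the sketch below, which assumes that the Pearson correlation coefficient is computed across zones between each ridership series and each indicator. The choice of Pearson correlation, the function names, and all of the numbers are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def correlation_table(ridership, indicators):
    """Correlation of every ridership series with every indicator."""
    return {
        (r_name, i_name): pearson(r_vals, i_vals)
        for r_name, r_vals in ridership.items()
        for i_name, i_vals in indicators.items()
    }


# Hypothetical per-zone values for four zones.
ridership = {"taxi": [10, 20, 30, 40]}
indicators = {"finance_poi": [1, 2, 3, 4], "population": [5, 3, 6, 4]}
table = correlation_table(ridership, indicators)
print(table)
```

Each entry of `table` corresponds to one cell of the heat map, with rows for ridership types and columns for indicators.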

Discussions
The travel mobility of urban residents plays a key role in transportation planning and management. Traditional studies have focused on single travel modes [1,25,26]. In reality, people can opt for many travel modes, such as subways, buses, and bikes. Previous studies have indicated that subways and taxis have different temporal distributions of ridership. Subway ridership shows a bimodal distribution with a morning peak and an evening peak [4]. However, there is no morning peak in taxi ridership [21]. People prefer to use the subway to commute and dislike using taxis due to the congested road conditions. At present, there is limited research on comparative analyses of multimodal transportation mobility. In this study, we propose a data-driven framework for analyzing the spatiotemporal characteristics of taxis and the subway. We divided taxi trips into competitive, cooperative, and complementary trips. It was found that the competition between the subway and taxis was only slight. However, there was a high level of cooperation between them. In addition, taxis are an important complement to the subway, especially in suburban areas.
The limitations of this study are as follows. Firstly, we only explored the spatiotemporal characteristics of subways and taxis. It was difficult to identify the determinants of travel behavior based on ridership data alone. In future studies, the use of a questionnaire can help in this regard. Secondly, this study only assessed two travel modes, but more travel modes (e.g., buses and bikes) should be studied together. Finally, we only used a Voronoi diagram to divide the study area. More types of traffic analysis zones should be considered.
In future studies, we intend to consider more travel modes to study mobility with multimodal transportation. Moreover, better traffic analysis zoning will be considered. In addition, more determinants will be considered.

Conclusions
Big data analysis provides new insights for understanding traffic demand. However, exploring traffic demand across different travel modes is difficult because it requires merging different types of data. In this study, a framework was proposed based on large amounts of subway AFC data and taxi GPS data to analyze traffic demand. Taxi trips were divided into three groups: competitive, cooperative, and complementary. Voronoi diagrams based on subway stations were introduced to divide the regions. An entropy index was adopted to measure the mix of taxi trips. Then, subway and taxi networks were constructed to analyze the traffic demand, where divided regions were considered as nodes, and trips between nodes were regarded as edges.
The results showed that there were two obvious peaks in the subway flow in the morning and afternoon, while taxi flow peaks were not evident. Moreover, subway trips and taxi trips had similar distance distributions but very different flow structures. It was found that the proportions of competitive, cooperative, and complementary taxis were 9.1%, 35.6%, and 55.3%, respectively. Furthermore, the entropy was large in the central city and small in the suburbs. Due to the fixed subway lines, more than 80% of subway passengers needed to transfer to other lines to reach their destinations. The average number of transfers on the subway was 1.084, and the maximum number of transfers was 3. Moreover, it was shown that the subway network was more closely connected than the taxi network. However, the imbalance in taxis was more serious than that in the subway.
The results indicated that there was less cooperation between the subway and taxis in suburban areas. Cooperation between different travel modes is very important when building a sustainable transportation system. For example, mobility as a service (MaaS) aims to integrate multimodal transportation into a system to reduce the use of private cars. This study suggests that managers should provide more transport facilities and policies to promote cooperation between different travel modes. This study can help in urban planning and traffic management; for example, managers should enhance the connectivity of the subway to reduce transfers. Moreover, the government should provide more traffic facilities in suburban areas to promote cooperation between the subway and taxis. In future

Figure 1. Illustration of the taxi type division.

$e_{ij} = 1$ if there is an edge between node $i$ and node $j$; otherwise, $e_{ij} = 0$. $W$ is the set of weights of edges

Figure 3. Map of the subway network and taxi pick-ups and drop-offs in Beijing.

Figure 5. Spatial OD flow distributions: (a-e) subway, taxis, competitive taxis, cooperative taxis, and complementary taxis on a Wednesday; (f-j) subway, taxis, competitive taxis, cooperative taxis, and complementary taxis on a Sunday.

Sustainability 2024, 18
Figure 6. Travel distance distributions for the subway, taxis, competitive taxis, cooperative taxis, and complementary taxis on a Wednesday and a Sunday.

Figure 7. Entropy distribution: (a) spatial distribution on a Wednesday; (b) spatial distribution on a Sunday; (c) value distribution on a Wednesday; (d) value distribution on a Sunday.

Figure 8. Entropy values during different hours of the day (a) and taxi type distribution on Wednesdays and Sundays (b,c).

Figure 10. Unbalanced distributions of (a) subway flow and (b) taxi flow.

Figure 11. Heat map of correlations between ridership and socioeconomic indicators.

Table 1. Sample of the AFC data from the Beijing subway.

Table 2. Examples of the taxi GPS data.

Table 3. Values of complex network indicators for the subway, taxis, competitive taxis (T-competition), cooperative taxis (T-cooperation), and complementary taxis (T-complement) on weekdays and weekends.

Table 4. Proportions of transfer time for the subway's structure and flow.
