Exploring the Factors of Intercity Ridesplitting Based on Observed and GIS Data: A Case Study in China

Abstract: Ridesplitting, a form of ridesourcing in which riders with similar origins and destinations are matched, is an effective mode of sustainable transportation. In recent years, ridesplitting has spread rapidly worldwide and plays an increasingly important role in intercity travel. However, intercity ridesplitting has rarely been studied. In this paper, we use observed intercity ridesplitting data between Yinchuan and Shizuishan in China and built environment data based on a geographic information system (GIS) to analyse temporal, spatial and other characteristics. Then, we divide the study area into grids and explore the contributing factors that affect the intercity ridesplitting matching success rate. Based on these significant factors, we develop a binary logistic regression (BLR) model and predict the intercity ridesplitting matching success rate. The results indicate that morning peak, evening peak, weekends and weekdays, precipitation and snowfall, population density, some types of points of interest (POI), travel time and the advance appointment time are significant factors. In addition, the prediction accuracy of the model is more than 78%, which shows that the factors studied in this paper have good explanatory power. The results of this study can help in understanding the characteristics of intercity ridesplitting and provide a reference for improving the intercity ridesplitting matching success rate.


Introduction
In recent years, owing to the popularity of smart phones, ridesourcing services have developed rapidly. For example, Didi, Uber, Lyft and other ridesourcing companies are gradually rising in popularity [1]. Additionally, in China, ridesourcing has become one of the main ways people travel in their everyday lives [2]. In the past several years, DiDi Chuxing has served more than 450 million users with a full range of mobility services across 400 cities in China, including Express, Hitch, Taxi and so on [3]. However, a large number of ridesourcing trips may also aggravate traffic congestion and result in more pollution. Therefore, as a shared mobility service, ridesplitting has gradually attracted people's attention.
Ridesplitting refers to a form of ridesourcing in which riders with similar origin-destination (OD) points are matched to the same ridesourcing driver and vehicle in real time, and the ride and costs are split among the users [4]. Ridesplitting can cut the costs incurred by passengers, reduce the number of cars required for each trip, and increase the number of passengers per vehicle. Studies have also shown that ridesplitting is an important means of alleviating traffic congestion, reducing traffic-based carbon emissions and improving environmental benefits [5,6]. There is no doubt that ridesplitting is an effective mode of sustainable transportation.
Currently, some ridesourcing companies are committed to increasing the proportion of ridesplitting services. Take Didi as an example: by the end of 2019, their cumulative number of users reached 2.9 billion, with a compound annual growth rate of 143.3%. The volume of passengers using ridesplitting services was equivalent to 1.2 times the civil aviation passenger volume in 2019 [7].
Intercity ridesplitting is a kind of ridesplitting in which the origin and destination are in two different cities. It is a promising solution to intercity road congestion [8] and can effectively avoid the waste of empty vehicles, which is an important function of a sustainable transportation system. Compared with intracity ridesplitting, intercity ridesplitting has a longer shared distance, meaning that intercity ridesplitting can save passengers more money and that it has greater environmental value. In addition, the matching process for intercity ridesplitting is simpler than that for short-distance ridesplitting because the scheduling constraints are lower [9]. Furthermore, intercity ridesplitting, which serves as a supplement to public transport [10], is more flexible in terms of the departure time and location than intercity public transport, such as high-speed rail trains and intercity coaches. Therefore, intercity ridesplitting is very important and meaningful.
At present, some scholars have researched the characteristics and influencing factors of intracity ridesplitting. Wang et al. [11] and Li et al. [12] analysed the characteristics of ridesplitting. Chen et al. [13] quantified the impact of ridesplitting on multi-modal urban mobility, and Chen et al. [14] presented an ensemble learning approach to predicting users' ridesplitting choices. Xu et al. [6] and Tu et al. [15] explored the influencing factors affecting ridesplitting. However, current research on intercity ridesplitting is scarce. On the one hand, there is a lack of data, and the related publicly available data sets consist almost exclusively of intracity data. On the other hand, in the past, there was less travel between cities in China, but with rapid economic development, especially the emergence of urban agglomerations, intercity travel has become normalized [16]. Therefore, it is important to study intercity ridesplitting.
For ridesourcing companies and urban traffic managers, forecasting the intercity ridesplitting matching success rate and exploring the contributing factors are important. Ridesourcing companies can formulate corresponding dispatching strategies based on the prediction results, which can greatly improve efficiency and service quality. In addition, the factors that contribute to the matching success rate have great reference value for urban traffic managers to formulate policies related to intercity traffic in urban agglomerations.
In this paper, we focus on analysing the characteristics of intercity ridesplitting and exploring the contributing factors that affect the intercity ridesplitting matching success rate. We use the observed intercity ridesplitting data between Yinchuan and Shizuishan, China, and local built environment data based on a geographic information system (GIS). Yinchuan is the provincial capital and central city of Ningxia, while Shizuishan is a city in Ningxia that is strongly influenced, politically and economically, by Yinchuan. Our main contributions are as follows. First, we analyse in depth the temporal, spatial and other characteristics of intercity ridesplitting. Second, we use GIS tools to divide the study area into grids and explore the influence of potential factors on the intercity ridesplitting matching success rate. Third, we develop a binary logistic regression (BLR) model and predict the intercity ridesplitting matching success rate based on these significant factors.
The remainder of this paper is organized as follows. Section 2 provides a review of the existing literature related to this study. Section 3 describes the dataset, analyses the characteristics of successful and failed intercity ridesplitting, and discusses the possible influence of different factors on the matching success rate. Section 4 presents the methodology and variables used in this study. Section 5 discusses the results in detail and provides some suggestions to improve the intercity ridesplitting matching success rate. The final section summarizes the main findings and limitations of this study and proposes future research directions.

Existing Studies on Ridesourcing
The existing literature mainly focuses on exploring the travel characteristics and influencing factors of ridesourcing, which can provide targeted suggestions for policy makers [16,17]. In terms of travel characteristics, using point of interest (POI) data and DiDi Chuxing data, Bi and Ye [1] analysed the ridesourcing travel characteristics of areas with different spatial attributes and summarized the rules of the daily trips taken by people in Chengdu. Based on the results of user surveys, Rayle et al. [18] and Zhen et al. [19] compared the characteristics of ridesourcing and taxis and found that ridesourcing had shorter waiting times and travel times as well as younger users. Their results showed that social and recreational trips were the predominant types of trips for which ridesourcing was used. Lavieri et al. [20] utilized ridesourcing data and found that different income segments in the population may use ridesourcing for different activity purposes. Wang et al. [21] analysed customers' order cancellation behaviour and observed that the mean ride distance and pick-up distance of cancelled orders were obviously longer than those of completed orders during the same time period.
In research on the factors affecting ridesourcing, the influencing factors of demand are mainly studied. Sun and Ding [2], Yu and Peng et al. [22] and Yan et al. [23] explored the relationship between ridesourcing demand and different influencing factors of the built environment. The results showed that land use, infrastructure traffic accessibility, road density and other factors had a great impact on ridesourcing demand. Ghaffar et al. [24] and Dias et al. [25] studied the influence of various exogenous socio-economic and demographic variables on ridesourcing demand. The results indicated that ridesourcing users tended to be well-educated, higher-income, working individuals residing in higher-density areas. Bao et al. [26] utilized Global Positioning System (GPS) trajectory data to compare the influencing factors of ridesourcing services and taxi services. They found that ridesourcing services mainly catered to young people, while regular taxi services were more likely to be accepted by middle-aged people. In addition, the number of parking spaces was positively correlated with ridesourcing. Some studies have also researched influencing factors other than demand. Yang et al. [27] studied the influencing factors of the ridesourcing waiting time and found that the waiting time was positively related to trip-level characteristics and negatively related to road density and transit stop density.
In addition, a few studies have analysed the impact of carpooling on other travel modes. Kong et al. [9], Hoffmann et al. [28], Henao [29] and Feigon and Murphy [30] explored the impact of ridesourcing on public transportation. They found that ridesourcing had a higher substitution effect on public transport in downtown areas, while a complementary effect appeared more in suburban areas with poor public transport coverage. Furthermore, in the case of subway service interruptions, the use of ridesourcing increases by more than 30%. Nie [31] examined the impact of ridesourcing on the taxi industry by using taxi GPS trajectory data, and the evidence showed that shocks were relatively short.

Existing Studies on Ridesplitting
Although the literature on ridesourcing is relatively abundant, there are few studies on ridesplitting. In particular, ridesharing and ridesplitting are both shared trips, and their meanings are similar. Ridesharing refers to a shared ride where passengers and drivers have similar origins and destinations [32][33][34], but in ridesplitting, two or more passengers have similar origins and destinations. Some studies do not make a detailed distinction between them, but in this paper, ridesplitting does not include ridesharing. In terms of ridesplitting research, some studies are based on user surveys. Wang et al. [10] investigated young users and analysed the characteristics of their ridesplitting behaviour and its impacts on an emerging ridesourcing platform. Chen et al. [12] analysed ridesourcing behaviour based on survey data and quantified the impact of ridesplitting on multi-modal urban mobility. The results obtained based on surveys are helpful in understanding carpooling services, but actual observational data are still needed to objectively describe ridesplitting behaviour and to explore its rules.
Since ridesplitting companies seldom disclose their actual data, only a few studies have used actual observational data to research ridesplitting behaviour in recent years. Based on real trip data provided by the Hangzhou Didi platform, an ensemble learning approach was presented to predict users' ridesplitting choices, and the ReliefF algorithm was used to rank and select a variety of features that may impact ridesplitting behaviour [13]. Additionally, the studies by Li et al. [11], Tu et al. [14] and Tu et al. [35] were all based on Chengdu ridesourcing data provided by DiDi Chuxing. Li et al. [11] developed a ridesplitting trip identification algorithm and studied the characteristics of ridesplitting trips in Chengdu. The results showed that shared rides and single rides had different spatiotemporal patterns and travel characteristics. Meanwhile, most ridesplitting trips consisted of two shared rides that may mainly serve non-commuting trips. Tu et al. [14] used a gradient boosting decision tree (GBDT) model to explore the nonlinear effects of the built environment on ridesplitting origins and destinations. Tu et al. [35] proposed a ridesplitting trip identification algorithm based on a shareability network to quantify the potential of ridesplitting. The results showed that the potential cost savings of ridesplitting can reach 18.47%, while the actual cost savings are only 1.22%, which suggests that ridesplitting still has great untapped potential. Based on ridesplitting data from Chicago, Xu et al. [6] studied spatial changes in ridesplitting and found large variations in ridesplitting adoption rates across neighbourhoods. Furthermore, they analysed the factors affecting these changes and suggested that socio-economic and demographic variables are the main influencing factors.
However, there is almost no research on intercity ridesplitting. We know of one study related to long-distance ridesplitting services. Zhu et al. [36] built a multi-modal commute model and analysed the impact of long-distance ridesplitting services on public transit. The results indicate that a considerable number of public transit users will switch to long-distance ridesplitting when the fare rate is low.
In summary, there are few studies on the characteristics and influencing factors of intercity ridesplitting. This situation leads to unclear answers regarding the characteristics of intercity ridesplitting and the factors affecting the intercity ridesplitting matching success rate.

Data Source
The data used in this paper include observed intercity ridesplitting data and built environment data based on GIS. The observed intercity ridesplitting data are from Shizuishan Chuxing, an intercity ridesplitting platform that provides intercity travel services to residents of Shizuishan and has been officially operating since 2018. The data set contains data on intercity ridesplitting rides between Shizuishan and the provincial capital, Yinchuan. The study area is shown in Figure 1. Shizuishan is a city in the Ningxia Autonomous Region, China, while Yinchuan is the capital of the province and one of the important central cities in Northwest China. Yinchuan also has the only international airport in the Ningxia Autonomous Region. Shizuishan and Yinchuan are closely connected, and Shizuishan is strongly influenced by Yinchuan in political, economic and traffic terms, with a large number of residents taking trips between the two cities every day. The data set includes 212,246 intercity ridesplitting rides from 1 January 2020 to 31 December 2020. The main fields are the order ID, driver ID, longitude, latitude, price, departure time, arrival time, creation time and weather (some sample data are shown in Table 1).
The built environment data, which comprise POI data, bus station density and population density, were measured and calculated in detail using a GIS tool [37]. The POI data in this article are from Baidu Map [38] and include six types of POIs: residential land, administration and public service land, commercial and business facility land, industrial land, street and transportation land, and green space and square land.

Scale Analysis of the Intercity Ridesplitting Data
Although the data sets are all from an intercity ridesplitting platform, considering that intercity ridesplitting may fail to match users and drivers and, as a result, ridesplitting rides may turn into single rides, this article first processes the data sets based on a ridesplitting trip identification algorithm [11]. To distinguish trips and rides, we define trips from the driver perspective and rides from the passenger perspective. Therefore, intercity single rides are defined as matching failed intercity ridesplitting in which passengers have no partners to share the ride with and bear the cost alone, and intercity ridesplitting rides are defined as matching successful intercity ridesplitting in which passengers share the ride and cost with other passengers. In addition, intercity single trips are defined as when a driver has only one order, which is equivalent to an intercity single ride, and intercity ridesplitting trips are defined as when a driver has multiple orders that are composed of two or more intercity ridesplitting rides.
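The ride and trip definitions above can be illustrated with a minimal sketch. It assumes that every order already carries an identifier linking it to one driver journey; the field name `trip_id` and the sample records are hypothetical. It is not the identification algorithm of [11], which has to infer shared trips from the observed data.

```python
from collections import defaultdict


def classify_orders(orders):
    """Label each order as a ridesplitting ride or a single ride.

    `orders` is a list of (order_id, trip_id) pairs. `trip_id` is a
    hypothetical key for one driver journey; the real data set would
    need an identification step to derive it.
    """
    orders_by_trip = defaultdict(list)
    for order_id, trip_id in orders:
        orders_by_trip[trip_id].append(order_id)

    labels = {}
    for order_ids in orders_by_trip.values():
        # Two or more orders on one trip: a ridesplitting trip.
        # Exactly one order: a single trip, i.e. a single ride.
        kind = "ridesplitting" if len(order_ids) >= 2 else "single"
        for order_id in order_ids:
            labels[order_id] = kind
    return labels


# Hypothetical sample: trip T1 carries three orders, trip T2 carries one.
sample = [("o1", "T1"), ("o2", "T1"), ("o3", "T1"), ("o4", "T2")]
labels = classify_orders(sample)
```

Under these assumptions, the three orders on trip T1 are labelled as ridesplitting rides and the lone order on trip T2 as a single ride.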
The results show that in this data set, the intercity ridesplitting matching success rate is as high as 89.88%. Among the 212,244 rides, a total of 190,769 intercity ridesplitting rides are identified, constituting 62,207 intercity ridesplitting trips, with an average of 3.07 passengers per trip. Compared with most intracity ridesplitting trips, which have two passengers [11], the number of passengers in intercity ridesplitting is significantly higher, which warrants the attention of intercity ridesplitting companies. Intercity ridesplitting may need larger cars to better meet the demand.
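As a quick check, the reported rates can be reproduced from the counts given in this section. The sketch below uses only figures stated in the text and assumes that each ride corresponds to one passenger, which the text implies but does not state.

```python
total_rides = 212_244          # all rides in the processed data set
ridesplitting_rides = 190_769  # rides successfully matched with other passengers
single_rides = 21_475          # rides for which matching failed
ridesplitting_trips = 62_207   # driver trips composed of two or more rides

# The two ride categories partition the processed data set.
assert ridesplitting_rides + single_rides == total_rides

success_rate = ridesplitting_rides / total_rides
passengers_per_trip = ridesplitting_rides / ridesplitting_trips

print(f"matching success rate: {success_rate:.2%}")
print(f"average passengers per ridesplitting trip: {passengers_per_trip:.2f}")
```

The two printed values agree with the 89.88% success rate and the 3.07 passengers per trip reported above.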
The number of intercity single rides is 21,475, and there may be two reasons for the matching failure of intercity ridesplitting. First, the platform cannot match other ridesplitting passengers in the same area and the same time period. Second, other ridesplitting passengers temporarily cancel their orders, but the platform is unable to match new passengers in time. When matching fails, passengers can choose to continue to wait for a new ridesplitting partner. However, the data show that 94.88% of the passengers in intercity ridesplitting rides have a waiting time of less than 30 min, and only 5.12% of passengers wait more than 30 min, but all passengers have a waiting time of less than 60 min. This finding shows that the patience of passengers is limited and that passengers without successful matching may try to wait for other new passengers. However, they wait until their patience is exhausted, which eventually leads to the production of single rides.
In the next part of this section, we conduct a more in-depth analysis of the characteristics of intercity ridesplitting. By comparing the data on the successful matching and failed matching of intercity ridesplitting, we explore various potential factors that affect the intercity ridesplitting matching success rate, including temporal and spatial factors, the travel time and weather factors.

Temporal Factors Analysis
We compared the departure time distributions of ridesplitting rides and single rides. Limited by the regulations of the platform, the departure time of all orders is between 6:00 and 21:00, as shown in Figure 2. We see that the distributions of departure times for ridesplitting rides and single rides have different characteristics. First, on Monday mornings, ridesplitting rides have an unusual morning peak. A large number of passengers travel from one city to another through ridesplitting from 6:00 to 7:00 on Monday mornings, which may indicate that these passengers live and work in different cities. Therefore, they need to use ridesplitting to go to work from their city of residence on Monday mornings, and they choose the earliest ridesplitting departure time to arrive before work.
Second, on Friday and Sunday nights, intercity ridesplitting passengers also have a concentrated distribution. There is an evening peak from 16:00 to 18:00 on Fridays, and there is also a peak from 15:00 to 17:00 on Sundays. The reason may be that residents who get off work in the city where they work on Friday use ridesplitting to return to their city of residence and return on Sunday night. It may also be that residents use their weekend time to travel to another city, leave after work on Friday night and return on Sunday night.
Third, regarding Saturdays, although the trips are not very concentrated, there are obvious morning peaks (9:00-10:00) and evening peaks (16:00-17:00). At the same time, the number of rides per hour on Saturdays is greater than the average number of rides. The reason may be that passengers choose to travel to another city on Saturday, or some passengers for whom it is too late to return to their city of residence on Friday choose to return on Saturday.
Overall, except for the morning peak on Mondays, there are generally two intercity ridesplitting peaks, namely, the morning peak from 9:00 to 10:00 and the evening peak from 16:00 to 17:00. Therefore, ridesplitting companies can arrange more drivers during rush hours, especially Monday mornings and Friday and Sunday evenings, to meet the intercity ridesplitting needs of passengers.
In the distribution of departure times for single rides, there is no obvious difference from Mondays to Sundays. Among them, the morning peak from 9:00 to 10:00 is very obvious, which is consistent with ridesplitting. However, the distribution for single rides from 14:00 to 20:00 is uniform, and there is no obvious evening rush hour, which is different from ridesplitting.
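The comparison above rests on counting rides per weekday and departure hour. A minimal sketch of that tabulation follows; the function names and the sample timestamps are hypothetical, and the timestamp format is an assumption about how the departure time field is stored.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def hourly_profile(departures):
    """Count rides per (weekday name, departure hour).

    `departures` is an iterable of ISO-format timestamps (assumed format).
    """
    counts = Counter()
    for stamp in departures:
        moment = datetime.fromisoformat(stamp)
        counts[(WEEKDAYS[moment.weekday()], moment.hour)] += 1
    return counts


def peak_hour(counts, weekday):
    """Return the busiest departure hour for one weekday, or None if no rides."""
    hours = {hour: n for (day, hour), n in counts.items() if day == weekday}
    return max(hours, key=hours.get) if hours else None


# Hypothetical timestamps; 2020-01-06 was a Monday and 2020-01-10 a Friday.
sample = [
    "2020-01-06 06:10:00", "2020-01-06 06:40:00", "2020-01-06 09:15:00",
    "2020-01-10 16:05:00", "2020-01-10 17:30:00", "2020-01-10 17:45:00",
]
profile = hourly_profile(sample)
monday_peak = peak_hour(profile, "Monday")
friday_peak = peak_hour(profile, "Friday")
```

Running the same tabulation separately on ridesplitting rides and single rides would give the two sets of distributions compared in Figure 2.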
Considering the central position of Yinchuan, passengers departing from the two cities are likely to have different distribution patterns. To further explore this issue, we compared the distributions of departure times from Yinchuan and Shizuishan, as shown in Figure 3.
Comparing ridesplitting rides between Figure 3a,b, we find that Yinchuan is more attractive than Shizuishan. The distributions in the two cities have an obvious morning peak from 6:00 to 7:00 on Monday, but the number of passengers from Shizuishan is far higher than that from Yinchuan, almost double. This result may indicate that residents prefer to work in Yinchuan but settle in another city, even if it will considerably increase their commuting time and transportation costs. This finding is also confirmed by the characteristics of the two cities on Fridays and Sundays. The number of passengers returning from Yinchuan to Shizuishan on Friday nights is more concentrated, and more passengers leave Shizuishan for Yinchuan on Sunday nights. However, at the same time, many passengers depart from the opposite direction on Monday mornings, Friday nights and Sunday nights. The reason may be that there are also many residents living in Yinchuan but working in Shizuishan. However, in comparison, Shizuishan is not as attractive as Yinchuan. Therefore, intercity ridesplitting companies can arrange more drivers for one city at different times based on the attractiveness of the two cities. Figure 3c,d show that intercity single rides also have morning and evening peaks, but the characteristics in the two cities are significantly different. Passengers who failed to engage in ridesplitting from Yinchuan were mainly concentrated in the evenings. In contrast, Shizuishan has a higher failure rate in mornings. This may be because these residents temporarily depart from Shizuishan in the morning and return in the evening, making it difficult to match other passengers with similar ODs. Considering the time required for intercity travel, these rides may be mainly non-commuter rides.

Spatial Factors Analysis
We measure the spatial location of the intercity ridesplitting data and the local built environment data in detail based on GIS. To explore the potential impact of different spatial factors on the intercity ridesplitting matching success rate, we use ArcGIS to calculate the distributions of population density and bus stop density shown in Figure 4, two measures that are widely used to analyse the impact of geographical factors on traffic [24]. Population density refers to the number of people per square kilometer, and bus stop density is defined as the number of bus stops per square kilometer. In Figure 4, the shades of color represent the population density and bus stop density, while Pingluo and Yongning Counties are painted in grey because they are not included in the data set. The size of each ring chart represents the intercity ridesplitting demand, and red indicates the proportion of single rides.
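The two density measures defined above are simple ratios of a count to an area. The sketch below shows the calculation for made-up zones; the zone names, counts and areas are hypothetical, and it does not reproduce the ArcGIS workflow used in the study.

```python
def density_per_km2(count, area_km2):
    """Return the number of items (people or bus stops) per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return count / area_km2


# Hypothetical zones: (population, number of bus stops, area in km^2).
zones = {
    "zone_a": (120_000, 90, 60.0),
    "zone_b": (30_000, 12, 150.0),
}

densities = {
    name: {
        "population_density": density_per_km2(population, area),
        "bus_stop_density": density_per_km2(stops, area),
    }
    for name, (population, stops, area) in zones.items()
}
```

With these invented inputs, `zone_a` is denser than `zone_b` on both measures, which is the kind of contrast that the colour shading in Figure 4 encodes.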
First, Figure 4 shows that intercity ridesplitting demand is mainly concentrated in areas with high population density and bus stop density. In Yinchuan, Jinfeng and Xingqing have the highest demand for intercity ridesplitting. As the centre of Yinchuan, these two districts also have the highest density of population and bus stops. They are followed by Xixia and Lingwu, and Helan has the least demand. In Shizuishan, most of the intercity ridesplitting demand comes from Dawukou, which has a higher density, and only a small part of the total demand comes from Huinong.
Second, in Yinchuan, the higher the population density and bus stop density are, the lower the proportion of single rides. Jinfeng, Xingqing and Xixia, which have the highest population density and bus stop density, have the lowest percentages of single rides, at 8.19%, 9.96% and 7.13%, respectively. The proportion of single rides in Helan, which has a lower population density and bus stop density, is 11.23%, while the proportion of single rides in Lingwu, which has the lowest population density and bus stop density, is the highest, reaching 24.59%. A likely explanation is that higher population and bus stop densities go with a denser distribution of intercity ridesplitting demand, which makes it easier for residents to be matched with other passengers with similar OD points.
However, in Shizuishan, this rule does not hold. Dawukou, which has a higher population density and bus stop density, also has a higher proportion of single rides than Huinong. This higher proportion is probably caused by the special travel purpose of passengers in Huinong. In the data set, all passengers departing from Huinong have the international airport in Lingwu as their destination. This consistent travel purpose greatly improves the intercity ridesplitting matching success rate. Although the population density and bus stop density in Dawukou are higher, compared with Huinong, the spatial distribution of passenger destinations is more scattered, resulting in a lower probability of passengers matching with other passengers.
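The district-level comparison above rests on two simple quantities: a density (count per square kilometer) and the share of trips that end as single rides. The following is a minimal sketch of how they could be tabulated, assuming each trip record carries an origin district and a matched flag. The record layout, the function names and the toy values are illustrative assumptions and are not taken from the data set.

```python
from collections import defaultdict

def density(count, area_km2):
    """Density as defined in the text: number of items per square kilometer."""
    return count / area_km2

def single_ride_share(records):
    """Return {district: share of trips that were single rides (matching failed)}.

    records: iterable of (district, matched) pairs, where matched is True
    when the trip was successfully matched with other passengers.
    """
    totals = defaultdict(int)
    singles = defaultdict(int)
    for district, matched in records:
        totals[district] += 1
        if not matched:
            singles[district] += 1
    return {d: singles[d] / totals[d] for d in totals}

# Toy records for illustration only (hypothetical, not observed data).
toy_trips = [
    ("Jinfeng", True), ("Jinfeng", True), ("Jinfeng", False),
    ("Lingwu", False), ("Lingwu", True),
]
shares = single_ride_share(toy_trips)
```

Comparing `shares` with the density of each district gives the kind of side-by-side view that Figure 4 presents graphically.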
In addition to population density and bus stop density, we also identify six types of POIs from the GIS database and use ArcGIS to present the spatial distribution of POIs in Figure 5 and the spatial distribution of intercity ridesplitting in Figure 6. For every POI type, density refers to the number of POIs per square kilometer.
Comparing Figures 5 and 6, we find that, except for green space and square land, the spatial distributions of the other five types of POIs coincide with the spatial distribution of intercity ridesplitting demand to varying degrees. Regarding the distribution of the six types of POIs, administration and public service land and commercial and business facility land are more dispersed, residential land and street and transportation land are very close to each other, industrial land is more concentrated, and green space and square land is distributed in blocks. The coincidence between the distribution of POIs and that of intercity ridesplitting demand indicates that POI factors may have a certain impact on intercity ridesplitting demand. Some scholars have found that there is an obvious positive relationship between ridesourcing demand and the number of restaurants [23]. However, the specific impact of these built environmental factors on the intercity ridesplitting matching success rate is not yet known. Thus, we still need to use quantitative methods to explore the relationship between them.

Travel Factors Analysis
The travel time is obtained as the difference between the departure time and the arrival time. We present the distribution of intercity ridesplitting travel times and the proportion of intercity ridesplitting matching failures in Figure 7.
The results show the following two trends. First, the shorter the travel time is, the higher the intercity ridesplitting matching failure rate. The distribution of intercity ridesplitting travel times is close to a normal distribution, and most travel times are concentrated in the range of 90 min-160 min. Based on the relationship between the probability density of travel time and the intercity ridesplitting matching failure rate, we divide Figure 7 into three parts. In part 1, the travel time is less than 100 min, and the intercity ridesplitting matching failure rate is very high, above 10%. In less than 10 min, the failure rate is as high as 25.9%. In part 2, the travel time is between 100 min and 170 min, and the failure rate remains at a relatively low level, below 10%. In part 3, the travel time is more than 180 min, and the failure rate hovers between 10% and 12%. The failure rate in part 1 is significantly higher than that in part 2 and part 3. A possible explanation is that when the travel time is longer, passengers travel a longer distance, so the trip is more likely to cover other passengers' origins and destinations. Therefore, the probability of matching with other passengers is greater.
Second, when the demand for a certain travel time is greater, the intercity ridesplitting matching success rate is higher. In Figure 7, most of the intercity ridesplitting demand is distributed in part 2, where the intercity ridesplitting matching failure rate is also the lowest. Because the travel time in part 3 is longer than that in part 2, part 3 would be expected to have a higher matching success rate; however, the demand for intercity ridesplitting in this part is lower, which in turn leads to an increase in the matching failure rate. Therefore, for the two groups in part 1 and part 3 with less demand, ridesplitting companies can set up reservation services to help passengers match with ridesplitting partners.
In summary, when the travel time is longer and the demand for that travel time is greater, the intercity ridesplitting matching success rate is higher.
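The three-part comparison above amounts to binning trips by travel time and computing the matching failure rate in each bin. The sketch below illustrates that calculation under stated assumptions: the cut points (100 min and 170 min) are taken loosely from the part boundaries in the text, and the function name, record layout and toy values are hypothetical.

```python
def failure_rate_by_part(trips, cuts=(100, 170)):
    """Matching failure rate per travel-time part.

    trips: iterable of (travel_time_min, matched) pairs.
    cuts:  ascending cut points in minutes (assumed values); a trip falls in
           part k when its travel time is at or above exactly k of the cuts.
    Returns a list with one failure rate per part, or None for an empty part.
    """
    counts = [[0, 0] for _ in range(len(cuts) + 1)]  # [failures, total]
    for minutes, matched in trips:
        part = sum(minutes >= c for c in cuts)
        counts[part][1] += 1
        if not matched:
            counts[part][0] += 1
    return [fails / total if total else None for fails, total in counts]

# Toy trips for illustration only (hypothetical, not observed data).
toy_trips = [(50, False), (80, True), (120, True),
             (150, True), (200, False), (190, True)]
rates = failure_rate_by_part(toy_trips)
```

Passing different `cuts` makes it easy to check how sensitive the part-level failure rates are to where the boundaries are drawn.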
The advance appointment time is obtained as the difference between the creation time and the departure time. We show the distribution of advance appointment times in Figures 8 and 9. The results indicate that passengers mostly book intercity ridesplitting shortly before departure, which may be because most trips are instant trips. The matching failure rate tends to increase as the advance appointment time becomes longer, which may be because passengers in active ridesplitting areas have a high matching success rate; thus, they seldom choose to make long-term appointments in advance.
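Both travel variables are plain timestamp differences. As a minimal sketch, assuming each order record stores its creation, departure and arrival times as text in a `YYYY-MM-DD HH:MM` format (the field names and the format are assumptions, not the actual schema of the data set), they could be derived as follows.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format

def minutes_between(start, end):
    """Elapsed minutes from start to end, both given as FMT strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60

# Hypothetical order record for illustration only.
order = {
    "created": "2020-06-01 07:30",
    "departure": "2020-06-01 09:00",
    "arrival": "2020-06-01 11:10",
}

# Travel time: arrival minus departure, in minutes.
travel_time_min = minutes_between(order["departure"], order["arrival"])
# Advance appointment time: departure minus creation, in hours.
advance_h = minutes_between(order["created"], order["departure"]) / 60
```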

Figure 9. Changes in the matching failure rate for different advance appointment times.

Weather Factors Analysis
We compare the intercity ridesplitting matching failure rate under different weather conditions in Figure 10. In Figure 10, the intercity ridesplitting matching success rate under precipitation and snowfall increases slightly. In rainy and snowy weather, in both Yinchuan and Shizuishan, the intercity ridesplitting matching failure rate is lower, and in the absence of rain and snow, the matching failure rate is greater. However, the increase in the matching success rate in rain and snow is limited. We therefore cannot conclude that rainy and snowy weather has a positive effect on the matching success rate; a quantitative method is still needed to determine the relationship between them.

Binary Logistic Regression
BLR is a commonly used statistical method [39]. It can explore the relationship between each independent variable and the dependent variable through regression analysis and predict the dependent variable. The dependent variable, Y, is a binary variable; usually, when an event occurs, Y = 1, and when an event does not occur, Y = 0. Suppose that there are n independent variables, $x_i$ (i = 1, 2, ..., n). The BLR model is as follows:

$$\ln\frac{P(Y=1)}{1-P(Y=1)} = \beta_0 + \sum_{i=1}^{n}\beta_i x_i$$

where $\beta_0$ is a constant and $\beta_i$ (i = 1, 2, ..., n) is the coefficient of each independent variable, which reflects the influence of each independent variable on the dependent variable. When $\beta_i > 0$ (equivalently, when the odds ratio $e^{\beta_i} > 1$), the independent variable has a promoting effect on the occurrence of the dependent variable event. When $\beta_i < 0$ ($e^{\beta_i} < 1$), it has an inhibitory effect. $\beta_i = 0$ ($e^{\beta_i} = 1$) means that the independent variable has no effect on the occurrence of the event.
Since there are obvious differences in intercity ridesplitting characteristics between the two directions, we use BLR to establish a Yinchuan model and a Shizuishan model to explore the factors influencing the intercity ridesplitting matching success rate in the two cities.
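To make the role of the coefficients concrete, the following minimal sketch evaluates a BLR model of the standard logistic form for a single trip and converts coefficients into odds ratios. The coefficient values and the function names are hypothetical and chosen only for illustration; they are not the fitted values of the Yinchuan or Shizuishan model.

```python
import math

def blr_probability(x, beta0, betas):
    """P(Y = 1) for one observation under a binary logistic regression model.

    x:     values of the independent variables
    beta0: constant term
    betas: one coefficient per independent variable
    """
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratios(betas):
    """exp(beta_i): values above 1 promote the event, values below 1 inhibit it."""
    return [math.exp(b) for b in betas]

# Hypothetical coefficients: one promoting and one inhibiting variable.
beta0, betas = -0.5, [0.8, -0.3]
p_base = blr_probability([0, 0], beta0, betas)     # both variables switched off
p_promote = blr_probability([1, 0], beta0, betas)  # first variable switched on
p_inhibit = blr_probability([0, 1], beta0, betas)  # second variable switched on
```

Switching on a variable with a positive coefficient raises the predicted matching probability relative to the baseline, and switching on one with a negative coefficient lowers it.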

Variable Descriptions
We explore the factors that affect the intercity ridesplitting matching success rate. Therefore, we use the matching success or matching failure of each intercity ridesplitting trip as the dependent variable of the BLR. The dependent variable is a binary variable that takes the value of 0 in the case of matching failure, which is a single ride, and takes the value of 1 in the case of matching success, which is a ridesplitting ride.
The independent variables in this paper include temporal variables, spatial variables, travel variables and weather variables. In terms of spatial attributes, we find that population density and bus stop density may have a positive effect on the intercity ridesplitting matching success rate, and the distribution of POIs also overlaps with the distribution of intercity ridesplitting to a certain extent. To explore the impact of these built environmental factors on the intercity ridesplitting matching success rate, we first divide the study area into 500 m × 500 m grids based on GIS, as shown in Figure 11. Then, we count the built environment factors in the GIS database. After data normalization, the population density, the bus stop density and the numbers of the six types of POIs in each grid are included as independent variables.
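The gridding and normalization steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' GIS workflow: coordinates are assumed to be projected and expressed in metres, the grid origin is arbitrary, and min-max scaling is assumed because the text does not name the normalization method. All function names and point values are hypothetical.

```python
import math
from collections import Counter

CELL_M = 500  # grid cell size in metres, as stated in the text

def grid_cell(x_m, y_m, origin=(0.0, 0.0), cell=CELL_M):
    """Map projected coordinates in metres to a (column, row) grid index."""
    col = math.floor((x_m - origin[0]) / cell)
    row = math.floor((y_m - origin[1]) / cell)
    return (col, row)

def min_max(values):
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical POI locations (metres), for illustration only.
poi_points = [(120, 80), (480, 490), (620, 130)]
poi_counts = Counter(grid_cell(x, y) for x, y in poi_points)

# Normalize the per-cell counts before using them as independent variables.
cells = sorted(poi_counts)
normalized = dict(zip(cells, min_max([poi_counts[c] for c in cells])))
```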
In the analysis of temporal characteristics, we find that in different time periods of the day and on weekdays and weekends, the intercity ridesplitting demand and single rides have different distributions. Therefore, we take the time attribute variables as part of the independent variables. From Figure 2a, we find that there are generally two intercity ridesplitting peaks, namely, the morning peak from 9:00 to 10:00 and the evening peak from 16:00 to 17:00. According to this feature, we divide the day into the morning peak, the evening peak and the off-peak period at other times, a division commonly used in traffic studies [40,41]. The two peak periods are represented by two binary variables, each taking the value of 1 if the trip falls in that period and 0 otherwise. Likewise, weekday or weekend is represented by a binary explanatory variable that takes the value of 1 for weekends and 0 for weekdays.
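The temporal variables described above are simple indicator (dummy) variables. A minimal sketch of their construction from a departure timestamp is given below; the function and key names are hypothetical, and the off-peak period is represented implicitly by both peak indicators being 0.

```python
from datetime import datetime

def temporal_dummies(departure):
    """Binary temporal variables for one trip, following the periods in the text."""
    hour = departure.hour
    return {
        "morning_peak": int(9 <= hour < 10),    # 9:00 to 10:00
        "evening_peak": int(16 <= hour < 17),   # 16:00 to 17:00
        "weekend": int(departure.weekday() >= 5),  # Saturday = 5, Sunday = 6
    }

# Example timestamps (illustrative only).
saturday_morning = temporal_dummies(datetime(2020, 6, 6, 9, 30))
monday_evening = temporal_dummies(datetime(2020, 6, 1, 16, 15))
tuesday_noon = temporal_dummies(datetime(2020, 6, 2, 12, 0))
```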
In the analysis of travel factors, we find that the longer the travel time is, the lower the intercity ridesplitting matching failure rate. However, more evidence is needed to support this finding. Therefore, to verify this finding, we take the travel time as one of the independent variables. Because of the obvious trend of the advance appointment time, we also take it as an independent variable.
Regarding weather factors, the intercity ridesplitting matching failure rate under rainy and snowy weather is slightly lower. Therefore, we regard precipitation and snowfall as a binary explanatory variable that takes the value of 1 for precipitation and snowfall and 0 otherwise. Temperature has been proven to be an influencing factor of ridesourcing demand [23], but whether temperature also affects the intercity ridesplitting matching success rate is not yet known. Thus, we also normalize the daily local average temperature and add it as an independent variable.

Significant Factors Influencing the Intercity Ridesplitting Matching Success Rate
We used BLR to estimate the factors affecting the intercity ridesplitting matching success rate in Yinchuan and Shizuishan. The significance of the omnibus tests of the model coefficients is less than 0.05 for both models, which means that both models are statistically significant overall. The results for the independent variables of the two models are shown in Tables 2 and 3. The factors with significance less than 0.05 are significant factors, and a significance of 0.000 means that the significance is less than 0.001. All significant factors have been marked with asterisks.
Based on the results of the models, we conclude as follows:
(1) The weekend, morning peak and evening peak variables are all significant in both models. In the two models, the results of the weekends or weekdays binary variable are significant and positive, which shows that the intercity ridesplitting matching success rate on weekends is higher. This finding may be because on weekends, people have more entertainment and leisure needs, or they may travel between the two cities where they live and work, thus generating more demand. Regardless of which explanation is correct, it is easier for people to match with other passengers with similar OD points. The morning peak variable has a positive effect in the Yinchuan model but a negative effect in the Shizuishan model. This result is consistent with the relationship we found. It may be because, as the provincial capital, Yinchuan attracts people from Shizuishan for political and economic reasons, which leads Shizuishan residents to take more temporary trips to the provincial capital's government agencies, airports and commercial facilities in the morning. These temporary trips have no regular time and place, so it is more difficult to match with other passengers. In the evening, residents have more ridesplitting demand after getting off work, and it is easier to find passengers with similar OD points. Therefore, the impact of the evening peak variable is positive in both models.
(2) Among the space factors, population density has a positive effect on the intercity ridesplitting matching success rate, and some types of POIs are also significant factors. In both models, the greater the population density is, the higher the intercity ridesplitting matching success rate. This result is in line with people's expectations. In densely populated areas, there is more demand for intercity ridesplitting, and it is easier for people to find partners. Bus stop density has a positive effect in the Shizuishan model, but it is not significant in the Yinchuan model. Considering that the operating range of buses is within a city, its impact on intercity ridesplitting is unclear. Regarding POI factors, commercial and business facility land and street and transportation land promote the intercity ridesplitting matching success rate, while residential land and industrial land inhibit it. In response to the impact of these built environmental factors, ridesplitting companies can provide discounts or rewards in areas with low matching success rates to attract more passengers and improve the intercity ridesplitting matching success rate.
(3) The travel time, advance appointment time and average temperature variables are significant. The travel time plays a positive role in the model, while the advance appointment time has a negative effect. The results are consistent with those in Section 3.3.3. Therefore, ridesplitting companies can try to improve the algorithm to match more short-distance trips into long-distance trips with overlapping OD points, thereby reducing the intercity ridesplitting matching failure rate for short-distance trips. In addition, they can expand the matching range for passengers with long advance appointment times to improve their matching success rate. Among the weather factors, temperature is significant and negative. Some scholars have found that when the temperature is lower, the demand for ridesourcing is greater [28], and more demand may improve the intercity ridesplitting matching success rate. However, the precipitation and snowfall binary variable is not significant, indicating that precipitation and snowfall have no definite influence on the intercity ridesplitting matching success rate.

Prediction Results
The prediction results of the two models are shown in Tables 4 and 5.

Table 4. Prediction results of the Yinchuan model.

| Actual Measurement | Prediction Accuracy |
| --- | --- |
| Matching failed intercity ridesplitting (Y = 0) | 70.1% |
| Matching successful intercity ridesplitting (Y = 1) | 85.7% |
| Average prediction accuracy | 79.9% |

The goodness of fit of a BLR model can be assessed by its pseudo R². In this paper, we use the BLR models to predict the intercity ridesplitting matching success rate in Yinchuan and Shizuishan. The pseudo R² values of the Yinchuan and Shizuishan models are 0.417 and 0.455, respectively. In general, a pseudo R² > 0.2 indicates a relatively good fit [42].
The prediction accuracy of the two models is more than 78%, which shows that the factors studied in this paper have good explanatory power with regard to the intercity ridesplitting matching success rate. At the same time, the models predict successfully matched trips more accurately than trips whose matching failed.
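The per-class and overall accuracies reported for the models can be computed from paired actual and predicted labels. The sketch below shows one way to do so, assuming a trip is predicted as matched when its predicted probability reaches 0.5; that threshold, the function names and the toy labels are assumptions, and "overall" here means the share of all trips predicted correctly.

```python
def classify(probability, threshold=0.5):
    """Turn a predicted probability into a 0/1 label (threshold is an assumption)."""
    return int(probability >= threshold)

def class_accuracies(actual, predicted):
    """Accuracy for Y = 0, for Y = 1, and over all trips.

    actual, predicted: equal-length sequences of 0/1 labels.
    A class with no actual observations gets None.
    """
    result = {}
    for label in (0, 1):
        idx = [i for i, a in enumerate(actual) if a == label]
        hits = sum(predicted[i] == label for i in idx)
        result[label] = hits / len(idx) if idx else None
    result["overall"] = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    return result

# Toy labels for illustration only (hypothetical, not model output).
actual = [0, 0, 1, 1, 1]
predicted = [classify(p) for p in (0.2, 0.7, 0.9, 0.6, 0.4)]
accuracy = class_accuracies(actual, predicted)
```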
We use GIS tools to visualize the spatial distribution of the prediction accuracy in the central urban areas of the two cities, as shown in Figure 12. The prediction accuracy of each grid refers to the average accuracy of all trip prediction results in the grid. The results show that in the central urban areas of the two cities, the number of grids with high prediction accuracy is large and the proportions are close, which is consistent with the prediction results of the two models.
In this paper, we use observed intercity ridesplitting data between Yinchuan and Shizuishan in China and built environment data based on a GIS to explore the characteristics of intercity ridesplitting and its influencing factors. Then, we study the significant factors affecting the intercity ridesplitting matching success rate based on a BLR model. Finally, we predict the intercity ridesplitting matching success rate in the two cities.
First, the characteristics of the matching success and matching failure of intercity ridesplitting are compared and analysed. In terms of time characteristics, intercity ridesplitting rides are the most concentrated from 6:00 to 7:00 on Mondays and from 16:00 to 17:00 on Fridays and Sundays, while intercity single rides have two small peaks at 9:00-10:00 and 18:00-19:00 every day. In the analysis of spatial factors, population density and bus stop density positively coincide with the intercity ridesplitting matching success rates in the two cities. Regarding other factors, when passengers' travel time is longer and the demand during that travel time is greater, the intercity ridesplitting matching success rate is higher.
Second, we study the influence of temporal, spatial, travel time and weather factors on the intercity ridesplitting matching success rate based on a BLR model. The results show that passengers are more likely to match successfully on weekends. In the morning peak, the matching success rate from Yinchuan, the provincial capital, is higher, and that from Shizuishan is lower. In addition, population density has a positive impact on the intercity ridesplitting matching success rate. Furthermore, commercial and business facility land and street and transportation land promote the intercity ridesplitting matching success rate, while residential land and industrial land inhibit it. In addition, an increase in travel time and the occurrence of precipitation and snowfall will improve the matching success rate.
Third, we utilize BLR to predict the intercity ridesplitting matching success rate in Yinchuan and Shizuishan. The results show that the prediction accuracy of the Yinchuan and Shizuishan models reached 78.7% and 79.9%, respectively. This prediction accuracy also indicates that our regression model based on temporal, spatial, travel time and weather factors can well explain the reasons for the intercity ridesplitting matching success rate.
The significance of ridesplitting lies in the fact that it cuts the costs incurred by passengers and reduces traffic carbon emissions. The characteristics and influencing factors of intercity ridesplitting identified in this paper are important references for ridesplitting companies and transportation policy makers. However, the current study has several limitations. First, the data set includes intercity ridesplitting data for only two cities. To verify the results of this study, we still need more data on urban agglomerations in China and around the world. Second, although this paper has explored and analysed some built environmental factors and obtained some conclusions, our research is still not deep enough due to the lack of data. A more complete explanation of the intercity ridesplitting matching success rate requires more potential factors, including the number of private cars and so on. In the future, we will collect more data and use GIS tools to conduct a more in-depth analysis of the factors that affect the intercity ridesplitting matching success rate and demand. Because the data set spans the COVID-19 pandemic in 2020, from its outbreak to its abatement, we will also study the impact of the epidemic on ridesplitting in the future.

Data Availability Statement: The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.