Measuring Spatial Mismatch between Public Transit Services and Regular Riders: A Case Study of Beijing

Public transit services should promote spatial equity, and this study is concerned with how the allocation of public transportation resources corresponds to the needs of transit users. Identifying mismatches between urban transit resources and regular transit users can inform transportation resource allocation policy. This study introduces a location maximum likelihood estimation method and a cell space collector mechanism to explore differences between the distributions of regular transit riders and transit stations based on data mining. In Beijing, 5.37 million regular transit users were identified, whose first transit stations in the morning were within 2 km of the last transit stations they used the day before. Once their locations were estimated, the ratios of regular transit riders to residents were found to differ among areas. Most regular transit users were located in the suburban areas 5–20 km from the center of Beijing, whereas the spatial distribution of transit stations declined from the center to the peripheral urban areas. This mismatch between public transit services and regular transit riders sheds light on urban transportation policies.


Introduction
Bus and metro rail systems play a significant role in urban areas and are regarded as effective means to promote urban development [1]. Because investment in public transportation usually comes from public expenditure, public transit systems should serve the public equally [2]. Spatial equity in public transit services is viewed as an essential component of the overall transportation planning and management process [3]. A growing number of people rely on public transit systems, perceiving them as environmentally friendly and equitable [4]. Researchers have long related public transit services to the distribution of the resident population. However, for socio-economic reasons, some resident groups use public transit systems more frequently than others [5]. Measuring the mismatch between public transit services and regular transit riders is therefore important, and a reliable estimate of regular transit users' demands benefits both transit systems and urban development [6]. Providing efficient methods to measure how well public transit services match public transit users remains a considerable challenge [7]. Conventional methods such as inhabitant surveys [8] and household interviews are often used to identify regular transit users and their locations [9]. Owing to its high efficiency, data mining of smart card data can be applied to identify commuting transit users [10], but few existing studies have matched regular users with transportation resources.
This study aims to measure the mismatch between the distribution of transit stations and regular users, using a location maximum likelihood estimation (L-MLE) method and a cell space collector (CSC) mechanism. L-MLE is introduced to identify regular public transit riders and their locations from transit data. CSC is a mechanism that allows data in different formats to be compared. A case study was conducted in Beijing using transit data, population census data, and location data of bus stops and metro stations. Regular transit riders' locations were estimated, projected into cell space collectors, and compared with the distribution of public transit stations. The resulting mismatch is discussed from a policy-making perspective. The paper is structured as follows: Section 2 reviews the existing literature; Section 3 describes the data, the L-MLE method, and the CSC mechanism; Section 4 details the results of the case study; and Section 5 discusses the conclusions, policy implications, and limitations of this work.

Mismatch Between Public Transit Services and Regular Transit Riders
Equity of public transit services can be related to population to some extent, as well as to spatial and social equity [11]. Areas where residents depend heavily on the transit system often lack sufficient funding for transit [12]. Improving the equity of public transit service is important to urban livability [13]. Early studies used resident population data as the main reference in allocating public transit service resources, generally neglecting the fact that the frequency of public transit use differs among groups and areas [14]. However, residents' needs for public transport vary among areas [15]. Recent research has reported that elderly, low-income, and no-car households use public transport services more frequently [16,17]. Low-income and medium-income populations tend to use the public transit system [9]. Public transit users tend to choose to live in places where house prices are relatively low [18]. Discrepancies in residents' transit behavior in different areas need to be explored [19]. If an area is found to have many regular transit users but few transit service resources, attention should be paid to that place. It is either a place where working-class people prefer to live or a place where transit service resources are insufficient. Efforts to assess the equity of public transit services will help policymakers discern overloaded areas, adjust and rearrange transport access points, and improve the accessibility and equity of public transit service [20]. Generally, there should be a balance between the supply of public transit services and the demand generated by residents [21]. The literature indicates the significance of assessing the consistency of transit services with the distribution of regular transit riders.

Methods of Assessing Mismatch
Traditional estimates of public transport demand are primarily based on survey data [22], and the procedures are time-consuming and expensive to carry out [23]. A few studies have obtained data from open sources to avoid tedious procedures. However, these data are limited in their ability to recognize regular public transit users [15]. There should be more efficient mechanisms than surveys and interviews to evaluate transit demand and service conditions from a spatial perspective. Data mining of smart card data is becoming increasingly popular [24], and data mining of transit logs has proven useful in estimating transit service demand [25]. However, identifying regular public transit riders and their locations using data mining has not been sufficiently addressed.
Studies show that human trajectories display a high degree of temporal and spatial regularity; each individual is characterized by a significant probability of returning to a few frequently visited locations [26]. According to Torsten Hägerstrand's theory of capability constraints, an individual needs a sheltered place to sleep at night, and one's activities are bound to certain places [27]. However, this theory has not been fully employed to estimate the locations of regular public transit riders from bus and metro data.
Another issue is comparing transit service resources with transit users. The distribution of bus stops and metro stations is an important indicator in assessing public transit service from the perspective of spatial equity [28]. The spatial distribution of public transit services can be calculated from the locations of the available public transit stations [29]. Nevertheless, the data on transit service conditions and the data on transit riders' locations are available in different formats. Resident data are usually based on administrative borders, while the locations of transit stops are coordinate points. A spatial analysis tool can be designed to establish a connection between public transit demand and public transit service [30].

About the Data
Beijing was chosen as the study area because it relies heavily on its public transit system. Transit data for August 2016 were collected, including boarding stations, boarding times, alighting stations, and alighting times. Daily bus and metro traffic volumes added up to a maximum of 20 million commuters on a single day. The Beijing Municipal Administration and Communication Card (commonly known as Yikatong) is the only smart card accepted in the Beijing public transit system. Although transit riders use Yikatong anonymously, each card has a unique identification number. A cardholder's daily commute can therefore be tracked, and whether a card is frequently used can be easily judged. Thus, the frequent locations of the transit riders can be analyzed. Approximately 16 million active Yikatong IDs were found in a month, while the population of Beijing is approximately 20 million. It is estimated that approximately 80% of the population have Yikatong cards.
Resident distribution data were also collected as denominators to calculate the ratios of regular transit riders to residents. The Sixth National Population Census of the People's Republic of China, also referred to as the 2010 Chinese Census, was conducted by the National Bureau of Statistics of the People's Republic of China in November 2010. The census data were used to check whether regular transit riders are distributed in the same pattern as residents. The census data showed a residential population of 19,612,368 across 306 sub-districts (towns) in Beijing. Unfortunately, information on age, gender, or income was not available to the researchers. The 306 sub-districts differ in size and shape, as shown in Table 1, so the mechanism used to handle these differences is essential in the procedure.
Data on bus stops and metro stations were also collected to represent public transit resources. By the end of 2016, there were about 1020 bus lines and 19 subway lines in Beijing, with more than 20,000 bus stops and 345 subway stations (source: www.bjbus.com, www.bjsubway.com), as shown in Figure 1. Information about the stops and stations, including names, types, longitudes, and latitudes, was available. Metro transits and bus transits were treated as equal. Bus stops are listed by line number, so a stop served by two or more bus lines is counted two or more times. This is important because the more bus lines are available at a stop, the more transit services are allocated there. Stops on the bus lines that connect Beijing to its surrounding towns were also used, while stops on inter-city lines located outside Beijing were excluded.
Generally, Tiananmen is regarded as the city center. The area within 10 km of the center is regarded as the central area, and the near suburbs (periphery) form the belt 10-30 km from the center. Areas beyond 30 km are generally regarded as remote suburbs. Wang et al. (2011) argue that house prices decline from the center to the periphery in Beijing, and that working-class people are more likely to choose to live in the near suburbs, where they can find a balance between house prices and commuting effort [12].
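The zoning by distance from the center can be sketched as follows. This is a minimal illustration, not the authors' code: the Tiananmen coordinates are approximate and assumed here, the spherical Earth radius is a common convention, and the names `haversine_km` and `zone_of` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius (spherical approximation)
TIANANMEN = (39.9087, 116.3975)   # approximate (lat, lon); an assumption for illustration


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def zone_of(lat, lon, center=TIANANMEN):
    """Classify a point using the 10 km and 30 km belts described in the text."""
    d = haversine_km(lat, lon, center[0], center[1])
    if d <= 10:
        return "center"
    if d <= 30:
        return "near suburb"
    return "remote suburb"
```

A point about 0.18 degrees of latitude north of the assumed center lies roughly 20 km away and would fall in the near-suburban belt.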

Location Maximum Likelihood Estimation (L-MLE)
The L-MLE method is designed to identify the locations of regular transit riders based on Torsten Hägerstrand's theory of capability constraints. Riders making only a single journey a day on public transit are not regarded as regular riders and were discarded from this study. Bus and metro transits are integrated into transit diaries for all transit riders. The places to which riders regularly return at night are likely to be their private residences and can be recognized by analyzing the transit logs. The last station each rider uses on a day and the first station used on the next day are selected, and the distance between them is calculated. This distance is the overnight displacement, represented by $\lambda$. If $\lambda < \lambda_t$ (where $\lambda_t$ is the threshold distance individuals usually travel on foot), the two stations are regarded as close to the rider's residential location. Maximum likelihood estimation is then used to calculate the riders' locations from their usual return places in the transit logs. If a rider's night stays are random, the rider's location is not recognizable. Since existing research has shown that public transit users seldom use transit stations more than 2 km away [31], the threshold was set to $\lambda_t = 2$ km in the case study. Only riders who return to places within 2 km are regarded as recognizable. The whole procedure is shown in Figure 2.
L-MLE was used to estimate the rider's location [$\mathrm{longi}_{\mathrm{MLE}}$, $\mathrm{lati}_{\mathrm{MLE}}$] from the rider's midpoints, where n is the total number of midpoints, indexed by i.
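A minimal sketch of the L-MLE steps, under two assumptions that go beyond the text: each midpoint is taken to be the midpoint of a qualifying pair (the last station of one day and the first station of the next), and the estimate is taken to be the arithmetic mean of a rider's n midpoints, which is the maximum likelihood estimate if the midpoints scatter around the residence with isotropic Gaussian noise. The record layout and the function names are hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0
LAMBDA_T_KM = 2.0  # threshold distance used in the case study


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def estimate_locations(overnight_pairs, lambda_t=LAMBDA_T_KM):
    """Estimate rider locations from overnight station pairs.

    overnight_pairs: iterable of (card_id, (lat_last, lon_last), (lat_first, lon_first)),
    a hypothetical layout: last station of day d, first station of day d + 1.
    Returns {card_id: (lat_mle, lon_mle)} for riders with at least one pair
    whose overnight displacement is below lambda_t.
    """
    midpoints = defaultdict(list)
    for card_id, last, first in overnight_pairs:
        displacement = haversine_km(last[0], last[1], first[0], first[1])
        if displacement < lambda_t:
            # Assumed midpoint definition: average of the two station coordinates.
            midpoints[card_id].append(((last[0] + first[0]) / 2, (last[1] + first[1]) / 2))
    estimates = {}
    for card_id, pts in midpoints.items():
        n = len(pts)
        # Assumed estimator: arithmetic mean of the n midpoints.
        estimates[card_id] = (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
    return estimates
```

Riders whose overnight displacements all exceed the threshold receive no estimate, which mirrors the statement that random night stays make a location unrecognizable.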

Cell Space Collector (CSC)
The CSC mechanism is introduced to resolve inconsistencies in data type and format, which hinder the comparison of the spatial distributions of transit users and public transit resources. Data on residents' locations are based on polygons, while data on bus stops and metro stations are based on points. Under the CSC framework, the whole study area is sliced into many cell spaces, and the points are projected into these cell spaces. If point i falls into cell space k, the logical value is $\omega_{i,k} = 1$; otherwise, $\omega_{i,k} = 0$. The number of points located in cell space k is then summed as follows:
$$N_k = \sum_{i} \omega_{i,k}$$
where i is the index number of points and k is the index number of cell spaces.
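The riders' location data are not polygon based, so gridded cell spaces can be used as well as administrative borders. When the study area is sliced into gridded cell spaces, the coordinates of the points are used to judge their locations. For an area sliced into m by n gridded cell spaces, whether a point [longi, lati] is located in a cell space whose latitude is in the range [$\mathrm{lati}_{\mathrm{MIN}}$, $\mathrm{lati}_{\mathrm{MAX}}$] and whose longitude is in the range [$\mathrm{longi}_{\mathrm{MIN}}$, $\mathrm{longi}_{\mathrm{MAX}}$] can be judged by a logical operation, as follows:
$$\omega = (\mathrm{lati} > \mathrm{lati}_{\mathrm{MIN}}) \;\&\; (\mathrm{lati} < \mathrm{lati}_{\mathrm{MAX}}) \;\&\; (\mathrm{longi} > \mathrm{longi}_{\mathrm{MIN}}) \;\&\; (\mathrm{longi} < \mathrm{longi}_{\mathrm{MAX}})$$
where '&' means that the formulas on both sides are true.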
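The gridded collector can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the strict inequalities at cell borders, the row-major cell numbering, and the names `omega`, `cell_bounds`, and `collect` are all choices made here. Each point may be a rider location or one (stop, line) record, so a stop served by several lines contributes several points.

```python
def omega(lat, lon, lat_min, lat_max, lon_min, lon_max):
    """Logical value: 1 if the point lies strictly inside the cell, else 0."""
    return int(lat_min < lat < lat_max and lon_min < lon < lon_max)


def cell_bounds(k, lat0, lon0, dlat, dlon, n_cols):
    """Bounds of cell k in a grid numbered row by row from the south-west corner."""
    row, col = divmod(k, n_cols)
    return (lat0 + row * dlat, lat0 + (row + 1) * dlat,
            lon0 + col * dlon, lon0 + (col + 1) * dlon)


def collect(points, lat0, lon0, dlat, dlon, m, n):
    """Count the points falling into each of the m-by-n cells.

    points: iterable of (lat, lon). Returns a list of m * n counts,
    where entry k is the sum of omega over all points for cell k.
    """
    points = list(points)
    counts = []
    for k in range(m * n):
        lat_min, lat_max, lon_min, lon_max = cell_bounds(k, lat0, lon0, dlat, dlon, n)
        counts.append(sum(omega(lat, lon, lat_min, lat_max, lon_min, lon_max)
                          for lat, lon in points))
    return counts
```

The loop mirrors the summation over the logical values directly; for millions of points, computing each point's row and column by integer division would be faster, at the cost of a different convention on cell borders.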
It should be noted that the grids are not perfect rectangles, since the earth is a sphere. If the cell grids are small, the error ε of the area of the grid can be approximated by an expression in $\mathrm{lati}_{\mathrm{MAX}}$ and $\mathrm{lati}_{\mathrm{MIN}}$, the maximum and minimum latitudes in the study area. Gridded cell spaces of 1 × 1 km at latitudes similar to Beijing's have an error ε = 0.023,7 compared with rectangles. Thus, grids based on latitudes and longitudes are treated as equal in size, especially in cities at low and medium latitudes.
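As an independent check, and not the approximation used in this paper, the exact area of a latitude-longitude cell on a sphere is $R^2 \, \Delta\mathrm{longi} \, (\sin \mathrm{lati}_{\mathrm{MAX}} - \sin \mathrm{lati}_{\mathrm{MIN}})$, which allows cells at the southern and northern edges of a study area to be compared. The sketch below assumes a spherical Earth of radius 6371 km; the function names are hypothetical, and the latitudes in the usage note are rough illustrative values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def cell_area_km2(lat_min, lat_max, lon_min, lon_max):
    """Exact area (km^2) of a latitude-longitude cell on a sphere; inputs in degrees."""
    dlon = math.radians(lon_max - lon_min)
    band = math.sin(math.radians(lat_max)) - math.sin(math.radians(lat_min))
    return EARTH_RADIUS_KM ** 2 * dlon * band


def relative_area_gap(lat_south, lat_north, step_deg):
    """Relative area difference between the southernmost and northernmost
    cells of a grid with equal angular steps in latitude and longitude."""
    a_south = cell_area_km2(lat_south, lat_south + step_deg, 0.0, step_deg)
    a_north = cell_area_km2(lat_north - step_deg, lat_north, 0.0, step_deg)
    return 1.0 - a_north / a_south
```

For example, `relative_area_gap(39.4, 41.1, 0.01)` compares cells of about 1 km at roughly the southern and northern latitudes of Beijing; the result stays within a few percent, which is in line with treating the cells as equal in size.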

The Recognition of Regular Riders
In the dataset of the 10-day transit log (7-16 August), 31,062,866 transits remained after riders making a single transit a day were discarded. In approximately one third (11,495,322) of these records, the riders' overnight displacement was λ = 0 km, which means their last alighting stops of the day were exactly the next day's first boarding stops. When the histogram of the 19,567,544 records with λ ∈ (0, 5 km] was made, a peak was found at approximately λ = 0.8 km (Figure 3a). This meant that if a rider could not make a transit at the last station he/she used, he/she was likely to find another station approximately 800 m away when making the first transit of the day. Approximately half of the transit records (18,005,108) fell into the zone of λ < 2 km, and these transits were made by 5,737,663 regular riders. When the L-MLE method was applied, the locations of the 5,737,663 riders were recognizable. Figure 3b shows that many of the regular transit riders returned to their estimated places. Of the regular transit riders, 78.0% were found to make their first transits from their estimated places before 09:00, and approximately 3.3 million of them made night returns (after 18:00) more than once.

ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 6 of 11
The regular transit riders also accounted for a good proportion of everyday transit users. During the week of 7-13 August, 66.5%-77.5% of the 5.37 million regular transit riders were daily riders (Table 2). This implies that the identified regular transit riders are mainly active daily transit users.

Spatial Differences of Regular Rider Proportion to Resident Population
When the CSC was applied using sub-district borders, all the regular transit riders' locations were projected into 306 cell zones identical to Beijing's 306 sub-districts. Ratios of regular transit riders to residents were calculated in each cell zone, and the results were found to differ among sub-districts (Figure 4). The result shows that it is inequitable to use residential distribution as the basis of transit service planning. Ratios of regular transit riders to residents are lower in the central areas than in the near-suburban areas, contrary to general intuition. This result, obtained by data mining, is consistent with the survey data used by Wang et al. (2011) [18]. Further examination is needed to explore whether the existing transit services fit the demands well.
When the spatial distribution of the regular riders was visualized in a closer view with the help of the gridded CSC, the number of regular transit riders in the cell zones did not decline steadily from the center to the periphery. Most riders were found 5-20 km from the city center (Figure 5), and the top three cells of the sampling fishnet were approximately 20 km away from the city center.
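The ratio calculation per cell zone amounts to dividing the projected rider count by the census population of the same zone. A minimal sketch, with a hypothetical function name and made-up zone labels and counts in the usage note:

```python
def rider_ratios(riders_by_zone, residents_by_zone):
    """Ratio of regular transit riders to residents for each zone.

    Zones with no recorded residents are skipped to avoid division by zero;
    zones with residents but no identified riders get a ratio of 0.
    """
    return {
        zone: riders_by_zone.get(zone, 0) / residents
        for zone, residents in residents_by_zone.items()
        if residents > 0
    }
```

For instance, `rider_ratios({"A": 250, "B": 500}, {"A": 1000, "B": 1000, "C": 0})` gives a ratio for zones A and B only, because zone C has no residents to serve as a denominator.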

Uncovering of Mismatched Public Transit Services
All the bus stops and metro stations were also projected into the same gridded cell zones as the regular transit riders, as shown in Figure 6a. A declining trend from the center to the periphery was observed in Figure 6b. The distribution patterns of regular transit riders and public transit stations signified a distinct mismatch between residents' transit needs (Figure 5b) and public transit resources (Figure 6b). Two clear mismatches between regular riders and transit resources were found. The first region was located in the north of Beijing (20 km from the center), known as the Tiantongyuan-Huilongguan Residential Area (near [E116.5, N40.07]), where regular transit riders were densely distributed but the amount of transit service resources was small. The second region was the central area of Beijing, within a radius of approximately 10 km, where transit resources were densely distributed but few regular riders lived. Ma et al. (2017) conducted research on the commuting patterns of Beijing transit users and found that Tiantongyuan-Huilongguan and another inner-city place named Fangzhuang (near [E116.4, N39.85]) had the most commuting transit users [10]. However, they did not give further information about the condition of the transit service resources in these places. This study shows that Tiantongyuan-Huilongguan has fewer public transport resources than Fangzhuang, and that transit service in the former area needs to be strengthened.
This mismatch is detailed in a bubble graph (Figure 7). The largest bubble represents a zone that has 220 thousand regular transit riders, and its position shows its distance from the city center and the ratio of regular transit riders to residents. In the heart of the city (within 5 km of the center), regular riders were fewer in number and lower in ratio to residents than in the areas 5-20 km from the city center. Thus, in the central areas, where transit resources are most concentrated, residents' willingness to use public transit was relatively low. This work was unable to uncover the reasons behind the phenomenon, but it is unreasonable for the best-served place to have a relatively small number of regular users.
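The comparison in this section is visual, based on the maps and the bubble graph. As a purely illustrative way to put a number on it, and not a measure used in this study, each cell's share of all regular riders can be compared with its share of all station records. The function name and the counts in the usage note are hypothetical.

```python
def share_gap(rider_counts, station_counts):
    """Per-cell difference between the share of riders and the share of stations.

    Both inputs are sequences of counts over the same cells, in the same order.
    A positive value marks a cell holding a larger share of riders than of
    stations; a negative value marks the opposite.
    """
    if len(rider_counts) != len(station_counts):
        raise ValueError("both sequences must cover the same cells")
    total_riders = sum(rider_counts)
    total_stations = sum(station_counts)
    if total_riders == 0 or total_stations == 0:
        raise ValueError("totals must be positive")
    return [r / total_riders - s / total_stations
            for r, s in zip(rider_counts, station_counts)]
```

With made-up counts such as `share_gap([10, 60, 30], [50, 30, 20])`, the first cell comes out negative (station-rich, rider-poor, like the central area described above) and the second positive (rider-rich, station-poor, like the near-suburban residential area).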

Conclusions
The L-MLE method may serve as a useful solution for recognizing regular transit users and their locations based on data mining. The CSC mechanism is helpful in locating places where transit services mismatch regular users. The two tools can generate up-to-date demand analyses for researchers when transit data permit.
The proportions of residents who use public transit systems differ among the sub-districts of Beijing. Over 5.37 million regular transit riders were recognized by the L-MLE method, accounting for 74.5% of daily transit users in a week. Transit demands generated by regular transit riders require more attention.
The number of regular transit riders is large in the near-suburban areas (5-20 km from the center). In remote suburbs (> 30 km from the center), fewer residents are regular transit riders. In the heart of the city (< 5 km from the center), regular transit riders are not as numerous as in the near-suburban areas.
Based on the CSC mechanism, the spatial distribution of public transit resources declines steadily from the center to the periphery. This trend mismatches the distribution of regular transit riders in Beijing.

Implications for Transportation Policies and Limitations
In the near-suburban areas of Beijing, both the number and the proportion of regular transit riders are high, so more attention should be paid to overload, resource shortages, and service shortages in the public transit system. The total number of regular transit riders in the remote suburbs of Beijing is small and constitutes a small proportion of the total resident population. Therefore, an indiscriminate increase in public transit investment should be avoided in the remote suburbs. The central areas of Beijing already have an advantage in public transit resources, so planners need to be cautious about making redundant investments there.
Data mining of transit data may provide fast analysis for transport management. However, the approach has some limitations. Although the distribution of metro stations was found to be similar to that of bus stops, the differences in service capacity between bus stops and metro stations were not considered. Only the transit demands generated by regular transit riders at their places of residence are discussed, while the demands generated at their workplaces have not been addressed. Transit riders may choose to move to places with better transit service resources and affordable housing prices, so decisions on transit resource allocation should not be based simply on the observed mismatch. Further study is needed to improve the approach and to address the causes of the aforementioned mismatches.

Figure 1. Distribution of bus stops and metro stations in Beijing.

Figure 2. The procedure of discerning regular transit riders.

For a given rider, the geographic coordinates of the last alighting station on one day are $[longi_a, lati_a]$, and the coordinates of the first boarding station on the next day are $[longi_b, lati_b]$. The midpoint $[longi_i, lati_i]$ is computed as follows:

$$[longi_i, lati_i] = \frac{[longi_b, lati_b] + [longi_a, lati_a]}{2} \quad (1)$$
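
The midpoint in Equation (1) is a component-wise average of two coordinate pairs, which can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the sample coordinates, and the use of a haversine distance to measure the overnight displacement are all assumptions introduced here.

```python
import math

def overnight_midpoint(last_alight, first_board):
    """Equation (1): component-wise mean of two (longitude, latitude) pairs."""
    longi_a, lati_a = last_alight
    longi_b, lati_b = first_board
    return ((longi_a + longi_b) / 2, (lati_a + lati_b) / 2)

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (longitude, latitude) points.

    Assumption: the source does not say how displacement is measured;
    haversine is used here only as a common choice.
    """
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

# Hypothetical coordinates, for illustration only.
last_alight = (116.40, 39.90)   # last alighting station on day d
first_board = (116.41, 39.91)   # first boarding station on day d + 1

midpoint = overnight_midpoint(last_alight, first_board)
displacement_km = haversine_km(last_alight, first_board)
```

For stations only a few kilometers apart, averaging longitude and latitude directly is a close approximation of the geographic midpoint, which is presumably why the simple form of Equation (1) suffices at the city scale.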

Figure 3. (a) Overnight displacements and (b) night return times of regular riders.

Figure 4. Spatial differences of regular transit rider ratios.

A CSC based on sub-districts is limited because the sub-districts differ in size and shape. A gridded CSC was therefore used to collect the regular riders, with parameters $m = 50$, $n = 50$, $[lati_{MIN}, lati_{MAX}] = [39.65, 40.25]$, and $[longi_{MIN}, longi_{MAX}] = [116.1, 116.75]$. When the spatial distribution of the regular riders was examined at this finer resolution, the number of regular transit riders per cell did not decline steadily from the center to the periphery. Most riders were located 5–20 km from the city center (Figure 5), and the top three cells of the sampling fishnet were approximately 20 km from the city center.
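
The gridded collection step can be illustrated with a short Python sketch that bins estimated rider locations into a regular fishnet using the parameters quoted above. It is a sketch under stated assumptions, not the authors' code: the source does not specify which of m and n runs along latitude, so m rows along latitude and n columns along longitude are assumed, and the function names and sample points are hypothetical.

```python
from collections import Counter

# Grid parameters quoted in the text.
M, N = 50, 50                          # assumed: M rows (latitude), N columns (longitude)
LATI_MIN, LATI_MAX = 39.65, 40.25
LONGI_MIN, LONGI_MAX = 116.1, 116.75

def cell_index(longi, lati):
    """Return the (row, col) cell containing a point, or None if it lies outside the grid."""
    if not (LATI_MIN <= lati <= LATI_MAX and LONGI_MIN <= longi <= LONGI_MAX):
        return None
    # Scale each coordinate to [0, M] or [0, N]; clamp the upper edge into the last cell.
    row = min(int((lati - LATI_MIN) / (LATI_MAX - LATI_MIN) * M), M - 1)
    col = min(int((longi - LONGI_MIN) / (LONGI_MAX - LONGI_MIN) * N), N - 1)
    return row, col

def collect(points):
    """Count how many (longitude, latitude) points fall in each grid cell."""
    counts = Counter()
    for longi, lati in points:
        idx = cell_index(longi, lati)
        if idx is not None:
            counts[idx] += 1
    return counts

# Hypothetical rider locations, for illustration only; the last one is outside the grid.
riders = [(116.40, 39.90), (116.401, 39.901), (116.74, 40.24), (117.5, 39.9)]
cell_counts = collect(riders)
```

Because every cell has the same extent in degrees, counts from different cells can be compared directly, which addresses the unequal sizes and shapes of the sub-districts. The same collector could be run over station coordinates to compare supply and demand cell by cell.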

Figure 7. Center-periphery distribution of regular transit riders in number and ratios.

Author Contributions:
Conceptualization, Haitao Jin and Fengjun Jin; Methodology and Technical Realization, He Zhu and Haitao Jin; Writing, Haitao Jin and He Zhu.

Funding:
The work in this paper was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant Nos. XDA19040402, XDA19040403) and the National Natural Science Foundation of China (No. 41771134).

Table 1. Information of 306 sub-districts in Beijing.

Table 2. Coverage of identified regular transit riders.
