Comparable Measures of Accessibility to Public Transport Using the General Transit Feed Specification

Public transport plays a critical role in the sustainability of urban settings. The mass mobility and quality of urban lives can be improved by establishing public transport networks that are accessible to pedestrians within a reasonable walking distance. Accessibility to public transport is characterized by the ease with which inhabitants can reach means of transportation such as buses or metros. By measuring the degree of accessibility to public transport networks using a common data format, a comparative study can be conducted between different cities or metropolitan areas with different public transit systems. The General Transit Feed Specification (GTFS) by Google Developers allows this by offering a common format based on text files and sharing the data set voluntarily produced and contributed by the public transit agencies of many participating cities around the world. This paper suggests a method to assess and compare public transit accessibility in different urban areas using the GTFS feed and demographic data. To demonstrate the value of the new method, six examples of metropolitan areas and their public transit accessibility are presented and compared.


Introduction
Public transport in urban areas has gained greater attention in recent years for improving sustainability and the quality of urban life. The economic and environmental performance of cities can be enhanced by connecting resources to destinations effectively and facilitating mass mobility. Public transport networks are an important part of urban structure that enables other elements of urban interactions and circulations to be developed. Thus, monitoring and assessing public transit can be crucial for understanding an urban system and its major functions.
Public transport is one component of the modal split that includes walking, cycling, and private vehicles [1]. In contrast to private transport modes, public transit is considered to be open to a wider user group and is operated by public authorities or agents. Besides mass mobility, unique characteristics such as routes and schedules, locations of users, and the times of day when trips are made distinguish public transit from other modes of transportation and influence accessibility [2]. To measure accessibility, this study divides public transport into two broader groups: rail-based transit (metros or trains) and road-based transit (buses or trams).
The concept of accessibility can generally be defined as the ease with which inhabitants can reach their destinations [3]. However, similar but separate notions have been used in the last 40 years of transportation research [4,5]. Terms such as "access," "accessibility," "availability," and "proximity" have been used in various studies with similar meanings [6][7][8]. A number of measures of accessibility have been proposed [9][10][11][12][13][14], but the measurements vary with the researchers' perspectives and objectives [2,15,16]. While traditional studies on transportation accessibility have focused on transit as a means of reaching other urban services and areas of activities, recent studies define public transport as a key urban service to be accessed [17]. Therefore, this study uses the expression "accessibility to public transport" rather than "accessibility of public transport."
This study focuses on the temporal dimension of public transport as well as the spatial aspect. The spatial factor of accessibility is identified as the specific distance from each node of transit. Physical distance to a node on a route is the most basic component of accessibility to public transport, as public transport needs to be physically reachable for a large number of inhabitants [18]. However, the proximity of users to transit stops is not sufficient to define accessibility; when and how frequently vehicles can be accessed are also decisive factors [6]. In terms of the modal split of public transport, both the peak service volume and average service volume per day have generally been used. Operational frequency has usually been measured as trips per hour by taking the average of scheduled headways along the routes [19].
The most recent analytical studies focus on taking into account "time" in the geographic information science, transportation, and geography fields. With the innovation and implementation of information and transit technologies, the conventional geographic concepts and measuring methodologies have become insufficient in representing the real-time environment [20]. Integration of time and space by capturing individual mobility in everyday life has changed aggregated place-based perspectives into people-based perspectives, owing to the use of information and communication technology (ICT), location-aware technology (LAT), global positioning system (GPS), radio-frequency identification (RFID), and location-based services (LBS) [21][22][23].
On the other hand, other recent work from the perspective of large-scale public transit applications develops the integrated measuring concept of "connectivity" for assessing performance of multimodal and multilevel transit networks [24][25][26]. Associated attributes including nodes, lines, transfers, schedules, and spatial activity patterns are quantified as an indicator for decision-making and the allocation of funding in the most efficient way.
This study aims to explore an internationally comparable methodology for measuring accessibility to public transport with interoperable data sources and compatible spatial units for analysis that can be applied globally. The increasing requirements for global collaboration towards more sustainable urban structures and environments can be met by establishing common parameters for analysis among various cities and metropolitan areas with different dimensions and contexts. Achieving sustainable urban form and improving regional well-being are key challenges for current urban and regional policies, and public transport plays a vital role in addressing these issues.
In Section 2, we highlight the importance of a new transit data source and illustrate a framework to measure public transport accessibility. In Section 3, the method is applied to one Asian metropolitan area and five North American areas. Finally, we conclude with a short summary and insights for future work.

Data and Framework
Many research efforts explore the importance of precise estimation of population related to public transit and enhance analysis methods. Choice of representation of proper transit elements and levels of population data aggregation that overlap considerably tend to alter the final results [27][28][29]. Although effects of scale and spatial representation issues can be addressed with new methods [28,29], as long as researchers apply census data (i.e., residential population distribution, not actual daytime population), there is the limitation of misrepresentation of actual population and over- or under-estimation of demand.
The General Transit Feed Specification (GTFS) has been used in recent transit research since Google structured and launched the open platform in 2008. Despite the limited spatial coverage of worldwide GTFS data, especially in the Global South [30], these standardized transit data have great potential as a source for efficient spatiotemporal transit analyses using scripts and database queries [31], especially for comparative research across cities. Shortest path routes along the transit network in the study area and travel times between specific nodes can be estimated by linking GTFS data files together with the ArcGIS Network Analyst Tools [32].
A problem with residential population arises because people leave their census boundary units during the day, a spatial and temporal regularity [33]. This study, which aims to compare accessibility to public transportation in metropolitan areas around the world, not only in the US, uses an alternative concept and data for the population, the ambient population, in order to address this problem as well as the availability of population data across the world.
The calculation of the public transport accessibility indicator for metropolitan areas consists of four steps: (1) data collection; (2) identification of service areas; (3) classification of accessibility levels; and (4) calculation of public transit coverage in terms of population. In addition, a common spatial unit of analysis is needed for metropolitan areas of different dimensions before providing a methodology for measuring accessibility. This study applies Functional Urban Areas (FUAs), which were defined in 2012 by the Organization for Economic Cooperation and Development (OECD) in collaboration with the European Union (EU) for the purpose of international comparison.

Spatial Unit of Analysis-Functional Urban Areas (FUAs)
Demand is growing for a common base to understand and assess urban areas to ensure international comparability in OECD work. Meeting this demand involves accounting for administrative boundaries, the continuity of built-up areas, functional measures such as commuting rates, and the size of components to be aggregated. Recently, 1179 FUAs from 29 OECD countries were defined as urban spatial units including urban cores and hinterlands by the OECD in cooperation with the EU (Eurostat and European Commission Directorates General (EC-DG) Regio) (Figure 1) [34,35]. Figure 2 and Table 1 show six cases of FUA maps and profiles with public transit [36,37]. Both the size of the overall metropolitan area and population vary considerably across the FUAs.

Data Collection-General Transit Feed Specification (GTFS)
The General Transit Feed Specification (GTFS) by Google Developers facilitates the publishing and sharing of public transit information of various cities and metropolitan areas by defining a common format. GTFS consists of a series of text files containing specific aspects of public transport data: stops, routes, trips, and other schedule information (Table 2) [38]. The GTFS feed is voluntarily produced and updated on the Internet by the public transport agencies of cities and municipalities around the world. Since the launch in 2008, over 900 transit agencies have shared their public transportation feeds in this interoperable way. While the major contributors are North American cities, there are growing numbers of European and Asian cities participating in the program (Table 3) [39]. In this study, transit data from Chicago, Portland, Washington, Vancouver, and Toronto were obtained through the GTFS platform. Daejeon had not yet been included in the list, so data for this FUA were retrieved through an additional search. Although GTFS feeds do not necessarily contain all public transport data for certain cities or metropolitan areas, efficient availability of data with a common format enables comparative analysis of a number of cases.

Identification of Areas Serviced by Public Transit
A database file was created in Microsoft ACCESS 2010, in which six standard GTFS files are imported and dynamically linked together with queries. The resulting output table contains, for every stop of every transit mode and for one specific date (8 October 2013), the frequency of departures for each hour.
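The hourly departure table described above can be illustrated with a minimal Python sketch. This is not the Microsoft ACCESS workflow used in the study; it only shows the same counting logic on the standard GTFS `stop_times.txt` columns. The function names, the sample rows, and the assumption that the set of trips active on the chosen date has already been resolved (from `trips.txt` and the calendar files) are illustrative.

```python
import csv
import io
from collections import defaultdict


def hourly_departures(stop_times_csv, active_trip_ids):
    """Count departures per (stop_id, hour) for trips running on the chosen date.

    active_trip_ids is assumed to be precomputed from trips.txt and the
    calendar files for the analysis date.
    """
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(stop_times_csv)):
        if row["trip_id"] not in active_trip_ids:
            continue
        # GTFS times are HH:MM:SS and may exceed 24:00:00 for after-midnight trips.
        hour = int(row["departure_time"].split(":")[0])
        counts[(row["stop_id"], hour)] += 1
    return dict(counts)


def average_hourly_departures(counts, stop_id, start_hour=6, end_hour=20):
    """Average departures per hour over the window [start_hour, end_hour)."""
    hours = range(start_hour, end_hour)
    return sum(counts.get((stop_id, h), 0) for h in hours) / len(hours)


# Hypothetical sample rows in stop_times.txt layout.
SAMPLE = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,06:05:00,06:05:00,S1,1
t2,06:35:00,06:35:00,S1,1
t3,07:10:00,07:10:00,S1,1
t4,07:15:00,07:15:00,S1,1
t1,06:12:00,06:12:00,S2,2
"""

counts = hourly_departures(SAMPLE, active_trip_ids={"t1", "t2", "t3"})
avg_s1 = average_hourly_departures(counts, "S1")
```

The default window of 6 a.m. to 8 p.m. mirrors the working-day hours used in the study; trip `t4` is excluded because it is not in the assumed set of active trips.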
All stops were then imported as point features along the pedestrian network, each containing the average number of departures per hour during working-day hours (from 6 a.m. to 8 p.m.). Stops located within 50 m of another stop were considered as one single cluster of stops, with an aggregated hourly number of departures and new geographical mean center coordinates, to avoid underestimating the effects of nearby stops on frequency.
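One way to picture the 50 m clustering step is the sketch below. It assumes that clusters are formed by chaining (any stop within 50 m of any member joins the cluster), that distance is great-circle distance, and that the mean center is the arithmetic mean of the member coordinates; the study does not specify these details, and the coordinates are made up.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cluster_stops(stops, max_dist_m=50.0):
    """Merge stops closer than max_dist_m; stops are (id, lat, lon, departures_per_hour)."""
    parent = list(range(len(stops)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(stops)):
        for j in range(i + 1, len(stops)):
            d = haversine_m(stops[i][1], stops[i][2], stops[j][1], stops[j][2])
            if d <= max_dist_m:
                parent[find(i)] = find(j)

    groups = {}
    for i, stop in enumerate(stops):
        groups.setdefault(find(i), []).append(stop)

    clusters = []
    for members in groups.values():
        clusters.append({
            "stop_ids": sorted(m[0] for m in members),
            # Mean center of the member coordinates.
            "lat": sum(m[1] for m in members) / len(members),
            "lon": sum(m[2] for m in members) / len(members),
            # Aggregated hourly departures of all member stops.
            "departures_per_hour": sum(m[3] for m in members),
        })
    return clusters


# Hypothetical stops: A and B are roughly 33 m apart, C is about 1 km away.
STOPS = [
    ("A", 36.3500, 127.3800, 4.0),
    ("B", 36.3503, 127.3800, 6.0),
    ("C", 36.3600, 127.3800, 2.0),
]
clusters = cluster_stops(STOPS)
```

The pairwise loop is quadratic in the number of stops, which is acceptable for a sketch; a spatial index would be the usual choice for a full metropolitan feed.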
Areas serviced by public transport are created by calculating pedestrian walking time through the Network Analyst Tools of ArcGIS 9.3 using the defined thresholds of 5 min of walking time for accessing road-based transit and 10 min of walking time for accessing rail-based transit (Figure 3). The walking speed applied at this stage is 1.1 m/s (almost 4 km/h), resulting in a 330 m network distance to road-based transit nodes and a 660 m network distance to rail-based transit nodes.
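The service-area idea can be sketched without GIS software as a shortest-path search with a distance budget. The example below is a stand-in for the ArcGIS Network Analyst step, not a reproduction of it: it runs Dijkstra's algorithm on a toy pedestrian graph and keeps the nodes within the walking budget implied by 1.1 m/s and the 5 min or 10 min thresholds. The graph, node names, and function names are hypothetical.

```python
import heapq

WALK_SPEED_M_PER_S = 1.1                 # walking speed stated in the text
WALK_MINUTES = {"road": 5, "rail": 10}   # thresholds stated in the text


def walk_budget_m(mode):
    """Maximum network distance in metres for the given transit mode."""
    return WALK_SPEED_M_PER_S * WALK_MINUTES[mode] * 60


def reachable_nodes(graph, stop_node, mode):
    """Return {node: network distance} for nodes within the walking budget.

    graph maps each node to a list of (neighbor, edge_length_m) pairs.
    """
    budget = walk_budget_m(mode)
    best = {stop_node: 0.0}
    queue = [(0.0, stop_node)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            new_dist = dist + length
            if new_dist <= budget + 1e-9 and new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor))
    return best


# Hypothetical undirected pedestrian network (edge lengths in metres).
GRAPH = {
    "stop": [("a", 200.0)],
    "a": [("stop", 200.0), ("b", 100.0), ("d", 500.0)],
    "b": [("a", 100.0), ("c", 100.0)],
    "c": [("b", 100.0)],
    "d": [("a", 500.0)],
}

road_area = reachable_nodes(GRAPH, "stop", "road")
rail_area = reachable_nodes(GRAPH, "stop", "rail")
```

In this toy network, node `c` lies 400 m from the stop along the network, so it falls outside the road-based budget but inside the rail-based one, while node `d` at 700 m is outside both.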

Classification of Serviced Area Accessibility
Frequency was added as a temporal component for combined assessment after defining the serviced areas of public transport. Considering the road-based and rail-based modes, there are two steps to finalize mode-integrated classification for each metropolitan area: (1) classifying each serviced area for each mode by frequency data and (2) reclassification by aggregating mode-specific classes.
Frequency levels are determined from the average number of public transport departures per hour, that is, the average service headway. The mode-specific frequency classes have been defined as follows and applied for both modes (Table 4). Aggregated accessibility levels of public transport are divided into five classes (Table 5). Only the combination of high-frequency rail-based and high-frequency road-based public transport results in a very high level of accessibility, while other combinations result in high, medium, low, and no access levels of accessibility.
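The two-step classification can be sketched as follows. Because Tables 4 and 5 are not reproduced in this text, the 6 min and 15 min cut-offs are taken from the waiting times described in the Results section, and the handling of values exactly at a boundary is an assumption. The function names are illustrative.

```python
def headway_minutes(departures_per_hour):
    """Average headway in minutes, or None when a mode offers no service."""
    if not departures_per_hour:
        return None
    return 60.0 / departures_per_hour


def frequency_class(avg_headway_min):
    """Step 1: mode-specific frequency class (assumed cut-offs: 6 and 15 min)."""
    if avg_headway_min is None:
        return "none"
    if avg_headway_min <= 6:
        return "high"
    if avg_headway_min <= 15:
        return "medium"
    return "low"


def accessibility_level(road_class, rail_class):
    """Step 2: aggregate the two mode-specific classes into five levels."""
    classes = (road_class, rail_class)
    if classes == ("high", "high"):
        return "very high"
    if "high" in classes:
        return "high"
    if "medium" in classes:
        return "medium"
    if "low" in classes:
        return "low"
    return "no access"


# Example: a bus stop cluster with 12 departures per hour and no rail nearby.
bus_class = frequency_class(headway_minutes(12))
level = accessibility_level(bus_class, frequency_class(None))
```

Under these assumed cut-offs, 12 departures per hour corresponds to a 5 min headway, which falls in the high frequency class; without rail service in reach, the aggregated level is high rather than very high.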

Calculating the Population Share in Transit Service Areas
Ambient population data by Oak Ridge National Laboratory are measured for cells of approximately one square kilometer, depending on the location's distance from the Equator. The expected population, being a 24 h estimate, is an average for the day, regardless of the time of year, since calculations incorporate both diurnal and seasonal population movements. LandScan 2009 was used to estimate the population with access to transit in this study. Population in the catchment areas of each accessibility level can be calculated by overlaying the service areas layer and the ambient population layer, on the assumption that the ambient population is evenly distributed within a cell.
Comparison across metropolitan areas can be performed by calculating the shares of population living in catchment areas of each accessibility level. Six FUAs were selected as case areas: Daejeon (Korea); Chicago, Portland, and Washington (U.S.); and Toronto and Vancouver (Canada). The serviced population shares in both FUA areas and urban core areas were calculated according to the different service levels (ranging from very high to no service).
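The overlay and share calculation can be illustrated with a small areal-weighting sketch. It assumes that the fraction of each population cell covered by each accessibility level has already been obtained from a GIS overlay, and it applies the even-distribution assumption stated above. The input layout, function names, and numbers are hypothetical.

```python
def population_by_level(cells):
    """Allocate cell populations to accessibility levels by covered area fraction.

    cells is an iterable of (population, {level: fraction_of_cell_area}) pairs,
    where the fractions cover only serviced levels; the remainder of each cell
    is counted as "no access".
    """
    totals = {}
    for population, coverage in cells:
        covered = sum(coverage.values())
        if covered > 1.0 + 1e-9:
            raise ValueError("coverage fractions of a cell exceed 1")
        for level, fraction in coverage.items():
            totals[level] = totals.get(level, 0.0) + population * fraction
        # Uniform-density assumption: the uncovered area holds the remaining population.
        uncovered = max(0.0, 1.0 - covered)
        totals["no access"] = totals.get("no access", 0.0) + population * uncovered
    return totals


def population_shares(totals):
    """Convert absolute populations per level into shares of the total."""
    total = sum(totals.values())
    return {level: value / total for level, value in totals.items()}


# Hypothetical cells: (ambient population, covered fractions per level).
CELLS = [
    (1000, {"high": 0.5, "medium": 0.25}),
    (2000, {}),
    (1000, {"high": 1.0}),
]
totals = population_by_level(CELLS)
shares = population_shares(totals)
```

Running the same calculation once on all cells of an FUA and once on the cells of its urban core would give the two sets of shares compared in the study.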

Results and Discussion
Table 6 shows the number of transit stops of all modes obtained from GTFS data as of November 2013, as well as the populations and areas of the six FUAs. As introduced earlier, FUAs consist of both urban cores and hinterlands, which are identified as major job provider/functional core areas and labor supply/residential areas, respectively. Buses are found to be the most highly accessible transit mode because bus stops heavily outnumber those of any other transit mode in every FUA. The Portland FUA has the lowest density as well as the lowest number of people per bus stop among all FUAs, while the Chicago FUA shows the highest density among the North American FUAs; the Daejeon FUA has the highest density of all. Figure 4 provides the geographic distribution of public transport catchment areas according to the accessibility levels in urban cores of the six FUAs. As few public transit stops and stations in the hinterlands of each FUA were observed, only those in the urban core are displayed. Even within urban cores, bus and tram stops are found to be more closely concentrated in the central areas of the urban cores. On the other hand, commuting train and metro stations are sparsely distributed in linear patterns towards the outskirts. These differences of spatial distribution according to transit modes are reflected in the access time taken by riders on foot: 5 min for road-based transit and 10 min for rail-based transit.
Service areas, regardless of their levels of accessibility, refer to the areas where the estimated walking time for accessing public transport is no longer than 5 min for road-based transit and 10 min for rail-based transit. The grey areas, on the other hand, refer to areas where the estimated walking time for accessing public transport is longer than 5 min and 10 min, respectively. The service frequency of public transport was classified into the categories of very high, high, medium, and low. Among the four accessibility levels, the very high level service areas (navy areas) refer to the areas where a passenger who just missed a bus or train can take the next one of either transit mode within the next 6 min. Secondly, the high level service areas (dark green areas) represent the areas where riders can take only one of the two transit modes within the next 6 min. Thirdly, the areas with the medium level of accessibility (light green areas) show service areas where the rider can take at least one of the two transit modes in 6 to 15 min. Lastly, the low level service areas (yellow areas) refer to the areas where the rider needs to wait for longer than 15 min for both transit modes, or where the rider cannot reach any node of either transit mode within 5 min and 10 min, respectively.
All six results show that the highest levels of access are concentrated around central nodes in urban core areas, with lower levels radiating towards the margins of the urban cores and spreading sparsely into the hinterlands. Serviced areas are mostly located within the urban core areas. Despite considerable variation, a certain distribution pattern of larger populations and higher levels in urban cores is observed in all assessed areas. In comparison to the Asian metropolitan area, North American cities illustrate more intensively concentrated catchment areas in a smaller portion of areas within urban cores, which seems to correspond with the relatively extensive land use pattern and high dependency on automobiles.
The population share in serviced areas more clearly reveals the accessibility to public transport across the case regions (Figure 5). Daejeon has the largest share of population with high accessibility among the six FUAs. A relatively smaller area of the Daejeon FUA compared to the North American FUAs corresponds with a higher share of area of public transit catchments, with 22% of the whole Daejeon FUA area and 30% of the urban core area (Table 1 and Figure 6). Nevertheless, excessive differences between shares of population and area of each FUA reveal a centralized urban structure and transit networks of metropolitan areas. Except for Portland among the five North American FUAs, others show that around 70% of the population has no access to public transport in the metropolitan areas.

Conclusions
Ensuring public transport accessibility is an important task for sustainable urban development. This study focused on conceptualizing an integrated indicator and providing a viable methodology and data sources for measuring and comparing the accessibility to public transport for metropolitan areas. The proposed method applies spatial and temporal dimensions of accessibility based on a common spatial unit of analysis for comparison of the assessment results.
The methodology in this study differs from that of other studies in that it integrates several transit modes, both road-based and rail-based rather than only bus, and temporally classifies the service areas with actual headways of each mode and stop on a specific date. For the estimation of ridership, it introduces a novel concept and data source, the ambient population, in place of census data in order to represent the actual daytime population distribution.
The results indicate how many people living in FUAs have relatively easy access to public transport in terms of both physical access and the level of service frequency provided. In comparison with another 14 European metropolitan area cases [40], Figure 7 shows that a larger share of the population in urban core areas of European metropolitan areas has access to public transit compared to North American areas; above 70% of the population in the European cities has some access to public transport. There was a striking difference between European and non-European cities, especially cities in North America. The general pattern shows larger population shares being serviced in urban cores, which also have higher levels of service frequency compared to the FUAs overall. These trends confirm the expectations in public transport development and land use patterns that more densely inhabited urban areas would tend to feature better public transport than the hinterlands.
Given that the major purpose of a comparative study is to provide insights for policy implementation and decision-making, the main insight from this work concerns the sustainable expansion of metropolitan areas in terms of efficient supply and management of public transport.
Especially in North American urban areas, securing transport connectivity that is independent of personal vehicles could be challenging. Further research is required on the optimal scale and land use patterns of metropolitan areas under various urban conditions, from the perspectives of the economy, the environment, and the quality of life of inhabitants.
While the findings of this study are encouraging, there are limitations to be addressed in the following research. First, the coverage of GTFS data across the globe has been increasing, yet the spatial coverage still concentrates on North America and Europe. As spatial scale and patterns of metropolitan areas among North America, Europe, and Asia are different, a wider and deeper comparison can be conducted with provision of GTFS data from Asian transit authorities later. Second, even though this study uses ambient population, there is still an issue of aggregate data and the assumption that the population in a cell is evenly distributed. This could be addressed by new data collection methods such as people-based tracking with ICT devices.

Figure 3. Creating areas served by public transit nodes.

Figure 4. Serviced areas by accessibility levels.

Figure 5. Share of population with access to public transport in FUAs.

Figure 6. Share of area with access to public transport in FUAs.

Figure 7. Share of population with access to public transport in urban core areas. Source: The OECD; EU, Access to Public Transport: Comparing a Selection of Medium-Sized and Large Cities; OECD Territorial Development Policy Committee: Paris, France, 2014.
Source: US Census, Daejeon Metropolitan City.

Table 2. GTFS files and their contents.

Table 3. Participating agencies by continent.

Table 4. Classification of service areas by frequency.

Table 5. Aggregated reclassification of service areas by frequency.