A Clustering Approach for Analyzing Access to Public Transportation and Destinations

Shafiq, Mudassar; Rocha, Hudyeron; Couto, António; Ferreira, Sara

doi:10.3390/su16166944


CITTA—Centro de Investigação do Territórios dos Transportes e Ambiente, Faculdade de Engenharia da Universidade do Porto, 4200-465 Porto, Portugal


Sustainability 2024, 16(16), 6944; https://doi.org/10.3390/su16166944

Submission received: 17 May 2024 / Revised: 5 August 2024 / Accepted: 10 August 2024 / Published: 13 August 2024

(This article belongs to the Special Issue Sustainable Urban Transport Planning)


Abstract

Promoting sustainable and equitable public transportation services is essential for addressing disparities and preventing the social exclusion of diverse population groups from daily activities. This paper proposes a comprehensive approach to assess transport disadvantages and identify areas with limited access to public transport and services. By combining statistical and geographic techniques, we analyze demographic, socioeconomic, and travel data to spatially contextualize areas based on the social structure and understand the characteristics of population groups facing transportation challenges in the Porto Metropolitan Area. The cluster analysis revealed four distinct clusters with homogeneous characteristics. In addition, a service area analysis assessed public transport coverage to identify the served zones, the population within these zones, and the activities reached in the region. Our findings indicate that suburban and rural areas often lack access to public transport stops, aggravated by lower service frequencies, leading to high reliance on private cars for essential activities, such as work and education. Despite the good geographical coverage of rail and bus stops, urban and central–urban areas also suffer from inadequate service frequencies, impacting public transport usage. Improving service quality in high-demand areas could encourage greater public transport utilization and enhance accessibility. Identifying areas facing inequities facilitates targeted policy interventions and prioritized investments to improve accessibility and address the mobility needs for accessing services effectively.

Keywords: accessibility; equity; sociodemographic information; cluster analysis; activity destinations; public transportation demand

1. Introduction

Urban mobility is essential for attaining sustainable and economic development by ensuring people’s access to transport services and promoting social cohesion [1,2]. Sustainable and equitable transportation solutions are essential to ensure the fair distribution of transport services among diverse population groups to meet the increasing demand and need for mobility in metropolitan regions due to socioeconomic activity and urbanization. Effective accessibility planning for transport systems and urban areas can enhance efficiency and provide better access to opportunities and activities [3]. In this context, public transportation (PT) stands out as a crucial element in offering ecologically friendly and efficient alternatives to private vehicles (PVs), such as cars or motorbikes, for mobility in cities and metropolitan areas, especially to the underserved cohorts of society [4,5].

Ensuring equitable access to PT services remains a significant challenge, particularly in socioeconomically disadvantaged neighborhoods and rural areas characterized by diverse demographics and lower socioeconomic profiles. In these regions, a substantial portion of the population lacks PV ownership and relies heavily on the accessibility and availability of PT services [6,7].

Despite the importance of PT, significant policy and planning challenges exist in ensuring equitable access due to social disparities and barriers, which inhibit the effective utilization of PT for essential activities, such as work, education, and healthcare [8,9]. The central research problem addressed in this study is the identification and analysis of these barriers, which manifest in two main components.

The first is the social component: in areas with social inequities, socioeconomic and demographic conditions, such as income, social welfare dependency, unemployment, and household size, create barriers that hinder equitable access to PT and lead to transport poverty among population groups. Their mobility limitations can stem from several factors, including the unavailability of transport modes, lack of car ownership, and distance from urban centers in areas with lower PT coverage. These challenges affect members of different social groups, leading to isolation and restricted access to essential activity spaces [10]. The second is the transport component, where limited access to PT stations, suboptimal service supply, and inadequate infrastructure can restrict access, leading to greater dependence on PVs to make essential trips [11].

Previous studies have highlighted the importance of integrating activity locations and PT planning for sustainable development and the need for a paradigm shift towards accessibility [12,13]. However, there is a notable gap in the literature addressing the challenges different demographic groups face in accessing PT services, particularly in suburban and rural areas [14,15]. PT services are predominantly concentrated in high-demand areas, providing greater accessibility and neglecting the need for PT access in peripheral regions [15]. This oversight is significant, as these areas rely more on PVs due to inadequate PT services [11]. Moreover, there is a lack of comprehensive integration of geographic and statistical analyses to assess accessibility, which is crucial for identifying areas with good geographic coverage but underlying inequities due to inadequate PT service frequencies [15].

Traditionally, equity assessments in PT have relied on various indices to assess social and transport disadvantages. These assessments identify areas with inequities by incorporating datasets that detail individual demographic and socioeconomic characteristics, providing a more tactical approach to developing policies and plans to reduce inequities [16]. However, for more strategic planning of the PT system, our study introduces a novel approach that combines statistical and geographic techniques to characterize and group regions based on similar demographic and socioeconomic groups and the magnitude of their transport disadvantage regarding limited access to PT and performing essential activities.

In this context, this paper extends the traditional methodology by employing cluster analysis to differentiate territorial areas based on their transport disadvantages and urban structure. Addressing the individual needs of populations is a complex and challenging task, requiring planners to balance various trade-offs to develop optimal, feasible solutions for everyone. This methodology offers a holistic view of the region by identifying specific clusters of areas with similar needs and disadvantages, consequently facilitating more effective PT planning and policy implementation across these clusters.

Consequently, for the Metropolitan Area of Porto (AMP), this study aims to characterize areas based on their accessibility to rail services and travel demand while identifying socially disadvantaged groups facing transport poverty. Following this, a service area analysis was performed to assess the current PT system coverage and equity conditions among clusters in the AMP. The analysis aimed to evaluate the potential population served and identify gaps in the use of PT and PVs for activities that are potentially accessible through PT. Furthermore, we explored the population’s PT needs for different daily life activities within the identified clusters using trip data from the 2017 AMP mobility survey (IMob) conducted by National Statistics Portugal (INE). We also determined the modal share of transport modes used for various activities across clusters. By examining the diverse transport needs among different groups within these clusters, we comprehensively assess the spatial and social inequities in PT accessibility. This allows policymakers to design and implement targeted interventions tailored to the cluster’s needs, thereby enhancing the overall efficiency and inclusivity of the PT system in the AMP. The insights gained from this study can provide a valuable framework for other cities facing similar challenges. They can help enhance their PT systems, promote equitable access, and foster more inclusive and sustainable urban environments.

The rest of the paper is organized as follows. Section 2 reviews the literature on access to public transport and services to reduce inequalities and on the use of cluster analysis. Section 3 and Section 4 present the study area and describe the data analysis and methodology. Section 5 presents the results of our analysis with key findings. Finally, Section 6 discusses the results and draws conclusions, offering interpretations and implications in light of the study’s limitations.

2. Literature Review

Grounded in the understanding that equal access to opportunities is indispensable regardless of social status, Litman and Niemeier et al. [1,17] have shown that equity in access to transportation can directly impact people’s economic and social opportunities. Likewise, Handy and Niemeier (1997) [18] discovered that people with access to PT were more likely to be employed and earn more money than those without. Further emphasizing the societal implications, Sanchez et al. (2004) [19] found that improving the accessibility of PT can help to reduce social inequality, especially for socially vulnerable groups.

Identifying disadvantaged groups and associated characteristics becomes crucial, as multiple studies have already emphasized the importance of selecting demographic and socioeconomic variables that accurately represent the demand and needs for PT [20,21,22]. In this context, the careful selection of variables and data collection methods assumes a central role in this process due to the comprehensive gathering of crucial information, including car usage, monthly income, and housing costs [23,24]. These variables not only delineate the nature of demand but also highlight specific groups that, due to social and economic disadvantages, heavily rely on PT. Jomehpour and Smith-Colin (2020) [25] identified specific societal groups reliant on PT for travel, including students, teenagers, older adults, low-income households, individuals without car ownership, the unemployed, socially isolated individuals, and those with physical impairments. Currie (2004) and Fransen et al. (2015) [26,27] included the accessibility parameters along with income, elderly population, and population with welfare benefits to define PT needs.

Recent studies have explored the interrelation between travel behavior and sociodemographic factors and how they influence the propensity to use PT. Abdullah et al. (2021) [28] investigated the modal choices, revealing a significant influence of sociodemographic factors, such as the availability of PVs and low-income backgrounds. The owners of PVs were less inclined to use public modes than non-owners, while individuals from low-income backgrounds showed a greater reliance on PT. Ali and Abdullah (2023) [29] investigated social exclusion in PT usage, highlighting the importance of incorporating sociodemographic factors, such as disabilities, gender, income, and car ownership, in transport planning. Furthermore, Hasan et al. (2023) [30] underscored significant predictors of PT accessibility, such as service frequency, network coverage, proximity to trip origin/destination, and traffic congestion, emphasizing the need for tailored urban solutions.

Despite the significance of these variables, the current literature points to a gap in understanding and addressing spatial and social disparities in accessibility to PT services [31,32,33]. Golub and Martens (2014) [34] underscored the importance of incorporating demographic variables in accessibility analyses to inform planning and investment strategies, aiming to mitigate disparities between social disadvantages and accessibility via PT at a localized scale. Similar approaches have been adopted to assess inequities in PT or activity access. These methods include developing accessibility indices and conducting gap analyses between accessibility and transport or social needs indices for census tracts or traffic analysis zones [26,35,36,37]. These indices were developed using various combinations of the aforementioned variables, tailored to the context of each study. They were then employed to identify each location in urban or suburban regions with varying levels of inequity.

However, policies aimed at reducing spatial and social inequities may not always apply to urban structures that do not fit the conventional model of disadvantaged central-city neighborhoods and distant, job-rich suburbs. Adaptive policies and tailored approaches designed for diverse geographical regions and population groups can help mitigate these inequities by ensuring equitable access to PT, even in these atypical urban settings [17]. Thus, spatial contextualization and sociodemographic segmentation at a localized level for larger regions are crucial for better understanding population needs in the planning context [38,39]. As PT services and infrastructures vary across regions—whether central, urban, suburban, or rural—these differences can strongly influence user needs and transport choices.

As evidenced by several studies, cluster analysis addresses the intricate landscape of accessibility and social patterns within the analytical domain, especially when dealing with extensive and complex datasets across different regions [40,41]. This study employed cluster analysis to contextualize the different regions within the AMP, supported by prior research [22,24,42,43]. By categorizing these regions based on their similarities and dissimilarities, this method simplified the data into interpretable clusters, each demonstrating unique characteristics, therefore ensuring reliable data estimation [44,45].

This study builds on previous research by integrating and expanding upon several key themes identified in the literature. In contrast to previous studies highlighting the importance of equitable access, typically focused on identifying transport disadvantages on a localized scale, our approach offers the advantage of a more holistic spatial segmentation. It encompasses a broader range of sociodemographic factors and accessibility conditions, employing cluster analysis to identify homogeneous areas with similar PT demands and activity needs within different spatial contexts.

Incorporating cluster analysis alongside service area analysis enhances our understanding of socially disadvantaged groups to assess the PT system’s effectiveness by examining its geographical coverage. By delineating the characteristics of urban, suburban, and rural areas, service area analysis provides a comprehensive view of the efficiency of PT services. This approach identifies the population segments served by PT and evaluates access to different activities through PT, highlighting specific service gaps within each cluster.

3. Study Area

The AMP is a compelling case study due to its significant economic, cultural, and demographic diversity within Portugal. As the second-largest metropolitan area in the country, following only Lisbon (the capital of Portugal), the AMP adds depth to the analysis of metropolitan transport dynamics. The coexistence of densely populated urban areas and extensive rural regions within the AMP offers a rich context for studying diverse transportation needs and solutions.

Porto’s pivotal role in connecting various parts of the region underscores its importance in metropolitan growth and development studies. Encompassing 17 municipalities across an expanse of 2040 km², it boasts a population of approximately 1.7 million residents, yielding a population density of 844 inhabitants per km². Notably, the AMP distinguishes itself by its predominantly urban populace [46]. This urban concentration further highlights the relevance of Porto’s transport systems in supporting metropolitan sustainability and development. The varied socioeconomic landscape of the AMP allows for a detailed exploration of how transit infrastructure impacts different demographic groups.

The AMP’s intricate multimodal PT system, including buses and trains, provides a comprehensive view of the challenges and opportunities in public transit, particularly issues of accessibility and integration. The ongoing extensions and introduction of new metro lines in the AMP exemplify the region’s commitment to sustainability, with its expanding network contributing to significant environmental benefits.

The Metro of Porto is a vast light rail network that ranks among the largest in Europe. Spanning a total distance of 67 km and featuring six lines, it traverses seven municipalities within the AMP. Its fleet, comprising 102 vehicles, facilitates a transport capacity of 9000 individuals per hour on each line. With 82 stations, including 14 underground stations and 7.7 km of tunnels, the metro system significantly contributes to an annual reduction of 55,000 tons of CO₂ emissions, avoiding the circulation of 12,000 cars [47].

The PT network in the AMP serves as a crucial transport mode for the region, establishing key connections to other areas of the country and Europe. Figure 1 presents the structure of the bus and rail station locations across the AMP, highlighting the integrated nature of the region’s PT infrastructure.

This network facilitates efficient travel between diverse municipalities within the AMP, thereby fostering local mobility. However, it is essential to acknowledge that the PT system in the AMP poses specific challenges related to accessibility and services. Some stations may lack the optimal frequency or flexibility, while others may not be fully accessible. Additionally, some carriages might operate on less frequent schedules misaligned with typical work or study hours. The bus network within the AMP is predominantly managed by private enterprises, aiming to provide extensive coverage across all 17 municipalities in the region. This network is a complementary PT alternative to the metro and trains. Buses play a vital role in connecting urban and suburban areas in the AMP, seamlessly integrating with various transport modes to ensure a smooth transition between different forms of travel.

Nevertheless, it is crucial to highlight that bus stops in the rural areas of the AMP are often situated in less accessible locations, usually found outside of villages [48]. This presents a challenge for individuals, as these stops may lack clear signage, making them more difficult to locate. Furthermore, bus frequencies in rural areas are typically lower than in urban areas, complicating access to essential services and opportunities and challenging individuals who must commute for work, study, or other daily activities beyond their residential areas. Considering these challenges, the accessibility of PT in the AMP emerges as a critical factor for the region’s sustainable development.

4. Data Analysis and Methodology

4.1. Walking Accessibility and Selected Variables for Cluster Analysis

To obtain an overall picture of access, we first calculated the walking accessibility of the rail services. The aim is to develop a measure at the lowest possible spatial scale to accurately assess the existing levels of accessibility of the rail and metro services in the PT system for the available potential demand, i.e., the population groups. Gravity measures were used to calculate these accessibility indexes for the most common and significant active mode, i.e., walking, as all trips begin on foot. Among place-based accessibility measures, gravity measures can capture spatial characteristics, such as the location of individuals and stations, measured as the minimum travel distance or time between them; the distance decay parameter associated with the mode; and the attraction at the destinations as a sum of the opportunities available [49].

To calculate the accessibility index $A_{i}$, we started by calculating the minimum travel distances from every subsection (lowest level of geographical scale) to the nearest station using a geographical information system tool, ArcGIS. For this purpose, we developed a walking network dataset using AMP street data, the locations of stops, and their shapefiles obtained from the open-source website OpenStreetMap. The shapefiles for the subsections were obtained from INE’s official website. The gravity measure used also incorporates a mode decay parameter λ with a value of 0.084 derived using iterations for a walking distance of 1200 m or approximately 16 min of walking time and the supply (attraction) at the station in terms of frequency, as shown in Equation (1).

$$A_{i} = a_{j} \times e^{-\lambda t_{ij}} \qquad (1)$$

where $A_{i}$ is the accessibility level at subsection $i$, $\lambda$ is the mode decay parameter, $t_{ij}$ is the minimum travel time from subsection $i$ to the nearest station $j$, and $a_{j}$ is the attractiveness given by the ratio of the weekly frequency at station $j$ and the maximum frequency of the network.
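As an illustration of Equation (1), the following minimal Python sketch computes the index for a single subsection. It assumes that the travel time is expressed in minutes of walking (so that λ = 0.084 is a per-minute decay) and that the nearest station is the one with the smallest walking time; the function names and the sample walking times and frequencies are hypothetical and not taken from the study.

```python
import math

LAMBDA = 0.084  # decay parameter from the text; assumed here to be per minute of walking


def attractiveness(weekly_frequency, max_weekly_frequency):
    """a_j: weekly frequency at station j relative to the network maximum."""
    return weekly_frequency / max_weekly_frequency


def accessibility(walk_minutes, weekly_frequencies, max_weekly_frequency, decay=LAMBDA):
    """A_i = a_j * exp(-decay * t_ij), where j is the nearest station.

    walk_minutes[k] is the walking time from the subsection to station k,
    weekly_frequencies[k] is the weekly frequency at that station.
    """
    # Nearest station = the one with the smallest walking time.
    j = min(range(len(walk_minutes)), key=walk_minutes.__getitem__)
    a_j = attractiveness(weekly_frequencies[j], max_weekly_frequency)
    return a_j * math.exp(-decay * walk_minutes[j])


# Hypothetical subsection: two candidate stations, 5 and 16 minutes away.
index = accessibility(
    walk_minutes=[5.0, 16.0],
    weekly_frequencies=[700, 1400],
    max_weekly_frequency=1400,
)

# Decay weight at the 16 min threshold mentioned in the text (roughly 0.26).
weight_at_threshold = math.exp(-LAMBDA * 16)
```

Under these assumptions, a station with the maximum network frequency located 16 min away retains about a quarter of its attractiveness, which shows how quickly the index falls off with walking time.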

4.2. Census and Mobility Survey Data

The AMP area comprises 22,699 subsections, representing the smallest available census block scale. We gathered demographic and socioeconomic information about the population residing in these census blocks. This information was obtained from the census conducted by the INE at the subsection level [50]. Subsequently, we selected 14 social, demographic, and trip-related variables to address the area’s mobility needs. The demographic data include the total population, the female population, the dependent population (comprising children under 15 years and individuals over 65 years), the student population, the population reliant on monthly social security payments or pensions, and the unemployed population. Regarding household characteristics, we considered houses based on their size in square meters and the availability of parking spaces for two or more cars. The socioeconomic category encompasses certain variables, such as the average individual income, the monthly rent value, and the proportion of car usage, which serves as a proxy variable for car ownership.

In 2017, a comprehensive mobility survey was conducted in the AMP using digital forms and face-to-face interviews, yielding data from approximately 71,000 trips [51]. The survey was designed to capture nuanced insights into transportation behaviors, encompassing detailed information regarding trip purposes and the utilization of various transport modes, including the following:

Trip Details: Participants specified the purpose of their trips, which included commuting to work, traveling to educational institutions, visiting healthcare facilities, shopping at retail outlets, engaging in leisure or sports activities, visiting family members, attending group activities, and various other miscellaneous purposes. This categorization of trip purposes enabled the identification of key travel needs and priorities within the metropolitan area.

Transport Modes Used: Respondents indicated their use of different transport modes, such as walking, cycling, motorcycles, private cars (both as drivers and passengers), and public transportation (buses and trains). By analyzing the preferred modes of transport, the survey provided valuable insights into the modal split and the factors influencing transportation choice.

As the mobility survey collected trip data for a population sample at the subsection level, we encountered the issue of missing values for some subsections. To estimate the trips for a subsection, we calculated a trips-per-individual factor using the available trip data and the population of the subsection. On a spatial scale, census data are also defined for sections, which sit hierarchically above the subsections. Therefore, we assumed that sections are homogeneous in their characteristics so that we could estimate the trips for all subsections within a section using the trips per individual and the respective subsection’s population. Subsequently, all reported trips, along with their associated transport mode and purpose (i.e., activities), were geographically mapped to the subsections based on their trip’s origin and destination.
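The estimation step described above can be sketched as follows. This is one plausible reading of the procedure, not the authors’ implementation: the trips-per-individual factor is assumed to be computed per section from the subsections that have survey data and then applied to the population of the subsections without data. The record layout and the function name are hypothetical.

```python
def fill_missing_trips(subsections):
    """Estimate trips for subsections without survey data.

    subsections: list of dicts with keys 'section', 'population', and
    'trips' (None when the survey has no trips for that subsection).
    Returns a new list in which missing trips are replaced by
    (section trips per individual) * (subsection population).
    """
    # Accumulate observed trips and population per section.
    totals = {}
    for s in subsections:
        if s["trips"] is not None:
            trips, pop = totals.get(s["section"], (0.0, 0))
            totals[s["section"]] = (trips + s["trips"], pop + s["population"])

    # Trips per individual, assuming each section is homogeneous.
    rates = {sec: t / p for sec, (t, p) in totals.items() if p > 0}

    filled = []
    for s in subsections:
        trips = s["trips"]
        if trips is None and s["section"] in rates:
            trips = rates[s["section"]] * s["population"]
        filled.append({**s, "trips": trips})
    return filled


# Hypothetical data: section "A" has one surveyed and one unsurveyed subsection.
sample = [
    {"section": "A", "population": 100, "trips": 250},
    {"section": "A", "population": 50, "trips": None},
    {"section": "B", "population": 80, "trips": None},
]
estimated = fill_missing_trips(sample)
```

In this sketch, a section with no surveyed subsection at all (section "B" above) is left without an estimate, because no factor can be derived for it.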

A summary of all of the variables used for the cluster analysis is provided in Table 1.

Using these variables, we employed the k-means clustering approach, which is well suited to larger datasets [52]. Data for 22,699 subsections were utilized for the clustering, which grouped the subsections with homogeneous characteristics regarding demographic information, socioeconomic information, trips, and accessibility.
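For readers unfamiliar with the technique, a minimal pure-Python sketch of k-means (Lloyd’s algorithm) is given below. It is not the authors’ implementation: the z-score standardization, the deterministic farthest-point initialization, and the two-variable sample data are assumptions made for illustration only.

```python
def standardize(rows):
    """Z-score each column so that variables on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=100):
    """Lloyd's algorithm with a simple deterministic farthest-point start."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical subsections described by (population, accessibility index).
example = [[1000, 0.90], [1100, 0.80], [950, 0.85],
           [50, 0.10], [60, 0.05], [40, 0.12]]
labels, centroids = kmeans(standardize(example), k=2)
```

Standardizing first matters here because raw population counts would otherwise dominate the distance calculation over an index bounded between 0 and 1.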

4.3. Service Area Analysis

Earlier, we measured walking access to rail stations using gravity measures incorporating rail frequency and walking times. Because multiple operators provide bus services in the AMP, obtaining comprehensive frequency or schedule information for most operators proved challenging. Consequently, we utilized service area analysis to estimate the demand served by PT, including bus and rail stops.

Access to PT can be estimated from the geographical coverage of services. The total population living within this geographical coverage is considered to be served by PT. This geographical coverage is measured by a service/catchment area around the station, defined by a preferred walking distance or time. The most frequently used distances are 400 m around bus stations and 800 m around rail stations [53,54,55]. These distances can vary depending on the characteristics of the area, PT services, and people’s preferences [56]. The IMob survey found that people typically walk up to 1200 m on average for daily activities [51].

Consequently, we considered a buffer size of 500 m for bus stations and 1200 m for rail stations. Service area polygons were drawn around the stations using street distances to provide the most precise distance estimation. Because the census does not record how the population is distributed within each census block, the census block polygons were converted to centroid points. The intersection of the service area polygons and the census block centroids identifies the census areas served by the stations. An overall estimation of the served and unserved census blocks was performed to analyze how well the system distribution covers the region.
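The centroid-in-polygon step can be sketched as follows. This is a simplified illustration rather than the ArcGIS workflow used in the study: the service area polygons are assumed to be given already (in the study they are derived from street-network distances), planar coordinates are assumed, and the sample polygon and census blocks are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex is
    implicitly connected back to the first.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def served_population(blocks, service_areas):
    """Return (served, total) population, counting a block as served when
    its centroid falls inside at least one service area polygon."""
    served = sum(
        b["population"] for b in blocks
        if any(point_in_polygon(b["centroid"], poly) for poly in service_areas)
    )
    total = sum(b["population"] for b in blocks)
    return served, total


# Hypothetical service area (a 10 x 10 square) and two census blocks.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
blocks = [
    {"centroid": (5, 5), "population": 100},
    {"centroid": (15, 5), "population": 50},
]
served, total = served_population(blocks, [square])
```

Treating a block as entirely served or unserved according to its centroid is the simplification that follows from not knowing where residents live within the block.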

Figure 2 shows the methodology framework adopted in this paper, including data collection, processing, methods, results, and recommendations.

5. Results

5.1. Cluster Analysis

While performing the clustering, outliers for some variables, such as trip origins and destinations, influenced the overall results. Therefore, we began by defining two clusters for these variables in the initial step. The results are presented in a bar chart in Figure 3.

In addition to the visual analysis presented in Figure 3, we also compiled the number of subsections for each cluster in Table 2.

Cluster 1 exhibits higher values across most variables than Cluster 2, indicating that Cluster 1 is characterized by a denser and more varied population with higher socioeconomic activity and accessibility. This cluster likely corresponds to highly populated urban areas. In contrast, Cluster 2 has lower values across most variables in the same categories, suggesting that it represents areas with a lower concentration of population and less socioeconomic activity. This cluster might correspond to suburban or rural areas with lower accessibility and fewer services. Results in Table 2 show that Cluster 1 comprises 2713 subsections classified as urban areas, while Cluster 2 includes the remaining 19,986 subsections, exhibiting a negative trend for all variables. This substantial number of subsections, representing approximately 88% of all subsections, demonstrates a uniform trend in variables and warrants further exploration. To further analyze and understand the characteristics of these subsections, we conducted a second k-means cluster analysis focusing specifically on the data from the subsections comprising Cluster 2 identified in step 1.

In addition to the visual analysis presented in Figure 4, we also compiled the number of subsections for each cluster in Table 3.

In Figure 4, the first cluster shows a notable demographic composition, with a higher number of females and dependents. It has a slightly lower value for small houses than large houses and a higher prevalence of houses with parking spaces. Additionally, this cluster features lower income and rent values, and these areas generate a significant number of trips. The proportion of car usage is high, along with low access. These characteristics are typical of suburban areas outside of the city. Thus, we classified them as suburban areas, comprising 5802 subsections. This suggests a balanced demographic and socioeconomic profile without significant extremes.

Cluster 2 displays below-average values with slight variations in population across all of the selected groups with lower incomes. This cluster reports a lower number of trips, average car usage, and the lowest access. It also features a smaller number of small and large house sizes with parking spaces. The presence of 13,229 subsections, characterized by sparse population and low-density residential areas, indicates a vast area with minimal population. Consequently, we have classified these areas as rural.

Cluster 3 displays maximum walking access to rail services, higher incomes, and a more significant number of smaller houses. The high value for access shows that these areas are closer to the rail stations and along the rail lines, which are concentrated in the central areas of the municipality. Thus, we classified them as central areas. The lower values for population groups can be attributed to the smaller size of this cluster, which comprises 955 subsections.

After finalizing the clusters, a comprehensive summary of all selected variables was compiled for each cluster based on the variable type. The summation of population groups reveals that Cluster 1, classified as the urban area, exhibits the highest values across several key metrics: maximum population, many small-sized houses, maximum income, maximum monthly rents, and the highest access to rail services by walking. In contrast, the values for these variables progressively decrease in the suburban and rural areas. Figure 5 provides an overview of the 2-step k-means cluster analysis framework employed and illustrates the selection of the final clusters for the analysis.
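The way the two clustering steps combine into four final clusters can be sketched as follows. The final numbering (1 urban, 2 suburban, 3 rural, 4 central) is inferred from the cluster descriptions in the text and is an assumption of this sketch; the subsection identifiers and the function name are hypothetical. The reported cluster sizes are included only as a consistency check against the 22,699 subsections.

```python
from collections import Counter

# Step-2 labels (1 suburban, 2 rural, 3 central) mapped to final cluster numbers.
STEP2_TO_FINAL = {1: 2, 2: 3, 3: 4}
FINAL_NAMES = {1: "urban", 2: "suburban", 3: "rural", 4: "central"}


def merge_two_step(step1, step2):
    """Combine labels from both k-means steps into one final label per subsection.

    step1: {subsection_id: 1 or 2}, where 1 is the urban cluster.
    step2: {subsection_id: 1, 2 or 3} for the subsections in step-1 Cluster 2.
    """
    final = {}
    for sub_id, label in step1.items():
        if label == 1:
            final[sub_id] = 1  # urban areas are fixed in step 1
        else:
            final[sub_id] = STEP2_TO_FINAL[step2[sub_id]]
    return final


# Hypothetical labels for four subsections.
step1 = {"s1": 1, "s2": 2, "s3": 2, "s4": 2}
step2 = {"s2": 1, "s3": 2, "s4": 3}
final = merge_two_step(step1, step2)
counts = Counter(FINAL_NAMES[c] for c in final.values())

# Cluster sizes reported in the text; together they cover all subsections.
REPORTED_SIZES = {"urban": 2713, "suburban": 5802, "rural": 13229, "central": 955}
total_subsections = sum(REPORTED_SIZES.values())
```

The sizes reported for the three step-2 clusters add up to the 19,986 subsections of step-1 Cluster 2, so the four final clusters partition the full set of subsections.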

5.2. Cluster Memberships

The finalized cluster membership associated with the subsections was then mapped to the AMP area, as depicted in Figure 6, illustrating the distribution of these clusters across the region. The areas in Cluster 4, the central areas, are predominantly concentrated in the central municipality of Porto, with some patches distributed in the municipalities of Matosinhos and Vila Nova de Gaia. In contrast, the other clusters exhibit a more heterogeneous distribution. Clusters 1 and 2 display a mixed distribution across the municipalities, differing primarily in population density, and they are designated as the urban and suburban clusters, respectively. The areas in Cluster 3 are dispersed towards the peripheries of the municipalities, situated far from the central areas, and they are classified as rural areas.

5.3. Service Area Polygons of Bus and Rail Stations

Using the ArcGIS Network Analyst tool, service areas were generated based on a walking distance of 1200 m around rail stations and 500 m around bus stations, as shown in Figure 7 and Figure 8. These service areas delineate the geographical coverage of the PT system and the served areas within the subsections that constitute the AMP region.
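A network-based service area differs from a circular buffer because distance is measured along walkable links rather than in a straight line. The sketch below illustrates that idea with Dijkstra's algorithm and a distance cutoff. It is not the ArcGIS Network Analyst implementation; the toy graph, the node names, and the `service_area_nodes` function are hypothetical, and only the 1200 m and 500 m thresholds come from the text.

```python
import heapq

def service_area_nodes(graph, station, max_walk_m):
    """Return {node: walking distance in m} for nodes within max_walk_m of station.

    graph maps each node to a list of (neighbor, link length in m) pairs.
    """
    best = {station: 0.0}
    queue = [(0.0, station)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, length_m in graph.get(node, []):
            new_dist = dist + length_m
            # Only keep nodes reachable within the walking threshold.
            if new_dist <= max_walk_m and new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor))
    return best

# Hypothetical walking network around one station (link lengths in meters).
graph = {
    "station": [("a", 400.0), ("b", 700.0)],
    "a": [("station", 400.0), ("c", 500.0)],
    "b": [("station", 700.0), ("c", 300.0), ("d", 600.0)],
    "c": [("a", 500.0), ("b", 300.0), ("d", 400.0)],
    "d": [("b", 600.0), ("c", 400.0)],
}
rail_area = service_area_nodes(graph, "station", 1200.0)  # rail threshold
bus_area = service_area_nodes(graph, "station", 500.0)    # bus threshold
print(sorted(rail_area), sorted(bus_area))
```

In this toy network, node `d` is 1300 m from the station along the shortest path, so it falls outside the 1200 m area even though it is connected to the network.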

Figure 7 illustrates the geographical distribution of rail stations and their corresponding service areas. Yellow dots scattered across the map represent the locations of rail stations, while the surrounding, brown-shaded areas indicate their service areas. Light-blue polygons represent the AMP subsections. The western and central parts of the AMP, especially along the coastline and within central urban areas, have a high concentration of rail stations, indicating a well-developed rail network in these regions.

The rail network primarily serves a small area of the AMP, following a radial distribution pattern. The dense service areas around the central part of the map indicate that central municipalities have the most extensive coverage, benefiting from a higher number of stations and robust rail service. These service areas also extend along the rail lines around dispersed stations, indicating low to moderate coverage. Conversely, the eastern, southern, southeastern, and northern municipalities have significant areas without coverage and lack access to rail stations.

The distribution of bus stops in the AMP region is widespread and scattered, with a higher concentration in the central and western municipalities, as shown in Figure 8. The red-shaded areas surrounding the bus stops denote the bus service areas, that is, the areas within walking distance of a stop, whose residents have access to bus services. The coverage can be categorized into high-density areas and peripheral areas. High-density areas encompass the central, urban, and some suburban regions, characterized by a dense network of bus stops and extensive bus service areas. In contrast, peripheral areas extend into the eastern, southern, southeastern, and northern peripheries, with fewer bus stops, indicating moderate to low coverage. Residents in these peripheral areas have limited access to bus services, potentially resulting in lower PT utilization and inequitable access to PT services.

Examining the three maps in Figure 6, Figure 7 and Figure 8 provides significant geographical insights into the distribution of rail and bus stops across the subsections and their cluster memberships. These maps indicate that rail and bus transport is concentrated in the most densely populated and urbanized areas, providing robust coverage where demand and needs are high, including Cluster 4 (central), Cluster 1 (urban), and parts of Cluster 2 (suburban). Peripheral and less populated regions, constituting Cluster 3, have sparser bus stops and smaller service areas with no access to rail, highlighting potential gaps in service that could affect residents’ mobility in these areas.

Integrating the coverage of the PT system with demographic information, trip activities, activity locations, and mode usage within the AMP subsections provides a comprehensive framework for analyzing how well the PT system serves different administrative regions. This method helps to identify service gaps and determine potential activities that can be performed using PT. It is also valuable for planning and improving PT services to achieve a modal shift.

5.4. Potential Demand for Public Transportation

To estimate the potential demand served by bus and rail, we considered each census block (subsection) within the geographical coverage of the service areas as served. For this purpose, we intersected the service areas with the centroid of every subsection, thereby identifying the subsections that fall within the service areas. For every cluster, we then measured the percentages of all population groups, house characteristics, and trips served by bus and rail; the average economic conditions given by car usage, income, and rent value; and the current conditions of walking access to the rail services. Table 4 summarizes the selected variables for the four clusters served by bus and rail.
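The centroid-based rule described above can be sketched in a few lines: compute each subsection's centroid, test whether it lies inside any service-area polygon, and sum the population of the subsections that pass. This is a minimal sketch under stated assumptions, not the GIS workflow used in the study; the function names, coordinates, and populations are hypothetical, and planar coordinates in a projected system are assumed.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross              # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def point_in_polygon(point, vertices):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge crosses the horizontal line through the point
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def served_share(subsections, service_areas):
    """subsections: {id: (vertices, population)}.

    Returns the ids whose centroid falls in any service area, and the
    share of total population living in those subsections.
    """
    served = [sid for sid, (verts, _) in subsections.items()
              if any(point_in_polygon(polygon_centroid(verts), area)
                     for area in service_areas)]
    total = sum(pop for _, pop in subsections.values())
    served_pop = sum(subsections[sid][1] for sid in served)
    return served, served_pop / total

# Hypothetical data: one square service area and three square subsections.
service_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
subsections = {
    "s1": ([(1, 1), (3, 1), (3, 3), (1, 3)], 100),
    "s2": ([(8, 8), (14, 8), (14, 14), (8, 14)], 300),  # centroid lies outside
    "s3": ([(4, 4), (6, 4), (6, 6), (4, 6)], 100),
}
print(served_share(subsections, [service_area]))
```

Note that a centroid rule treats a subsection as fully served or fully unserved: in the toy data, `s2` overlaps the service area but counts as unserved because its centroid lies outside it.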

Due to the rail network’s limited coverage and distribution across the region, with most services concentrated in a few municipalities, the population served by rail is relatively low compared to the bus network. Out of 22,699 census blocks, only 6399 (28%), representing a population of 655,040 (37%), are served by rail. In contrast, the bus service covers approximately 85% of the total population and a similar percentage of the disadvantaged population groups. Figure 9 illustrates this information.

The monthly rent value and the average income of individuals served by rail exceed those served by bus and the overall average income in the AMP. People served by rail tend to have the lowest car usage, owing to rail’s better and faster mobility.

Figure 10a–d compare how the bus and rail serve the population groups within Clusters 1, 2, 3, and 4. Among these clusters, rail service is most prominent in Cluster 4, which is situated in the central urban zones of the municipalities and near the rail network. Cluster 4 has a higher proportion of residents with higher incomes and lower car usage, highlighting the effectiveness of rail in these areas. Clusters 1 and 2, representing urban and suburban areas, can also benefit from rail services.

Rail services exhibit lower coverage in suburban areas, leading to inequities in the distribution of access as they fail to serve approximately 70% of the population. Residents in Clusters 2 and 3, characterized by lower incomes compared to other clusters, have a higher average proportion of car usage, at 65% and 61%, respectively, for a considerable number of reported trips. This higher reliance on cars in these areas is mainly due to the lack of access to PT services, which exacerbates the financial burden on low-income groups and contributes to social disparities among the population groups. Conversely, bus services perform best in Clusters 1, 2, and 4, encompassing the AMP’s central, urban, and suburban areas.

Comparing car usage between bus- and rail-served clusters reveals that people served by rail generally rely less on cars, as rail provides better service and faster mobility. Incomes also vary across clusters. Clusters 1 and 2 show higher incomes for people served by rail than those served by bus, whereas Clusters 3 and 4 show higher incomes among people served by bus than those served by rail. The income difference for Clusters 2 and 3 is relatively low at around EUR 100 per month.

In summary, Clusters 2 and 3 face the highest inequality in access to rail and bus services. While well-distributed bus stops provide fair access for shorter trips, innovative policies are needed to plan and improve services in these areas, thus enhancing connectivity and mobility options. Addressing these gaps can help alleviate the reliance on car usage, particularly for longer trips.

5.5. Modal Share for Activities Within Clusters and the Potential of Public Transportation

Using trip locations, we assigned trips to their respective clusters and determined the number of trips for various activities within each cluster, as presented in Figure 11. Subsequently, we identified whether individuals could perform their activities using PT while accessing or egressing from the stations within walking distance.

Cluster 1 accounted for the highest number of activities, followed by Clusters 3, 2, and 4. Work and education were the predominant activities, while health-related activities were less frequent. Notably, the category of other activities is also important because it includes return-home trips: most trips in the survey end at home, so these trips mark the conclusion of the entire trip.

Regarding the modal share for the total trips reported, private cars (drivers and passengers combined) dominate. Walking, cycling, and motorcycles are the second most frequent modes reported for performing activities, while bus and rail are the least-used modes, as depicted in Figure 12. Cluster 1 shows a substantial distribution of trips across all transport modes, with the highest share for rail (51%), followed closely by bus (45%). In Cluster 2, motorized modes are predominantly utilized (27%), followed by active modes (21%) and buses (20%), with rail (10%) being the least used. Cluster 3 demonstrates a higher dependency on private cars (30%) compared to active modes (23%) and buses (21%), with fewer trips by rail (14%). In Cluster 4, rail has the highest share (23%), followed by bus (22%) and active modes (19%), while private cars have the lowest (6%).
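Shares like those above depend on how trips are normalized: within a cluster across modes, or within a mode across clusters. The sketch below computes either table from trip records with a single helper. It assumes nothing about which normalization underlies Figure 12; the `share_table` function, the record keys, and the toy trips are hypothetical.

```python
from collections import Counter, defaultdict

def share_table(trips, group_key, value_key):
    """For each group, return the fraction of its trips falling in each value."""
    counts = defaultdict(Counter)
    for trip in trips:
        counts[trip[group_key]][trip[value_key]] += 1
    return {
        group: {value: n / sum(counter.values()) for value, n in counter.items()}
        for group, counter in counts.items()
    }

# Hypothetical trip records, each already assigned to a cluster.
trips = [
    {"cluster": 1, "mode": "car"},
    {"cluster": 1, "mode": "car"},
    {"cluster": 1, "mode": "bus"},
    {"cluster": 1, "mode": "rail"},
    {"cluster": 4, "mode": "rail"},
    {"cluster": 4, "mode": "car"},
]

# Modal share within each cluster (each cluster's shares sum to 1).
within_cluster = share_table(trips, "cluster", "mode")
# Distribution of each mode's trips across clusters (each mode's shares sum to 1).
across_clusters = share_table(trips, "mode", "cluster")
print(within_cluster)
print(across_clusters)
```

Stating the normalization alongside each reported percentage makes it easier to compare shares between clusters and between modes.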

Within the clusters, car usage dominates across all activities, with the highest percentage observed for work, as illustrated in Figure 13a–d. In Cluster 1, PT usage is minimal; the bus mode has the highest share at 14% for health services, 10% for education and work, 9% for other activities, and less than 8% for family visits, leisure, and shopping. This percentage is even lower for the rail mode, with a maximum of 7% for health, 6% for work, 5% for education and other activities, and less than 3% for the remaining activities. These percentages drop further across all activities in Clusters 2 and 3, with a modal share of less than 3% for rail.

Cluster 3 shows bus modal shares similar to Cluster 1, with the highest percentage of 15% for health activities, 10% for education, and 6% for work. In Cluster 4, rail dominates with modal shares of 13% for work, 12% for education, and 10% for health and other activities. This trend aligns with the highest level of rail services in Cluster 4, suggesting that greater PT service provision is associated with higher use.

Through the intersection of the activity locations and the service areas for both rail and bus, we identified various activities potentially performed or reached via PT, as illustrated in Figure 14a,b. In Cluster 4, people can potentially perform every activity, as these areas have the highest level of rail services in the centralized areas of the AMP. In Cluster 1, although health-related activities show the highest percentage served, they constitute only a small fraction of all activities. For the most significant activities, such as work, education, and other activities, the percentage is around 55%.

Clusters 2 and 3 have the lowest percentages of activities served, at just above 30% for each activity. Across the clusters, a higher share of activities can be reached by bus than by rail, as buses provide more coverage due to the wider distribution of their stops and network. However, suburban and rural areas face challenges due to lower service levels and poor connectivity, limiting access to activities in these areas. Therefore, improving service levels, connectivity, and accessibility could significantly boost PT ridership.

6. Discussions and Conclusions

In this article, we calculated the current accessibility conditions of rail services using gravity measures. Subsequently, we identified the social structure and spatial patterns within the AMP, focusing on disadvantaged population groups. This analysis was based on their socioeconomic and demographic backgrounds, trips, and calculated accessibility levels. A practical strategy involving cluster analysis, specifically utilizing the k-means algorithm, was employed to group the areas into distinct clusters with homogeneous characteristics. The cluster analysis categorized the 22,699 AMP subsections into four clusters to comprehensively identify and understand the diverse landscape. These clusters were identified as central urban zones, urban areas, suburban areas, and rural areas, with the primary contributing factors being higher demographic density and accessibility. Using mobility survey data, we spatially referenced the reported trips for various activities to their geographical subsection locations and the corresponding clusters they belong to. Additionally, we examined the modal share for the available private vehicles and PT modes used for different daily activities.

The results revealed an inequitable distribution of rail services across Clusters 1, 2, and 3, leading to higher use of PVs and low PT usage, particularly rail. In contrast, centrally located Cluster 4 exhibited a slightly higher modal share and better service for daily activities, such as work, education, and health. Urban and central–urban areas, despite having a higher concentration of rail stations and comprehensive geographical coverage by the public transportation system, including rail and bus stops, experience inadequate service frequencies. Addressing these barriers and increasing the frequency of services can lead to higher PT utilization in these high-demand areas. Similarly, Brechan (2017) [57] underscored the crucial role of PT services, highlighting that even in urban areas with extensive geographic coverage of PT, insufficient service frequencies can significantly hinder public transport usage, thereby exacerbating inequities.

Suburban and rural areas face the highest inequality in access to rail and bus services, particularly in the peripheries. These areas lack spatial access to PT stops, further aggravated by lower service frequencies, leading to higher car usage. By targeting these areas for improved service delivery, policymakers can reduce reliance on PVs and enhance access to essential activities, such as work and education. This targeted approach ensures that public investment funds are allocated to areas where they can most effectively enhance accessibility and mobility [58]. Guzman et al. (2023) [59] described that proper investment in public transport services reduces travel costs and attracts new users, amplifying the benefits for existing and potential passengers.

Expanding integrated PT infrastructure to underserved suburban and rural areas can improve accessibility and reduce social disparities. Rail development offers faster mobility and fosters regional economic growth, while bus services enhance connectivity [60]. Reliable and accessible PT options that connect people to jobs, education, healthcare, and other vital services foster greater social inclusion and improve the quality of life for all residents [11]. Promoting equity is crucial to ensuring that all demographic groups, especially those who are historically underserved, have equitable access to essential services and opportunities. This can be achieved by prioritizing investments in areas with the greatest need and addressing service gaps disproportionately affecting specific populations [31,61]. Utilizing statistical cluster analysis to comprehend different demographic groups with diverse characteristics, alongside geographical techniques to assess gaps or inequities in PT service provision, can help explore their mode choices, travel behaviors, and interactions with the PT system.

Our study contributes to the literature by focusing on the geographic context to understand the challenges inherent in urban, suburban, and rural areas. This approach goes beyond merely analyzing individual characteristics and modal choices, providing a comprehensive examination of the unique challenges faced by the AMP population in accessing and utilizing PT services for their reported daily activities. This comprehensive perspective offers valuable insights into the diverse needs and obstacles present in different environments.

These findings have practical implications for policy and practice in the realm of PT and urban mobility. By leveraging these insights, policymakers can create a more inclusive, efficient, and sustainable PT system that better serves the diverse needs of the AMP population. This detailed analysis enables the implementation of specific interventions tailored to different social and demographic groups’ unique needs, ultimately leading to a more equitable and accessible transportation network. Based on our analysis, we propose several recommendations to improve public transport accessibility and reduce inequalities:

Policy implications: Our findings suggest that improving service quality in high-demand areas encourages PT usage and enhances accessibility. Identifying areas facing inequalities facilitates targeted policy interventions and effectively prioritizes investments to address mobility needs.
Increasing service frequencies: Enhance the frequency of PT services in urban and suburban areas to reduce dependence on private cars.
Expanding public transport coverage: Expand the PT network to include suburban and rural areas, thereby increasing accessibility to bus and rail stops for a larger population group, thus providing equitable access to services for individuals residing outside of urban centers.
Integration of transport modes: Promote the integration of different transport modes (e.g., bicycles, buses, trains) to facilitate transitions between different forms of travel, thus improving connectivity.
Social inclusion policies: Develop inclusive transport policies that consider the needs of different social groups, especially those who heavily rely on public transport.

Such improvements contribute to broader sustainability goals by reducing traffic congestion and environmental impact, creating a more sustainable and livable urban environment. Moreover, the methodologies and findings from this study apply to other cities facing similar challenges in PT accessibility and equity. By adapting our approach to different urban contexts, policymakers in other regions can design targeted interventions to address the specific needs of their populations.

Although this approach holds promise for policymakers as it provides more nuanced insight into accessibility across regions, the presented work has some limitations due to the limited availability of data on PT routes, services, and frequencies. These limitations in the data and the analysis provide avenues for future research through the following tasks:

Incorporating frequencies at bus stops: Include data from multiple service providers to better estimate the supply of bus services.
Including the new and extended rail lines: Integrate the new and extended rail lines into the analysis by considering both under-construction metro lines and planned extensions of existing rail lines to evaluate the evolving rail infrastructure and its potential impact on PT accessibility.
Determining potential activities for reported trips: Advance this work to identify potential activities reachable through PT, considering key activity locations, such as schools, hospitals, public places, and shopping centers.
Evaluating PT service delivery: Assess PT service effectiveness in accessing essential destinations to provide a comprehensive understanding of PT performance and areas for improvement.

Efforts to improve public transportation accessibility and services should aim to meet the needs of diverse activities and address inequities across the entire social structure, thereby enhancing the transport planning and policymaking process.

Author Contributions

Conceptualization, M.S. and A.C.; methodology, M.S., H.R. and A.C.; software, M.S. and H.R.; validation, M.S.; formal analysis, M.S. and H.R.; data curation, M.S., H.R. and S.F.; writing—original draft preparation, M.S. and H.R.; writing—review and editing, A.C. and S.F.; visualization, M.S.; supervision, A.C. and S.F.; funding acquisition, M.S. and A.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Fundação para a Ciência e a Tecnologia (FCT), Portugal, through grant 2020.05098.BD, and Base Funding allocated by the FCT/MCTES (PIDDAC) to CITTA—Research Centre for Territory, Transports and Environment (UIDB/04427/2020).

Institutional Review Board Statement

This study does not require ethical approval.

Informed Consent Statement

Not applicable.

Data Availability Statement

The survey participants did not give written consent for their data to be shared publicly; therefore, due to the sensitive nature of the research, supporting data are unavailable. Interested parties wishing to access the data are kindly requested to contact Statistics Portugal (INE).

Acknowledgments

We would like to acknowledge Statistics Portugal (INE) for providing us with the data used in this study. In all cases, their commitment reflected a genuine desire to improve the mobility services of the Porto Metropolitan Area (AMP) for all people.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Litman, T. Evaluating Transportation Equity: Guidance for Incorporating Distributional Impacts in Transportation Planning. Vic. Transp. Policy Inst. Vic. 2014, 8, 3–40. Available online: https://nacto.org/wp-content/uploads/2015/07/2014_Litman_Evaluating-Transportation-Equity.pdf (accessed on 10 November 2023).
2. Mavoa, S.; Witten, K.; McCreanor, T.; O’Sullivan, D. GIS based destination accessibility via public transit and walking in Auckland, New Zealand. J. Transp. Geogr. 2012, 20, 15–22.
3. Miller, P.; de Barros, A.G.; Kattan, L.; Wirasinghe, S.C. Public transportation and sustainability: A review. KSCE J. Civ. Eng. 2016, 20, 1076–1083.
4. Al Suleiman, S.; Cortez, A.; Monzón, A.; Lara, A. How to improve public transport usage in a medium-sized city: Key factors for a successful bus system. Eur. Transp. Res. Rev. 2023, 15, 47.
5. Rocha, H.; Lobo, A.; Tavares, J.P.; Ferreira, S. Exploring Modal Choices for Sustainable Urban Mobility: Insights from the Porto Metropolitan Area in Portugal. Sustainability 2023, 15, 14765.
6. Berg, J.; Ihlström, J. The importance of public transport for mobility and everyday activities among rural residents. Soc. Sci. 2019, 8, 58.
7. Lucas, K. Transport and social exclusion: Where are we now? Transp. Policy 2012, 20, 105–113.
8. Lei, T.L.; Church, R.L. Mapping transit-based access: Integrating GIS, routes and schedules. Int. J. Geogr. Inf. Sci. 2010, 24, 283–304.
9. Rocha, H.; Filgueiras, M.; Tavares, J.P.; Ferreira, S. Public Transport Usage and Perceived Service Quality in a Large Metropolitan Area: The Case of Porto. Sustainability 2023, 15, 6287.
10. Farber, S.; O’Kelly, M.; Harvey, J.; Miller, H.J.; Tijs, T. Measuring segregation using patterns of daily travel behavior: A social interaction based model of exposure. J. Transp. Geogr. 2015, 49, 26–38.
11. Farrington, J.; Farrington, C. Rural accessibility, social inclusion and social justice: Towards conceptualisation. J. Transp. Geogr. 2005, 13, 1–12.
12. Bertolini, L.; le Clercq, F.; Kapoen, L. Sustainable accessibility: A conceptual framework to integrate transport and land use plan-making. Two test-applications in the Netherlands and a reflection on the way forward. Transp. Policy 2005, 12, 207–220.
13. Yang, R.; Liu, Y.; Liu, Y.; Liu, H.; Gan, W. Comprehensive public transport service accessibility index-a new approach based on degree centrality and gravity model. Sustainability 2019, 11, 5634.
14. Blumenberg, E.A.; Shiki, K. How Welfare Recipients Travel on Public Transit, and Their Accessibility to Employment Outside Large Urban Centers. Calif. Digit. Libr. 2003, 57, 00941228. Available online: https://escholarship.org/uc/item/04k2w2k7 (accessed on 11 June 2024).
15. Liu, Z.; Zhao, P.; Liu, Q.; He, Z.; Kang, T. Uncovering spatial and social gaps in rural mobility via mobile phone big data. Sci. Rep. 2023, 13, 6469.
16. Bokhari, A.; Sharifi, F. Public Transport Inequality and Utilization: Exploring the Perspective of the Inequality Impact on Travel Choices. Sustainability 2024, 16, 5404.
17. Niemeier, D.A. Accessibility: An evaluation using consumer welfare. Transportation 1997, 24, 377–396.
18. Handy, S.L.; Niemeier, D.A. Measuring accessibility: An exploration of issues and alternatives. Environ. Plan. A 1997, 29, 1175–1194.
19. Sanchez, T.W.; Stolz, R.; Ma, J.S. Inequitable effects of transportation policies on minorities. Transp. Res. Rec. 2004, 1885, 104–110.
20. Boisjoly, G.; Serra, B.; Oliveira, G.T.; El-Geneidy, A. Accessibility measurements in São Paulo, Rio de Janeiro, Curitiba and Recife, Brazil. J. Transp. Geogr. 2020, 82, 102551.
21. Elkafoury, A.; Zagow, M.; Saeed, K.; Darwish, A.M. Model Willingness to Use Public Transport in the USA Based on Socio-Economic and Demographic Characteristics. Civ. Eng. Archit. 2023, 11, 1487–1497.
22. Wang, C.H.; Chen, N. A GIS-based spatial statistical approach to modeling job accessibility by transportation mode: Case study of Columbus, Ohio. J. Transp. Geogr. 2015, 45, 1–11.
23. Kabir, S.M. Methods of Data Collection. In Basic Guidelines for Research: An Introductory Approach for All Disciplines; First Chapter: 9; Book: Chittagong, Bangladesh, 2016; pp. 201–275. Available online: https://www.researchgate.net/publication/325846997_METHODS_OF_DATA_COLLECTION (accessed on 17 January 2024).
24. Kim, H.S.; Kim, E. Effects of public transit on automobile ownership and use in households of the USA. Rev. Urban Reg. Dev. Stud. 2004, 16, 245–262.
25. Jomehpour Chahar Aman, J.; Smith-Colin, J. Transit Deserts: Equity analysis of public transit accessibility. J. Transp. Geogr. 2020, 89, 102869.
26. Currie, G. Gap analysis of public transport needs: Measuring spatial distribution of public transport needs and identifying gaps in the quality of public transport provision. Transp. Res. Rec. 2004, 1895, 137–146.
27. Fransen, K.; Neutens, T.; Farber, S.; De Maeyer, P.; Deruyter, G.; Witlox, F. Identifying public transport gaps using time-dependent accessibility levels. J. Transp. Geogr. 2015, 48, 176–187.
28. Abdullah, M.; Ali, N.; Javid, M.A.; Dias, C.; Campisi, T. Public transport versus solo travel mode choices during the COVID-pandemic: Self-reported evidence from a developing country. Transp. Eng. 2021, 5, 100078.
29. Ali, H.; Abdullah, M. Exploring the perceptions about public transport and developing a mode choice model for educated disabled people in a developing country. Case Stud. Transp. Policy 2023, 11, 100937.
30. Hasan, A.; Hasan, U.; AlJassmi, H.; Whyte, A. Transit Behaviour and Sociodemographic Interrelation: Enhancing Urban Public-Transport Solutions. Eng 2023, 4, 1144–1155.
31. Delbosc, A.; Currie, G. Using Lorenz curves to assess public transport equity. J. Transp. Geogr. 2011, 19, 1252–1259.
32. Liu, J.; Meng, B.; Xu, J.; Li, R. Exploring Public Transportation Supply–Demand Structure of Beijing from the Perspective of Spatial Interaction Network. ISPRS Int. J. Geo-Inf. 2023, 12, 213.
33. van Wee, B.; de Jong, T. Differences in levels of accessibility: The importance of spatial scale when measuring distributions of the accessibility of health and emergency services. J. Transp. Geogr. 2023, 106, 103511.
34. Golub, A.; Martens, K. Using principles of justice to assess the modal equity of regional transportation plans. J. Transp. Geogr. 2014, 41, 10–20.
35. Azmoodeh, M.; Haghighi, F.; Motieyan, H. Proposing an integrated accessibility-based measure to evaluate spatial equity among different social classes. Environ. Plan. B Urban Anal. City Sci. 2021, 48, 2790–2807.
36. Jang, S.; An, Y.; Yi, C.; Lee, S. Assessing the spatial equity of Seoul’s public transportation using the Gini coefficient based on its accessibility. Int. J. Urban Sci. 2017, 21, 91–107.
37. Liu, C.; Bardaka, E. The suburbanization of poverty and changes in access to public transportation in the Triangle Region, NC. J. Transp. Geogr. 2021, 90, 102930.
38. Grisé, E.; El-Geneidy, A. Where is the happy transit rider? Evaluating satisfaction with regional rail service using a spatial segmentation approach. Transp. Res. Part A Policy Pract. 2018, 114, 84–96.
39. Oostendorp, R.; Gebhardt, L. Combining means of transport as a users’ strategy to optimize traveling in an urban context: Empirical results on intermodal travel behavior from a survey in Berlin. J. Transp. Geogr. 2018, 71, 72–83.
40. Mohri, S.S.; Mortazavi, S.; Nassir, N. A clustering method for measuring accessibility and equity in public transportation service: Case study of Melbourne. Sustain. Cities Soc. 2021, 74, 10324.
41. Wang, Z.; Han, Q.; de Vries, B. Land Use/Land Cover and Accessibility: Implications of the Correlations for Land Use and Transport Planning. Appl. Spat. Anal. Policy 2019, 12, 923–940.
42. Ivan, I.; Horak, J.; Fojtik, D.; Inspektor, T. Multidimensional Evaluation of Public Transport Accessibility. In Dynamics in GIscience; GIS OSTRAVA 2017. Lecture Notes in Geoinformation and Cartography; Ivan, I., Horák, J., Inspektor, T., Eds.; Springer: Cham, Switzerland, 2018; pp. 149–164.
43. Tiznado-Aitken, I.; Muñoz, J.C.; Hurtubia, R. Public transport accessibility accounting for level of service and competition for urban opportunities: An equity analysis for education in Santiago de Chile. J. Transp. Geogr. 2021, 90, 102919.
44. Everitt, B. Cluster analysis. Qual. Quant. 1980, 14, 75–100.
45. Kaufman, L.; Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis; Introduction Chapter; John Wiley & Sons, Inc: Hoboken, NJ, USA, 1990; pp. 1–67.
46. Instituto Nacional De Estatística (AER). Available online: https://www.gee.gov.pt/pt/docs/doc-o-gee-2/estatisticas-regionais/nut-ii-nut-iii/norte/area-metropolitana-do-porto (accessed on 8 January 2024).
47. Porto Metro: What You Need to Know. Available online: https://www.beportugal.com/porto-metro/ (accessed on 12 November 2023).
48. Ribeiro, J.; Fontes, T.; Soares, C.; Borges, J.L. Accessibility as an indicator to estimate social exclusion in public transport. Transp. Res. Procedia 2021, 52, 740–747.
49. Neutens, T.; Schwanen, T.; Witlox, F.; de Maeyer, P. Equity of urban service delivery: A comparison of different accessibility measures. Environ. Plan. A 2010, 42, 1613–1635.
50. Instituto Nacional de Estatística. Available online: https://censos.ine.pt/xportal/xmain?xpid=CENSOS&xpgid=censos2011_apresentacao (accessed on 12 November 2023).
51. Instituto Nacional de Estatística—Mobility and Functionality of the Territory in the Metropolitan Areas of Porto and Lisbon. 2017. Available online: https://www.ine.pt/xurl/pub/349495406 (accessed on 16 January 2023).
52. Steinley, D. K-means clustering: A half-century synthesis. Br. J. Math. Stat. Psychol. 2006, 59, 1–34.
53. Bree, S.; Fuller, D.; Diab, E. Access to transit? Validating local transit accessibility measures using transit ridership. Transp. Res. Part A Policy Pract. 2020, 141, 430–442.
54. Otsuka, N.; Delmastro, T.; Wittowsky, D.; Pensa, S.; Damerau, M. Assessing the accessibility of urban nodes: The case of TEN-T railway stations in Europe. Appl. Mobilities 2019, 4, 219–243.
55. Sun, Y.; Thakuriah, P. Public transport availability inequalities and transport poverty risk across England. Environ. Plan. B Urban Anal. City Sci. 2021, 48, 2775–2789.
56. El-Geneidy, A.; Grimsrud, M.; Wasfi, R.; Tétreault, P.; Surprenant-Legault, J. New evidence on walking distances to transit stops: Identifying redundancies and gaps using variable service areas. Transportation 2014, 41, 193–210.
57. Brechan, I. Effect of Price Reduction and Increased Service Frequency on Public Transport Travel. J. Public Transp. 2017, 20, 139–156.
58. Mitropoulos, L.; Karolemeas, C.; Tsigdinos, S.; Vassi, A.; Bakogiannis, E. A composite index for assessing accessibility in urban areas: A case study in Central Athens, Greece. J. Transp. Geogr. 2023, 108, 103566.
59. Guzman, L.A.; Cantillo-Garcia, V.A.; Oviedo, D.; Arellana, J. How much is accessibility worth? Utility-based accessibility to evaluate transport policies. J. Transp. Geogr. 2023, 112, 103683.
60. Lai, Q. The Transportation Infrastructure and Regional Economic Growth—Evidence from Dongguan Humen Bridge. Mod. Econ. 2020, 11, 2055–2080.
61. Pereira, R.H.M.; Schwanen, T.; Banister, D. Distributive justice and equity in transportation. Transp. Rev. 2016, 37, 170–191.

Figure 1. Distribution of bus stops and rail stations across municipalities in the AMP (source: mapped using ArcGIS Pro 3.3.1).

Figure 2. Conceptual methodology framework.

Figure 3. k-means cluster analysis: step 1 (source: IBM SPSS 27).

Figure 4. k-means cluster analysis: step 2 (source: IBM SPSS 27).

Figure 5. Selection of final clusters using a 2-step k-means cluster analysis framework.

Figure 6. Clusters’ distribution across the AMP (source: mapped using ArcGIS Pro 3.3.1).

Figure 7. Subsections served by rail service areas (1200 m) (source: mapped using ArcGIS Pro 3.3.1).

Figure 8. Subsections served by bus service areas (500 m) (source: mapped using ArcGIS Pro 3.3.1).

Figure 9. The percentage of demographics and trips served by bus and rail in the AMP.

Figure 10. (a–d) Percentage of demographic groups and trips served by bus and rail.

Figure 11. Activities by category within each cluster.

Figure 12. Mode share by category within each cluster.

Figure 13. (a–d) Percentage of modal share for all activities within clusters.

Figure 14. (a,b) Percentage of potential activities within the service areas of rail and bus.

Table 1. Summary of selected variables for demographic, socioeconomic, and trip information and walking access.

Categories	Variables	Number of Subsections	Maximum Value	Average	Sum
Demographic Groups and House Characteristics	Population	22,699	1727	77.52	1,759,524
	Female Population	22,699	934	40.56	920,608
	Dependent Population	22,699	463	24.27	550,939
	Student Population	22,699	417	15.03	341,139
	Social Population	22,699	708	42.85	972,677
	Unemployed Population	22,699	134	5.03	114,238
	House Size < 50 m²	22,699	141	2.93	66,499
	House Size > 200 m²	22,699	74	2.20	49,952
	House Parking 2+	22,699	178	6.81	154,507
Socioeconomic Variables	Proportion of Car Usage	22,699	82.11	60.93	-
	Monthly Rent Value	22,699	396	192.22	-
	Income per Individual	22,699	12,131	7307.75	-
Trips	Origins and Destinations	22,699	1010	9	-
Access to Rail	Walking Access	22,699	0.948	0.032	-

Table 2. Number of subsections in each cluster (step 1).

Cluster 1	Cluster 2	Total Subsections
2713	19,986	22,699

Table 3. Number of subsections in each cluster (step 2).

Cluster 1	Cluster 2	Cluster 3	Total Subsections
5802	13,229	955	19,986
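
The two-step split summarized in Tables 2 and 3 can be illustrated with a minimal sketch. The figure captions indicate the analysis was run in IBM SPSS 27, so the code below is not the authors' procedure: it is a plain Lloyd's k-means written for illustration, and it assumes that step 2 re-clusters the larger of the two step-1 clusters (consistent with the Table 3 total of 19,986 matching Cluster 2 in Table 2). The function names, the toy points, and the final label numbering are hypothetical; variable standardization and the choice of the number of clusters are not shown.

```python
import random

def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    return dists.index(min(dists))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]      # assignment step
        new_centers = []
        for c in range(k):                                   # update step
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])               # keep an empty cluster's center
        if new_centers == centers:                           # converged
            break
        centers = new_centers
    return labels

def two_step_kmeans(points, k1=2, k2=3, seed=0):
    """Step 1: k1 clusters. Step 2: split the largest step-1 cluster into k2.

    Assumed labeling: step-2 clusters are 0..k2-1; the remaining step-1
    clusters are numbered from k2 upward.
    """
    step1 = kmeans(points, k1, seed=seed)
    sizes = [step1.count(c) for c in range(k1)]
    largest = sizes.index(max(sizes))
    idx = [i for i, lab in enumerate(step1) if lab == largest]
    step2 = kmeans([points[i] for i in idx], k2, seed=seed)
    others = [c for c in range(k1) if c != largest]
    final = [k2 + others.index(lab) if lab != largest else -1 for lab in step1]
    for i, lab in zip(idx, step2):
        final[i] = lab
    return final

# Toy data only; real inputs would be the per-subsection variables of Table 1.
points = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.9), (5.0, 5.0), (5.5, 5.1), (5.2, 4.7),
          (10.0, 0.0), (10.4, 0.3), (9.8, 0.6), (50.0, 50.0), (50.5, 49.5), (49.7, 50.2)]
labels = two_step_kmeans(points)
print(labels)
```

With `k1=2` and `k2=3` the sketch yields at most four final labels, mirroring the four clusters reported in the paper. Which points land in which cluster depends on the random initialization, so results should be checked across several seeds.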

Table 4. Socioeconomic and demographic characteristics of bus and rail clusters.

Categories	Variables	Cluster 1 (Bus)	Cluster 1 (Rail)	Cluster 2 (Bus)	Cluster 2 (Rail)	Cluster 3 (Bus)	Cluster 3 (Rail)	Cluster 4 (Bus)	Cluster 4 (Rail)	All Clusters (Bus)	All Clusters (Rail)
k-means Clusters	Clusters (%)	90.16	48.10	82.25	27.94	73.01	20.02	98.32	86.28	78.49	28.19
Demographic Groups and House Characteristics	Total Population (%)	90.21	48.21	82.85	29.65	75.26	19.90	98.77	85.33	84.89	37.23
	Female Population (%)	90.52	48.94	83.05	29.99	75.45	20.11	98.77	85.56	85.20	37.87
	Dependent Population (%)	90.47	49.60	82.83	29.97	75.79	20.60	98.83	85.68	85.10	38.18
	Student Population (%)	90.15	46.73	82.93	29.04	75.08	19.08	98.90	85.52	85.03	36.38
	Social Population (%)	90.75	51.29	82.49	30.53	75.90	21.33	98.82	85.26	84.99	39.09
	Unemployed Population (%)	89.98	46.37	83.34	29.01	77.04	20.66	98.93	84.79	85.74	37.20
	House Size < 50 m² (%)	91.99	55.07	83.28	27.34	76.75	20.64	97.49	82.01	87.15	42.95
	House Size > 200 m² (%)	87.33	39.72	80.29	24.67	74.52	20.71	97.37	86.84	80.88	29.06
	House Parking 2+ (%)	86.33	39.36	78.93	22.93	73.14	16.46	97.49	87.70	79.79	27.01
Socioeconomic Variables	Average Proportion of Car Usage	58.68	54.80	64.57	62.63	61.04	56.25	46.48	45.57	60.89	56.20
	Average Monthly Rent Value	208.24	214.98	205.35	215.54	188.03	186.02	195.34	196.73	195.83	200.78
	Average Income per Individual	8923	9353	7644	7743	6835	6756	10,822	10,674	7548	8040
Trips	Origins and Destinations (%)	93.28	57.70	83.27	28.68	73.80	18.74	97.85	86.70	86.18	41.73
Access to Rail	Average Walking Access	0.091	0.149	0.023	0.048	0.016	0.044	0.222	0.239	0.039	0.091
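
As a reading aid for the percentage rows of Table 4, the sketch below shows one way such shares could be computed. It assumes each share is the served total divided by the overall total, that "Clusters (%)" counts subsections, and that the demographic rows are weighted by the corresponding variable. The `bus` and `rail` flags, the field names, and the toy numbers are hypothetical. How a subsection is assigned to a 500 m bus or 1200 m rail service area is not reproduced here.

```python
def percent_served(subsections, mode, variable=None):
    """Share (%) served by `mode`.

    Counts subsections when `variable` is None; otherwise sums `variable`
    (for example, population) over served subsections and over all of them.
    """
    def weight(s):
        return 1 if variable is None else s[variable]

    total = sum(weight(s) for s in subsections)
    served = sum(weight(s) for s in subsections if s[mode])
    return 100.0 * served / total if total else 0.0

# Toy subsections; flags mark whether a subsection falls in a service area.
toy = [
    {"population": 120, "bus": True, "rail": True},
    {"population": 50, "bus": True, "rail": False},
    {"population": 20, "bus": True, "rail": False},
    {"population": 10, "bus": False, "rail": False},
]

print(percent_served(toy, "bus"))                 # share of subsections served by bus
print(percent_served(toy, "bus", "population"))   # share of population served by bus
print(percent_served(toy, "rail", "population"))  # share of population served by rail
```

In the toy data the population share served by bus exceeds the subsection share because the unserved subsection is sparsely populated. The same pattern appears in the "All Clusters" bus columns of Table 4 (84.89% of population against 78.49% of subsections).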

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
