4.1. Clustering Pipeline Results
Initially, the process outlined in Figure 3 was executed within RStudio [46]. The raw data consisted of 972 repeated-measure observations in total, as outlined in Section 3.2. After preprocessing (data scaling, one-hot encoding, and duplicate removal), 935 observations were retained, indicating that most traveling conditions were unique and that duplicates were rare.
The contextual variables selected for the PCA-supported dimensionality reduction comprise the cost, travel time, frequency, and walking time attributes, as well as the reliability, comfort, safety, cleanness, and ticket type of the two public transport alternatives, amounting to a data frame of 14 features in total. PCA was conducted on the scaled and curated data and reduced the dataset to 10 components.
DBSCAN was then applied to the PCA-reduced components. After comparing the outputs of various minimum point density and neighborhood size configurations, and using the k-NN distance plot of Figure 4 as a reference, the clustering results were obtained for a minimum point density of five points and a neighborhood size of 2.4.
The selection of these parameters combined empirical guidance, which recommends a neighborhood size just before the sharp increase in the gradient of the k-NN distance plot, with data-driven results. The final DBSCAN clustering yielded four clusters in total. The mean and standard deviation values per cluster profile are summarized in Table 4 and Table 5.
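To make the parameter selection concrete, the sketch below implements the k-nearest-neighbor distance curve and a minimal DBSCAN in plain Python. It is an illustrative re-implementation and not the R code used in the study: the toy points, the function names, and the toy settings (eps = 1.5) are assumptions, and the reported settings (five minimum points, neighborhood size 2.4) would apply to the PCA-reduced data rather than to this example.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def k_distances(points, k):
    """Sorted distance of every point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result)


def region_query(points, i, eps):
    """Indices of all points within eps of point i (point i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters, NOISE otherwise."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Toy data (hypothetical): two compact groups and one isolated point.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
points = blob_a + blob_b + [(50, 50)]

curve = k_distances(points, k=4)  # k = min_pts - 1 is a common convention
labels = dbscan(points, eps=1.5, min_pts=5)
n_clusters = len({label for label in labels if label != NOISE})
print(curve)
print(labels, n_clusters)
```

In the sorted curve, the jump between the compact points and the isolated point marks the range in which a neighborhood size would be chosen, which mirrors the reading of the k-NN distance plot described above.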
Finally, the t-SNE algorithm was applied to visualize the cluster distribution across the feature space. The results appear in Figure 5.
Based on the values of Table 4 and Table 5 and the t-SNE mapping of Figure 5, an in-depth interpretation can be obtained for the choice processes of the participants in the context of Larissa.
Cluster 1 contains observations considering that the present PT options have low reliability and comfort with comparatively high standard deviations, denoting perceived inconsistencies. Safety and cleanness are perceived as moderate. Preferred cost and walking time are moderate as well, while this cluster comprises conditions that are described by regular tickets and daily usage. Regarding mode usage, t-SNE shows a mixed output, but tram is favored overall within this cluster. It is thus likely that the presently perceived low comfort pushes members of this cluster towards a tram preference.
Cluster 2 contains observations considering that the present PT options have moderate-high reliability and comfort, with low standard deviations indicating consistency in their perception. Likewise, safety and cleanness are perceived as above average. Preferred cost and walking time are towards the lower range, while this cluster comprises conditions that are described by regular tickets as well. Mode use frequency is quite mixed. Visually, t-SNE shows a preference towards bus usage overall. Therefore, this cluster represents users who prioritize costs and perceive existing services as reliable and comfortable, which is consistent with their overall preference towards buses.
Cluster 3 contains observations considering that the present PT options have moderate-high reliability, comfort, safety, and cleanness with notable robustness. Preferred cost and walking time are modest but varying, while this cluster comprises conditions that are described by regular ticket daily users. Visually, t-SNE depicts this cluster as considerably tram-oriented. It is probable that the members of this cluster are placing their preference towards a future tram service that they believe is going to be even more comfortable and that will fit their present travel arrangements.
Cluster 4 contains observations considering that the present PT might have the highest reliability, comfort, safety and cleanness with the highest perceived robustness, while this cluster comprises conditions that are described by regular ticket daily users. This cluster comprises users who experience the best overall service but are willing to walk more and to pay slightly higher costs. They mostly favor trams as depicted by t-SNE.
4.2. Logit Modeling Estimation Results
This section presents the estimated parameters of the fixed and random parameter logit models. A total of four choice models were estimated based on the SP survey experiment; the implications of these findings are discussed later in the paper. To achieve the aims of the study, a series of binary logistic models was developed, numbered as follows:
1. Fixed effects with generic coefficients;
2. Fixed effects with mode-specific coefficients;
3. Random parameter (mixed logit) with generic coefficients;
4. Random parameter (mixed logit) with mode-specific coefficients.
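For orientation, the four specifications can be written in a generic textbook form. This is a sketch under stated assumptions rather than the exact specification estimated in this study: the attribute vector x_j, the alternative-specific constant ASC_j (with the bus as the reference alternative), and the individual index n are notation introduced here for illustration.

```latex
\begin{align}
  \text{Generic coefficients:} \quad
    & V_{j} = \mathrm{ASC}_{j} + \boldsymbol{\beta}^{\top}\mathbf{x}_{j},
      \qquad j \in \{\text{tram}, \text{bus}\} \\
  \text{Mode-specific coefficients:} \quad
    & V_{j} = \mathrm{ASC}_{j} + \boldsymbol{\beta}_{j}^{\top}\mathbf{x}_{j} \\
  \text{Binary logit probability:} \quad
    & P(\text{tram}) = \frac{e^{V_{\text{tram}}}}{e^{V_{\text{tram}}} + e^{V_{\text{bus}}}}
      = \frac{1}{1 + e^{-(V_{\text{tram}} - V_{\text{bus}})}} \\
  \text{Random parameters (mixed logit):} \quad
    & \beta_{n,k} \sim \mathcal{N}\!\left(\mu_{k}, \sigma_{k}^{2}\right)
\end{align}
```

In the fixed effects models the coefficients are common to all respondents, whereas in the mixed logit models the selected coefficients vary across individuals n according to the assumed distribution.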
Table 6 presents the fixed effects model with generic coefficients (Model 1), i.e., with the same coefficients for all alternatives, and reports the parameter estimates together with their standard errors and p-values for the best-fitting specification. The model generally showed a good fit based on the related model metrics. The Log-Likelihood at zero (LL_0) was −669.57, while the Log-Likelihood of the converged model (LL_f) was −486.73, leading to a McFadden R-squared value of 0.274 and an AIC value of 983.45. The utility function for this model is specified for the tram, as the bus has been selected as the reference category. The model summary indicates that most of the beta coefficients of the attributes in the SP experiment were significant and that their signs aligned with the prior hypotheses (passengers preferring the bus). As expected, all beta coefficients show negative signs, indicating that increases in ticket cost, travel time, waiting time, and walking time reduce the probability of selecting the tram.
As a next step, the Value of Time (VoT) for the tram is calculated by dividing the beta coefficient of travel time by the beta coefficient of the tram ticket cost, which provides a generalized estimate of VoT across the sample. It should be noted that the VoT estimates are derived from the ratio of the means of the time and cost coefficients under the assumed normal parameter distributions in the mixed logit models, and from the ratio of the point estimates in the fixed effects models. While this approach is commonly adopted in empirical applications, alternative specifications (e.g., lognormal distributions or simulation-based derivation of VoT distributions) could provide further insights and constitute a promising direction for future research.
Hence, VoT_Tram = (−0.069)/(−0.889) = 0.078 euros per minute of travel time, or 4.683 euros per hour. Similarly, the VoT for walking time and waiting time can be estimated and were found to be VoT_Walk = 3.958 euros per hour and VoT_Wait = 1.594 euros per hour, respectively. This leads to a total out-of-vehicle Value of Time (OoV VoT) for the tram equal to 5.552 euros per hour. These results indicate that future tram users are willing to pay more to decrease walking time than to reduce waiting time. This difference could be attributed to factors such as physical effort and the inconvenience of walking longer distances compared to waiting at the PT stop.
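The arithmetic behind these figures can be reproduced with a short sketch. It assumes the rounded coefficients quoted in the text (−0.069 per minute for travel time and −0.889 per euro for cost); the function name is illustrative, and the small gap to the reported 4.683 euros per hour presumably reflects the unrounded coefficients used in the original estimation.

```python
def value_of_time(beta_time, beta_cost, minutes_per_hour=60):
    """Return the VoT in euros per hour.

    beta_time is the utility change per minute and beta_cost the utility
    change per euro, so their ratio is expressed in euros per minute.
    """
    if beta_cost == 0:
        raise ValueError("cost coefficient must be non-zero")
    return beta_time / beta_cost * minutes_per_hour


# Rounded Model 1 coefficients as quoted in the text.
vot_tram = value_of_time(-0.069, -0.889)

# Out-of-vehicle VoT is the sum of the reported walking and waiting values.
vot_walk, vot_wait = 3.958, 1.594
vot_oov = vot_walk + vot_wait

print(f"In-vehicle VoT (rounded coefficients): {vot_tram:.3f} EUR/h")
print(f"Out-of-vehicle VoT: {vot_oov:.3f} EUR/h")
```

Because both coefficients are negative, the ratio is positive; a positive time or cost coefficient would produce a negative and therefore uninterpretable VoT.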
As for the next fixed effects model with different utility functions and mode-specific coefficients (Model 2), the Log-Likelihood at zero (LL_0) was −669.57, as in Model 1, which is the value consistent with the reported R-squared, while the Log-Likelihood of the converged model (LL_f) was only marginally better than that of Model 1 (−485.46). The respective McFadden R-squared value was 0.275, while the AIC value was 988.92; therefore, the two fixed effects models showed similar goodness of fit overall. The summary results of Model 2 are presented in Table 7 below.
Similarly, all the beta coefficients have negative signs, indicating that higher values of cost, travel time, and walking time result in a lower probability of selecting each mode. However, a few notable differences are identified. For instance, in the best-fitting model, the constant term for the tram was found to be non-significant. The same applies to the waiting time for the tram, which appears not to influence tram choice. Firstly, the non-significance of the constant term for the tram may indicate that respondents’ preferences for choosing the tram are not influenced by inherent biases; instead, their choices might depend more on specific attributes or situational factors.
Secondly, the non-significance of waiting time suggests that tram users may not perceive tram waiting time as a critical factor influencing their choice. It can reflect a high level of confidence in tram time schedules or also an expectation that the tram stops would be more accessible and convenient for waiting compared to the existing bus stops. This could be attributed to the fact that tram services are generally perceived as reliable, reducing the impact of waiting time on decision-making. In contrast, buses might be perceived as less predictable, making waiting time more important in bus-related decisions. However, this finding should be interpreted with care, as the tram service is not yet developed in the city of Larissa.
Following the previous procedure, the respective categories of VoT can be calculated for Model 2. For trams, the VoT based on travel time and ticket cost was found to be 3.67 euros per hour, while for buses, it was found to be 5.71 euros per hour. The rest of the VoT estimations for Model 2 are illustrated in the related figures, which are placed at the end of the present section.
The next two models discussed are the random parameters (mixed) logit models for mode choice. Starting from the model with generic coefficients, Table 8 presents the parameter estimates of the beta coefficients, along with the standard deviations of the random parameters and their respective standard errors and p-values for the best-fitting model. For the parameter estimation of the model, the number of Halton draws for estimating the panel effect coefficients was set to 1000. Moreover, the random parameters were assumed to follow the normal distribution. The Log-Likelihood of the best converged model (LL_f) was −424.34, the respective McFadden R-squared was 0.366, and the AIC value was 868.68. This is a considerably better fit than that of the fixed effects models, owing to the integration of random parameters for the constant term and tram-related attributes, which captures potential unobserved heterogeneity in mode choices.
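For readers unfamiliar with simulation-based estimation, the following sketch shows how Halton draws can be turned into normally distributed coefficient draws. It is a generic illustration and not the estimation code used in this study: the function names are assumptions, the mean and standard deviation passed to `normal_draws` are hypothetical placeholders rather than estimates from Table 8, and practical implementations typically discard the initial draws and use a different prime base for each random parameter.

```python
from statistics import NormalDist, mean


def halton(index, base=2):
    """Return the index-th element (1-based) of the Halton sequence."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def normal_draws(n_draws, mu, sigma, base=2):
    """Map Halton points in (0, 1) to N(mu, sigma^2) via the inverse CDF."""
    std_normal = NormalDist()
    return [
        mu + sigma * std_normal.inv_cdf(halton(i, base))
        for i in range(1, n_draws + 1)
    ]


# Hypothetical placeholder values, NOT estimates from the paper.
draws = normal_draws(1000, mu=-0.07, sigma=0.03)
print(halton(1), halton(2), halton(3))
print(f"mean of draws: {mean(draws):.4f}")
```

Compared with pseudo-random draws, Halton points cover the unit interval more evenly, which is why a moderate number of draws is often considered sufficient for simulating the mixed logit likelihood.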
It is also indicated that all the estimated standard deviations of the random parameters were found to be significant, showing that it is meaningful to treat these variables as random and thus to adopt mixed logit models over fixed effects models. As for VoT_Tram, it was found to be 4.15 euros per hour. In addition, the respective VoTs for waiting time and walking time for the tram were estimated to be 1.55 euros per hour and 3.43 euros per hour, with a total out-of-vehicle VoT of 4.98 euros per hour.
The final model that was estimated in the analysis was the mixed logit model with mode-specific coefficients (Table 9). As in the previous mixed logit model, 1000 Halton draws were used, and a normal distribution was selected for the random parameters. Results indicated a good fit; however, this fit was lower than that of the mixed logit model with generic coefficients, as the Log-Likelihood of the converged model (LL_f) was −467.94, while the McFadden R-squared was 0.300. In this case, the Value of Travel Time for the bus was also found to be higher than that for the tram (4.81 euros per hour and 3.261 euros per hour, respectively).
The next figures (Figure 6, Figure 7 and Figure 8) show the VoT values for all models, including the total out-of-vehicle (OoV) VoT. For comparison purposes between the two transport modes, total OoV VoT was also calculated for non-significant variables.
Hence, Model 3 (Mixed logit model with generic coefficients) is determined as the best-fitting model. Model 3 yields consistent and sensible results, as well as similarities with the fixed effect model featuring mode-specific coefficients (Model 2).
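The comparison of fit can be checked with a few lines of code. The sketch below recomputes the McFadden R-squared values from the reported log-likelihoods and inverts the AIC definition (AIC = 2k − 2LL) to recover the implied number of parameters k. It assumes that the null log-likelihood of −669.57 reported for Model 1 applies to all four models, which the reported R-squared values are consistent with; the function and variable names are illustrative.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo R-squared: 1 - LL_f / LL_0."""
    return 1.0 - ll_model / ll_null


def implied_n_params(aic, ll_model):
    """Invert AIC = 2k - 2LL to recover the number of parameters k."""
    return (aic + 2.0 * ll_model) / 2.0


LL_NULL = -669.57  # reported for Model 1; ASSUMED common to all models

# Converged log-likelihoods and AIC values as reported in the text.
models = {
    "Model 1": {"ll": -486.73, "aic": 983.45},
    "Model 2": {"ll": -485.46, "aic": 988.92},
    "Model 3": {"ll": -424.34, "aic": 868.68},
    "Model 4": {"ll": -467.94, "aic": None},  # AIC not reported in the text
}

summary = {}
for name, m in models.items():
    r2 = mcfadden_r2(m["ll"], LL_NULL)
    k = implied_n_params(m["aic"], m["ll"]) if m["aic"] is not None else None
    summary[name] = (r2, k)
    k_text = f"{k:.1f}" if k is not None else "n/a"
    print(f"{name}: R2 = {r2:.3f}, implied k = {k_text}")

# Lowest AIC among the models for which it is reported.
with_aic = [name for name, m in models.items() if m["aic"] is not None]
best = min(with_aic, key=lambda name: models[name]["aic"])
print("Lowest AIC:", best)
```

The recomputed R-squared values agree with the reported ones to within about 0.001, which is the size of difference expected from rounding of the log-likelihoods.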
4.3. Discussion and Policy Implications
The survey results reveal a slight preference for trams over buses, with a distribution of 54.63% in favor of trams and 45.37% for buses. This possibly shows that the local population does not overwhelmingly support either of the two PT options. The tram supporters, which are a slim majority, might perceive the new tram as a more attractive or convenient transportation option compared to the existing bus service in the city of Larissa, but in most models it was shown that they do not have an a priori preference for the tram mode, as the constant terms for trams were not found to be statistically significant in the models. This finding might reflect the public enthusiasm for modern infrastructure or dissatisfaction with the current bus system.
On the other hand, the sizeable minority probably shows skepticism towards new PT services, which seems to stem from current services that do not meet their expectations or needs (Table 2), an opinion which may carry over to new and future PT options. Additionally, some individuals may lack real-world experience with rail-based PT, which limits their understanding of the advantages that trams can provide over buses. However, this could still imply that existing bus routes may experience a ridership reduction of up to 50%.
Furthermore, the results in Table 2 are important and highlight a relatively high level of overall satisfaction with buses, which might still be considered as having margins of improvement for full-scale operations. This finding might explain why there is a stronger acceptance of trams over buses. These variables were considered in all models; however, there was no statistically significant variance between the responses given, and thus no statistically significant results were obtained.
The Value of Time (VoT) metrics suggest that respondents assign lower values to trams compared to buses across all stages of their journeys. This could be attributed to the fact that buses may be less reliable than trams, leading passengers to value time more since delays can have greater consequences. On the other hand, if trams are perceived as comfortable and frequent, users may not feel the need to rush as much. Another explanation could be the fact that tram systems and routes are protected from vehicular traffic, and so they can achieve more efficient and reliable services. Notably, the difference is most pronounced during out-of-vehicle stages, where the VoT exceeds 3 euros. These findings align with the results in Table 2, highlighting that respondents prioritize aspects of their trips spent outside of a vehicle, such as waiting and walking. In another related study in the field [59], travel time emerged as the most significant determinant of mode choice; however, that study encompassed not only public transportation but also cycling and walking, which may account for this discrepancy.
Moreover, the relatively high disutility associated with out-of-vehicle time highlights the importance of minimizing access, waiting, and transfer times in the design of tram systems. This can be achieved through appropriate stop spacing, improved pedestrian infrastructure to enhance accessibility, and also high-quality waiting environments. Such measures can significantly enhance the overall attractiveness of the system and should be considered in planning and design processes. The outputs of the present study thus outline a clear need for improvements in the associated infrastructure (such as walkable routes, amenities at PT stops, and optimal placement of those stops) as well as enhancements to services (including increased service frequency to reduce wait times).
Our estimated VoT values, particularly for the tram alternative, might be considered relatively low when compared to findings reported in the literature regarding public transport VoT estimates. However, relatively low VoT estimates are not uncommon for public transport [60]. Furthermore, variations in values of travel time in Europe are expected and could be attributed to differences in socio-economic conditions (e.g., GDP per capita) and the type of data [61]. In the present case, several factors may explain the comparatively lower VoT values. First, local income levels and broader economic conditions are likely to influence individuals’ willingness to pay for travel time savings. Second, the SP nature of the experiment may lead respondents to adopt more conservative trade-offs compared to real-world observed behavior, thus potentially leading to lower VoT values.
It is also noted that, as the tram system is not yet implemented in the city of Larissa, the current analysis relies on SP data, since it is a widely used approach for assessing hypothetical future transport services. In this context, respondents’ choices may partly reflect perceptions associated with new transport modes, including a degree of optimism towards improved service attributes, which may also contribute to the observed slight preference for trams. However, such effects are inherent to SP studies and should be considered when interpreting the results.
On another note, by conducting a more detailed analysis of the preferences between potential options of PT users, an assessment of whether trams could play a pivotal role in shifting individuals from car usage to PT is enabled. Within the context of PT development, it is a fundamental need that PT systems are designed to operate in a complementary manner, rather than in competition with one another, to ensure an integrated PT network. In that regard, the aim of policymakers should be to promote a shift from private car use to PT, thereby improving network efficiency, accessibility, and sustainability of PT. Gaining insight into these dynamics is crucial for optimizing service planning, maximizing resource utilization, and achieving an integrated and sustainable transport system that serves the needs of both passengers and operators and serves as a competitive alternative to cars.
The current study shows that the evidence presented does not necessarily warrant investment in a new tram line unless there is a clear demand for trams among current PT and car users. While some urban rail systems produce benefits that outweigh the investment costs, there is contradictory evidence regarding the effectiveness and potential success of urban rail systems and trams. For more information, the reader can refer to several related studies such as [3,62,63,64].
However, the results of our analysis point towards a pressing need to improve existing PT services, particularly by focusing on the out-of-vehicle components of PT journeys. To further substantiate this claim, it would be helpful to explore whether similarly sized cities in Europe have successfully implemented tram systems or opted for upgrades to their bus services instead. In that case, it would be beneficial to authorities if successful case studies of tramlines were closely examined (e.g., Strasbourg and Paris). For instance, the tramway line opened on the Maréchaux boulevards in Paris in December 2006 was considered highly successful, as it not only attracted the users of the bus line that it replaced, but also drew, surprisingly, a significant number of subway passengers [3].
It is also essential to acknowledge, however, that LRT systems and trams are not a panacea, especially with respect to road safety, since previous studies have highlighted road safety concerns associated with trams, as they are large and heavy vehicles which often have to operate within complex urban traffic environments [14]. For that reason, although the present analysis did not consider perceived safety-related attributes in the SP survey, public authorities should consider road safety impacts when planning and evaluating new PT schemes, as correctly stated in a related study by Naznin et al. [14].
Another possible alternative for the city of Larissa could be a new Bus Rapid Transit (BRT) line. However, including a BRT line in the present SP analysis was not possible due to a complete lack of related information. In addition, the high demands of such a system mean that if it fails to attract the city’s commuting population, it could become a costly and ineffective investment, as stated by an overview BRT study [65]. To fully understand the potential success of the PT system (tram line), separate Revealed Preference (RP) studies are needed to understand travel behavior under existing conditions. Examining current bus use, private cars, and other transport modes could provide valuable insights into origin–destination (OD) patterns, trip characteristics, and passenger behavior.
Finally, the Value of Time estimates reported in this study are based on the ratio of the mean time and cost coefficients under normally distributed random parameters (in the mixed logit models). Future research could explore alternative distributional assumptions (e.g., lognormal specifications) and simulation-based approaches to examine the implied VoT distributions in greater analytical detail.
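As a rough indication of what such a simulation-based approach could look like, the sketch below draws time and cost coefficients from independent normal distributions and inspects the resulting VoT distribution. All means and standard deviations are hypothetical placeholders and not the estimates of Tables 8 and 9; independence between the two coefficients is likewise an assumption made only to keep the example short.

```python
import random
from statistics import median


def simulate_vot(mu_time, sd_time, mu_cost, sd_cost, n_draws=10_000, seed=42):
    """Draw coefficients independently and return VoT draws in EUR/h."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        beta_time = rng.gauss(mu_time, sd_time)  # utility per minute
        beta_cost = rng.gauss(mu_cost, sd_cost)  # utility per euro
        draws.append(beta_time / beta_cost * 60)
    return draws


# Hypothetical placeholder values, NOT estimates from the paper.
MU_TIME, SD_TIME = -0.07, 0.03
MU_COST, SD_COST = -0.9, 0.3

vot_draws = simulate_vot(MU_TIME, SD_TIME, MU_COST, SD_COST)
ratio_of_means = MU_TIME / MU_COST * 60
vot_median = median(vot_draws)
share_negative = sum(v < 0 for v in vot_draws) / len(vot_draws)

print(f"ratio of means:  {ratio_of_means:.2f} EUR/h")
print(f"simulated median: {vot_median:.2f} EUR/h")
print(f"share of negative VoT draws: {share_negative:.3%}")
```

With a normally distributed cost coefficient, some draws fall close to zero or take the wrong sign, so the simulated ratio has heavy tails and the median is a more stable summary than the mean. This is one reason why lognormal or otherwise sign-constrained specifications are often considered for the cost coefficient.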
Overall, the present study contributes to both theoretical understanding and practical planning, supporting more informed investment decisions, demand forecasting, and user-centered system design. Moreover, the findings can inform broader discussions on transport sustainability, mode shift incentives, and the social acceptability of new transit technologies in comparable urban contexts.